Predicting fishing vs. not-fishing in small-scale fisheries: A sample vessel tracking dataset and a reproducible machine learning approach / Lattanzi, Pamela; Vasapollo, Claudio; Samarão, Joao; Galdelli, Alessandro; Mendo, Tania; Rufino, Marta; Bolognini, Luca; Tassetti, Anna Nora. - In: SOFTWAREX. - ISSN 2352-7110. - 34:(2026). [10.1016/j.softx.2026.102579]
Predicting fishing vs. not-fishing in small-scale fisheries: A sample vessel tracking dataset and a reproducible machine learning approach
Lattanzi, Pamela; Galdelli, Alessandro; Bolognini, Luca; Tassetti, Anna Nora
2026-01-01
Abstract
Predicting fishing activity from vessel tracking data is crucial for quantifying fishing effort. This study classifies the fishing versus non-fishing activity of small-scale vessels using passive gears with a suite of algorithms ranging from basic statistics (Logistic Regression - LoRe) to Machine Learning (Decision Trees - Dtree, Random Forests - RaFo, and Extreme Gradient Boosting - XGBo). The Machine Learning (ML) models significantly outperformed LoRe, with XGBo and Dtree in particular achieving comparably high accuracy and robustness across training, validation, and test sets. Using SHAP (SHapley Additive exPlanations), we show that vessel speed (SPEED), course variation (course_diff), hour of the day (hours), and distance from the coast (distance_from_coast) or bathymetric depth (depth) are the primary mechanistic drivers for discerning fishing operations in passive-gear small-scale fisheries (SSF). We provide a fully reproducible workflow and a unique, high-resolution dataset of manually labelled tracking data to address the critical scarcity of validated resources in this field. The framework offers a timely, scalable solution for high-resolution tracking analysis, directly addressing the technical needs arising from upcoming EU mandates (Control Regulation 2023/2842) for small-scale vessel monitoring. The shared code and data enable researchers to evaluate model transferability and generalisation, providing a standardised approach to harmonise fishing effort estimation across diverse geographic contexts. Finally, the code is structured as an accessible framework for fisheries scientists with limited ML experience, offering a practical foundation for implementing automated activity classification.
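As an illustration of the kind of per-ping predictors named in the abstract, the following minimal sketch derives SPEED, course_diff and hours from a short track. It is not the authors' workflow: the `Ping` record, the `build_features` helper, the sample track, and the definition of course_diff as the smallest angle between consecutive headings are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Ping:
    """One tracking record (hypothetical layout, not the published schema)."""
    timestamp: datetime
    speed: float    # vessel speed; unit assumed to be knots
    course: float   # heading in degrees, 0-360


def course_diff(prev_course: float, curr_course: float) -> float:
    """Smallest absolute angle between two headings, in [0, 180] degrees.

    Assumed definition: wrapping at 360 means 350 -> 10 counts as a
    20-degree turn rather than a 340-degree one.
    """
    diff = abs(curr_course - prev_course) % 360.0
    return min(diff, 360.0 - diff)


def build_features(pings: list[Ping]) -> list[dict]:
    """Build one feature row per ping, starting from the second ping."""
    rows = []
    for prev, curr in zip(pings, pings[1:]):
        rows.append({
            "SPEED": curr.speed,
            "course_diff": course_diff(prev.course, curr.course),
            "hours": curr.timestamp.hour,
        })
    return rows


# Made-up track, for illustration only.
track = [
    Ping(datetime(2024, 5, 1, 4, 58), 6.0, 350.0),
    Ping(datetime(2024, 5, 1, 5, 0), 1.5, 10.0),
    Ping(datetime(2024, 5, 1, 5, 2), 1.2, 100.0),
]
features = build_features(track)
for row in features:
    print(row)
```

Rows of this shape, joined with distance_from_coast or depth and a manual fishing/non-fishing label, are the sort of table a classifier such as LoRe, Dtree, RaFo or XGBo could be trained on.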


