Multimodal deep learning for cetacean distribution modeling of fin whales (Balaenoptera physalus) in the western Mediterranean Sea
Abstract
Cetacean Distribution Modeling (CDM) is used to quantify the distributions and densities of mobile marine species. It is essential for better understanding and protecting whales and their relatives. Current CDM approaches often fail to capture general species-environment relationships that remain valid across the broader range of environmental conditions characterizing the surveyed regions. This paper investigates the usefulness of deep learning based schemes, namely multi-task and transfer learning, in CDM. We co-trained a stochastic presence-background model on a classification task and a deterministic rule-based model on a regression task. Whale presence-only records were used for the first task, and the index outputs of a feeding habitat occurrence model for the second. The approach was tested on the case study of fin whales in the western Mediterranean Sea. To evaluate it, we developed a new metric, the True Positive rate per unit of Surface Habitat (TPSH), and an original multimodal fully connected neural network. A Generalized Additive Model (GAM), a standard CDM method, was also used as a performance reference. Results show that, in relative terms on the TPSH metric, our multi-task learning model improves on the feeding habitat model by 10.8% and on data-driven models such as the GAM by 16.5%, indicating that our approach estimates whale presence more accurately. These trends were further supported by two other independent datasets, which forced the models to generalize beyond the species-environment relationships of their training dataset. Performance could be improved further by choosing better decision thresholds, as suggested by the Receiver Operating Characteristic curves: for example, the multi-task learning model could reach absolute gains of up to 10% in the median True Positive Rate while maintaining the spatial spread of its predicted habitat.
Overall, our work confirmed our working hypothesis that expert information on whale behaviour represents a good knowledge base for model generalization. This result can be further improved by concurrently learning more local species-environment relationships from in-situ presence data.
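As an illustration of the co-training idea summarized above, the sketch below combines a classification loss on presence versus background points with a regression loss on the feeding habitat index. It is a minimal sketch under stated assumptions, not the architecture or objective used in the paper: the function names, the binary cross-entropy and mean squared error choices, and the `weight` parameter are all hypothetical.

```python
import numpy as np


def binary_cross_entropy(p, y, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities p and labels y."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def mean_squared_error(pred, target):
    """Mean squared error between predicted and target index values."""
    return float(np.mean((pred - target) ** 2))


def multitask_loss(p_presence, y_presence, index_pred, index_target, weight=0.5):
    """Weighted sum of the two task losses.

    p_presence   : predicted presence probabilities (classification head)
    y_presence   : 1 for presence records, 0 for background points
    index_pred   : predicted feeding habitat index (regression head)
    index_target : index output of the rule-based feeding habitat model
    weight       : hypothetical trade-off in [0, 1] between the two tasks
    """
    cls = binary_cross_entropy(p_presence, y_presence)
    reg = mean_squared_error(index_pred, index_target)
    return weight * cls + (1.0 - weight) * reg


# Toy example with made-up values, for illustration only.
p = np.array([0.9, 0.1])
y = np.array([1.0, 0.0])
idx_pred = np.array([0.4, 0.6])
idx_true = np.array([0.5, 0.5])
loss = multitask_loss(p, y, idx_pred, idx_true, weight=0.5)
```

In a sketch of this kind, `weight` controls how strongly the expert-derived habitat index constrains the network relative to the in-situ presence data; how the two tasks are actually balanced in the paper is not stated in the abstract.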
Keywords
Western Mediterranean Sea
Ecosystems
Knowledge based systems
Learning systems
Multi-task learning
Stochastic models
Stochastic systems
Transfer learning
Environmental conditions
Fully connected neural network
Generalized additive model
Learning based schemes
Model generalization
Receiver operating characteristic curves
Species-environment relationships