Density estimation of the main structuring sessile species in underwater marine caves with a deep learning approach

Sergio Sierra, Elena Prado, Luis Rodríguez-Cobo, Carla Quiles-Pons, Pablo Roldán-Varona, David Díaz-Viñolas, Pedro Anuarbe-Cortés, Adolfo Cobo, Francisco Sánchez

Article ID: 1980
Vol 6, Issue 1, 2023



Monitoring marine biodiversity is challenging in vulnerable, difficult-to-access habitats such as underwater caves. These caves are biodiversity hotspots that concentrate a large number of species. However, most of the sessile species living on their rocky walls are highly vulnerable and are often threatened by multiple pressures; for example, the use of caves as a destination for recreational divers can affect the benthic habitat. In this work, we propose a methodology based on video recordings of cave walls and image analysis with deep learning algorithms to estimate the spatial density of structuring species in a study area. The methodology combines automatic frame overlap detection, estimation of the actual extent of the surface covered, and semantic segmentation of the 10 main species of corals and sponges to obtain species density maps. These maps can serve as a data source for monitoring biodiversity over time. We analyzed the performance of three semantic segmentation algorithms and backbones for this task and found that the Mask R-CNN model with the Xception101 backbone performs best, with an average segmentation accuracy of 82%.
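As a rough illustration of the final step, the sketch below turns one segmented frame into per-species cover values. It is not the authors' implementation: the function name `species_cover`, the class labels, the convention that class 0 is background, and the assumption that the real-world area of the frame is already known are all hypothetical choices made for this example.

```python
from collections import Counter


def species_cover(mask, class_names, frame_area_m2):
    """Return per-species cover fraction and covered area for one frame.

    mask: 2D list of integer class ids (0 is assumed to be background).
    class_names: dict mapping class id -> species label (hypothetical labels).
    frame_area_m2: estimated real-world area imaged by the frame, in m^2.
    """
    # Count how many pixels were assigned to each class id.
    counts = Counter(pixel for row in mask for pixel in row)
    total_pixels = sum(counts.values())
    if total_pixels == 0:
        raise ValueError("mask is empty")

    result = {}
    for class_id, name in class_names.items():
        # Fraction of the frame covered by this species.
        fraction = counts.get(class_id, 0) / total_pixels
        result[name] = {
            "cover_fraction": fraction,
            "covered_area_m2": fraction * frame_area_m2,
        }
    return result


# Toy 4 x 4 segmentation mask with two hypothetical species classes.
mask = [
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
    [0, 0, 0, 0],
]
names = {1: "species_a", 2: "species_b"}
cover = species_cover(mask, names, frame_area_m2=0.5)
for name, values in cover.items():
    print(name, values)
```

Summing such per-frame values over non-overlapping frames would give one possible basis for a density map, provided overlapping frames have been discarded beforehand so that no surface is counted twice.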


Marine Biodiversity; Underwater Caves; Underwater Images; Deep Learning; Semantic Segmentation





This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
