Deep learning object detection for fossil diatom counting: assessing the impact of fossil preservation and intraspecific morphological variation

2026-03-18

Saki Ishino, Takuya Itaki, Motohisa Fukuda,
Deep learning object detection for fossil diatom counting: assessing the impact of fossil preservation and intraspecific morphological variation,
Marine Micropaleontology,
Volume 201,
2025,
102519,
ISSN 0377-8398,
https://doi.org/10.1016/j.marmicro.2025.102519.
(https://www.sciencedirect.com/science/article/pii/S0377839825000842)
Abstract: Recent evidence suggests that object detection techniques based on deep learning are useful for automating microfossil analysis, particularly by enabling the rapid and accurate extraction of target particles. While the assemblage and morphometric analysis of fossil diatoms requires unique procedures, such as including fragmented specimens in counts and accounting for intraspecific morphological variation, little is known regarding how these factors affect detection accuracy or how to efficiently construct training datasets for data-driven methods such as deep learning. In this study, we experimentally evaluated the use of the YOLOv5 object detection model to detect Eucampia antarctica, a key paleoenvironmental indicator, across sites in the Southern Ocean that vary in sedimentological and biogeographical characteristics. Detection accuracy was assessed on datasets from fourteen test sites, both for models trained on datasets from four individual sites that differ in E. antarctica morphology and fossil preservation state and for models trained on pairwise combinations of these sites. Our results show that morphological variation of E. antarctica did not significantly affect detection performance, but models trained on datasets of moderately preserved fossils slightly outperformed those trained on datasets of well-preserved fossils. Furthermore, the findings suggest that incorporating diverse non-target particles, including other diatom fragments and sediment particles, in training data is critical for developing robust models that maintain consistently high performance in diverse regions. Our experiments demonstrate that object detection models allow rapid and accurate counting of E. antarctica, thereby improving its use in paleoenvironmental reconstructions, including past sea ice and surface temperatures.
Keywords: Fossil diatoms; Eucampia antarctica; Deep learning; Object detection; Biogeography; Paleoceanography; Paleoenvironmental indicator
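
The training design summarized in the abstract (models trained on four individual sites and on pairwise combinations of them, each evaluated on fourteen test sites) can be sketched as a simple enumeration. This is a minimal illustration, not the authors' code: the site labels are hypothetical placeholders, and it assumes that every possible pair of training sites was used and that every model was evaluated on every test site, neither of which the abstract states explicitly.

```python
from itertools import combinations

# Hypothetical labels: the abstract does not name the four training sites.
TRAIN_SITES = ["site_A", "site_B", "site_C", "site_D"]
N_TEST_SITES = 14  # number of test sites stated in the abstract


def training_configurations(sites):
    """Return single-site and pairwise training sets as tuples of site labels."""
    singles = [(s,) for s in sites]
    # Assumption: all unordered pairs of training sites are used.
    pairs = list(combinations(sites, 2))
    return singles + pairs


configs = training_configurations(TRAIN_SITES)

# Assumption: each trained model is evaluated on every test site.
n_evaluations = len(configs) * N_TEST_SITES

for cfg in configs:
    print("+".join(cfg))
print(len(configs), "training configurations,", n_evaluations, "model-site evaluations")
```

Under these assumptions the design comprises 4 single-site and 6 pairwise training sets, which gives 10 models and 140 model-by-site evaluations.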