Application of machine learning in radiolarian taxonomy: A case study on Early Cretaceous Turbocapsula lineage

2025-12-17

Xin-Yi Zhang, Han-Ting Zhong, Xin Li, Hui Chen, Shan Ye, Ming-Cai Hou, Chao Ma,
Application of machine learning in radiolarian taxonomy: A case study on Early Cretaceous Turbocapsula lineage,
Palaeoworld,
2025,
201005,
ISSN 1871-174X,
https://doi.org/10.1016/j.palwor.2025.201005.
(https://www.sciencedirect.com/science/article/pii/S1871174X25000988)
Abstract: Because even a small sample contains abundant microfossils, they are well suited to machine learning, which requires large sample sizes. Using a taxonomic controversy in one radiolarian lineage as a case study, this study discusses a quantitative, objective approach to fossil taxonomy and its advantages. Unsupervised machine learning algorithms are used to assign radiolarian specimens to species based on their morphological characteristics. K-Means, Agglomerative Clustering, and Mean Shift are applied to build clustering models, and the centroid-based K-Means algorithm gives the most accurate classification, with an accuracy of 92.26%. The method improves the efficiency of fossil identification and offers an accurate, objective way to assess controversies that arise from traditional methods.
Keywords: K-Means; clustering analysis; radiolarian; morphology; taxonomy
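
Illustrative sketch (not taken from the article): the abstract describes clustering specimens by morphological measurements with K-Means and then reporting an accuracy. The Python below shows one way that workflow could look, using only the standard library. Everything specific in it is an assumption: the two features (imagined as a shell length and width), the synthetic measurements, the choice of k = 2, and the scoring rule, which takes the best agreement over all cluster-to-species mappings. The abstract does not say how the authors computed their accuracy figure or which features they measured.

```python
import itertools
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


def clustering_accuracy(labels, reference, k):
    """Best agreement with reference labels over all cluster-to-class mappings."""
    classes = sorted(set(reference))
    best = 0
    for mapping in itertools.permutations(classes, k):
        hits = sum(mapping[lab] == ref for lab, ref in zip(labels, reference))
        best = max(best, hits)
    return best / len(reference)


# Synthetic, hypothetical measurements for two made-up morphotypes "A" and "B".
rng = random.Random(42)
group_a = [(rng.gauss(100, 5), rng.gauss(60, 4)) for _ in range(50)]
group_b = [(rng.gauss(150, 5), rng.gauss(90, 4)) for _ in range(50)]
points = group_a + group_b
reference = ["A"] * 50 + ["B"] * 50

labels = kmeans(points, k=2)
accuracy = clustering_accuracy(labels, reference, k=2)
print(f"accuracy = {accuracy:.2%}")
```

Because cluster indices from an unsupervised method are arbitrary, some mapping from clusters to named species is needed before an accuracy can be stated; trying every permutation is practical only for a small number of clusters. With real data the features would normally be standardized first so that no single measurement dominates the distance.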