Machine Learning Approaches for Detecting Abnormalities in Female Fetuses

DOI:

https://doi.org/10.71451/ISTAER2550

Keywords:

Female fetal abnormality; NIPT; Machine learning; Random Forest; SMOTE

Abstract

Non-invasive prenatal testing (NIPT) plays a vital role in the early detection of female fetal abnormalities, which is essential for birth defect prevention. In this study, clinical data containing Z-scores of chromosomes 21, 18, and 13, GC content, X chromosome concentration, read count ratio, and maternal BMI were analyzed. To address the class imbalance caused by the limited number of abnormal cases, the Synthetic Minority Over-sampling Technique (SMOTE) was applied, and stratified sampling was used to divide the dataset into training, validation, and testing sets in a 7:2:1 ratio. Multiple machine learning models, including XGBoost, Decision Tree, CNN, MLP, SVM, and Random Forest, were developed and evaluated using accuracy, precision, recall, F1-score, and AUC-ROC. Results showed that Random Forest outperformed the other models, achieving an AUC of 0.997 with strong stability and generalization. These findings highlight the effectiveness of combining machine learning with proper data preprocessing for detecting female fetal abnormalities.
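The two preprocessing steps named in the abstract, a stratified 7:2:1 split and SMOTE oversampling, can be illustrated with a minimal sketch. It is not the authors' code: the function names `stratified_split` and `smote`, the toy two-feature data, and the choice of `k` are all hypothetical. The abstract does not say whether SMOTE was applied before or after splitting; this sketch oversamples only the training partition, which is an assumption made here so that synthetic points cannot leak into validation or test data.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, ratios=(0.7, 0.2, 0.1), seed=0):
    """Return (train, val, test) index lists, splitting each class separately
    so that class proportions are roughly preserved in every partition."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(ratios[0] * len(indices))
        n_val = round(ratios[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a random
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data (hypothetical): 90 normal cases (label 0), 10 abnormal (label 1).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 90 + [1] * 10

train_idx, val_idx, test_idx = stratified_split(y)

# Oversample the minority class in the training partition only (assumption).
train_minority = [X[i] for i in train_idx if y[i] == 1]
n_majority = sum(1 for i in train_idx if y[i] == 0)
new_points = smote(train_minority, n_majority - len(train_minority), k=3)

X_train = [X[i] for i in train_idx] + new_points
y_train = [y[i] for i in train_idx] + [1] * len(new_points)
```

After this step the training partition is class-balanced, while the validation and test partitions keep the original imbalance, so metrics such as recall and AUC-ROC are measured on realistic class proportions.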

Published

2025-10-11

Section

Research Article

How to Cite

Machine Learning Approaches for Detecting Abnormalities in Female Fetuses. (2025). International Scientific Technical and Economic Research, 25-33. https://doi.org/10.71451/ISTAER2550
