Optimization of genomic breeding value prediction for growth traits in Rongchang pigs through machine learning techniques
Pingxian Wu, Junge Wang, Xinyou Chen, Tao Wang, Zongyi Guo, Shuqi Diao, Jinyong Wang,
Optimization of genomic breeding value prediction for growth traits in Rongchang pigs through machine learning techniques,
Machine Learning with Applications,
Volume 22,
2025,
100747,
ISSN 2666-8270,
https://doi.org/10.1016/j.mlwa.2025.100747.
(https://www.sciencedirect.com/science/article/pii/S2666827025001306)
Abstract:
Background
The increasing volume of genome sequencing data poses significant challenges for traditional genome-wide prediction methods, which struggle to handle large datasets. Machine learning (ML) techniques are well suited to processing high-dimensional data and offer promising solutions. This study aimed to identify an optimal genome-wide prediction approach for local pig breeds using 10 datasets with varying single nucleotide polymorphism (SNP) densities, derived from imputed sequencing data of 485 Rongchang pigs and the results of genome-wide association studies (GWAS). Three growth traits, namely backfat (BF) thickness, loin and thoracic height (LTH), and girth circumference (GC), were predicted using six traditional methods and six ML-based methods: Kernel Ridge Regression (KRR), Support Vector Regression (SVR), Random Forest, Gradient Boosting Decision Tree, Light Gradient Boosting Machine, and AdaBoost.
Results
The efficacy of the different methods was evaluated using a five-fold cross-validation strategy and independent tests. The predictive performance of both the traditional and ML-based methods was initially enhanced through the incorporation of significantly associated SNPs and weighted data, with the KRR method exhibiting exceptional resistance to overfitting at a SNP density of 300,000. The ML-based methods outperformed the traditional methods, with improvements of 6.6–8.1%. The integration of GWAS data enhanced the prediction accuracy of the ML-based methods. KRR and Gradient Boosting Decision Tree demonstrated high computational efficiency, indicating their potential as promising strategies for genomic prediction in livestock breeding.
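As an illustration only, the five-fold cross-validation of a KRR predictor described above might be sketched as follows. This is not the authors' pipeline: the linear kernel, the penalty `lam`, the use of Pearson correlation between predicted and observed phenotypes as the accuracy measure, and the synthetic genotype and phenotype data are all assumptions made for this example, and the function names are hypothetical.

```python
import numpy as np


def krr_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Kernel ridge regression with a linear kernel (assumed kernel choice)."""
    y_mean = y_train.mean()
    K = X_train @ X_train.T  # n_train x n_train Gram matrix
    # Solve (K + lam * I) alpha = y - mean(y) for the dual coefficients.
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - y_mean)
    # Predict via the test-by-train kernel, then add the training mean back.
    return (X_test @ X_train.T) @ alpha + y_mean


def five_fold_accuracy(X, y, n_folds=5, lam=1.0, seed=0):
    """Return one Pearson correlation (predicted vs. observed) per fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    scores = []
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = krr_fit_predict(X[train], y[train], X[test], lam)
        scores.append(float(np.corrcoef(pred, y[test])[0, 1]))
    return scores


# Synthetic stand-in data (hypothetical): genotypes coded 0/1/2, 20 causal SNPs.
rng = np.random.default_rng(42)
n_animals, n_snps = 200, 500
X = rng.integers(0, 3, size=(n_animals, n_snps)).astype(float)
X -= X.mean(axis=0)  # centre each SNP column
effects = np.zeros(n_snps)
effects[:20] = rng.normal(0.0, 1.0, 20)
y = X @ effects + rng.normal(0.0, 1.0, n_animals)

scores = five_fold_accuracy(X, y, lam=10.0)
print("per-fold accuracy:", [round(s, 3) for s in scores])
print("mean accuracy:", round(float(np.mean(scores)), 3))
```

With real data, `X` would be the SNP genotype matrix at a given density and `y` one of the recorded traits; the penalty would normally be tuned inside each training fold rather than fixed in advance.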
Conclusions
This study provides a comprehensive analysis of genome-wide predictions in Rongchang pigs and highlights the potential of ML-based techniques to enhance prediction accuracy and efficiency. It offers valuable insights into genomic prediction (GP) and holds key implications for advancing genomic breeding practices in local pig breeds.
Keywords: Rongchang pigs; Machine learning; Genomic prediction; GWAS