Application of deep learning in detecting and classification of osteonecrosis: A systematic review and meta-analysis

2026-01-31

Morteza Gholipour, Farnoosh Ghomi, Amirhossein Salmannezhad, Alireza Motamedi, Mahdi Zahmatyar, Yasaman Rahimi, Fatemeh Abbasi,
Application of deep learning in detecting and classification of osteonecrosis: A systematic review and meta-analysis,
Journal of Orthopaedic Reports,
2025,
100815,
ISSN 2773-157X,
https://doi.org/10.1016/j.jorep.2025.100815.
(https://www.sciencedirect.com/science/article/pii/S2773157X2500267X)
Abstract: Background
Osteonecrosis, also known as avascular necrosis (AVN), is a complicated condition that can develop after total hip arthroplasty, trauma, chronic corticosteroid use, or alcohol abuse, and in association with medical conditions such as sickle cell anemia and lupus. AVN can lead to osteoarthritis and poor postoperative outcomes (e.g., sepsis and readmission), which highlights the importance of early diagnosis. Deep learning (DL) approaches have been deployed in various medical fields to improve and facilitate diagnosis, prognosis, and classification. This meta-analysis aimed to evaluate the diagnostic accuracy of DL models in detecting and classifying hip AVN by synthesizing pooled sensitivity and specificity estimates and exploring key sources of heterogeneity.
Methods
A comprehensive systematic search was conducted across PubMed, Web of Science, Scopus, and the Cochrane Library. Studies were included if they used deep learning models to assess osteonecrosis on X-ray, computed tomography (CT), or magnetic resonance imaging (MRI).
Results
A total of 21 publications were included, covering 19,168 hips with osteonecrosis of the femoral head (ONFH), 47,389 healthy hips, and 6,784 hips with other conditions, as well as 319 cases of lunate AVN with 1,228 controls. Across the nine studies with sufficient data, the pooled sensitivity and specificity were 90% (95% CI: 80–95) and 98% (95% CI: 95–99), respectively. Among these, CNN-based models demonstrated high diagnostic accuracy, with several studies reporting sensitivity and specificity exceeding 90%. However, substantial heterogeneity was detected, with I² values of 89.82% for sensitivity and 81.44% for specificity (p < 0.0001).
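For readers unfamiliar with how a pooled proportion and the I² heterogeneity statistic are obtained, the following is a minimal Python sketch. It is not the authors' method: the abstract does not state which pooling model was used, so the sketch assumes a simplified univariate, fixed-effect, inverse-variance pooling on the logit scale, with I² derived from Cochran's Q. The function name `pooled_logit` and all study counts are hypothetical and are not taken from the included studies.

```python
import math


def pooled_logit(events, totals):
    """Pool per-study proportions (e.g. sensitivity = TP / diseased) on the logit scale.

    Assumption: simplified fixed-effect inverse-variance pooling, for illustration only.
    Returns (pooled proportion, Cochran's Q, I^2 in percent).
    """
    logits, weights = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction keeps the logit finite when e == 0 or e == n.
        a, b = e + 0.5, n - e + 0.5
        logits.append(math.log(a / b))
        # Approximate variance of a logit is 1/a + 1/b; the weight is its inverse.
        weights.append(1.0 / (1.0 / a + 1.0 / b))

    w_sum = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, logits)) / w_sum

    # Cochran's Q: weighted squared deviations of each study from the pooled logit.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, logits))
    df = len(logits) - 1

    # I^2 = max(0, (Q - df) / Q), expressed as a percentage.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Back-transform the pooled logit to a proportion.
    return 1.0 / (1.0 + math.exp(-pooled)), q, i2


# Hypothetical counts: true positives and diseased hips in three made-up studies.
true_positives = [45, 80, 30]
diseased = [50, 100, 31]
sensitivity, q_stat, i_squared = pooled_logit(true_positives, diseased)
print(f"pooled sensitivity={sensitivity:.3f}, Q={q_stat:.2f}, I2={i_squared:.1f}%")
```

An I² near 0% indicates that the studies agree about as closely as sampling error alone would predict, whereas values approaching 100% indicate that most of the observed variation reflects genuine between-study differences.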
Conclusion
Although our meta-analysis did not reveal a significant difference in sensitivity and specificity, convolutional neural networks (CNNs) were observed to outperform conventional techniques in the detection and classification of ONFH. With high accuracy and specificity, they show promising potential to reduce misdiagnosis and facilitate early detection of this condition.
Keywords: Deep learning; Osteonecrosis; Femoral head osteonecrosis; ONFH; Artificial intelligence