Quantitative radiomic analysis of computed tomography scans using machine and deep learning techniques accurately predicts histological subtypes of non-small cell lung cancer: A retrospective analysis
Suhrud Panchawagh, Arita Halder, Saloni Haldule, Vivek Sanker, Devansh Lalwani, Rachel Sequeria, Harshika Naik, Atman Desai,
Quantitative radiomic analysis of computed tomography scans using machine and deep learning techniques accurately predicts histological subtypes of non-small cell lung cancer: A retrospective analysis,
European Journal of Surgical Oncology,
Volume 51, Issue 10,
2025,
110376,
ISSN 0748-7983,
https://doi.org/10.1016/j.ejso.2025.110376.
(https://www.sciencedirect.com/science/article/pii/S0748798325008042)
Abstract:
Background
Non-small cell lung cancer (NSCLC) histological subtypes impact treatment decisions. While pre-surgical histopathological examination is ideal, it is not always feasible. CT radiomic analysis shows promise in predicting NSCLC histological subtypes.
Objective
To predict NSCLC histological subtypes from radiomic features using machine learning and deep learning models.
Methods
A total of 422 lung CT scans from The Cancer Imaging Archive (TCIA) were analyzed. Primary neoplasms were segmented by expert radiologists. Using PyRadiomics, 2446 radiomic features were extracted; after feature selection, 179 features remained. Six machine learning models, namely logistic regression (LR), support vector machine (SVM), random forest (RF), XGBoost, LightGBM, and CatBoost, were employed alongside a deep neural network (DNN) model.
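The selection-then-classification step described in the Methods can be illustrated with a minimal scikit-learn sketch. It is not the authors' pipeline: the abstract does not state the feature selection method, the validation scheme, or the number of subtype classes, so the univariate F-test selector, the 5-fold stratified cross-validation, and `N_CLASSES = 3` are all assumptions, and the feature matrix is synthetic rather than PyRadiomics output. Only the three counts (422 scans, 2446 extracted features, 179 retained features) come from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline

# Counts taken from the Methods paragraph.
N_SCANS, N_FEATURES, N_SELECTED = 422, 2446, 179
# Assumption: the abstract does not give the number of subtypes.
N_CLASSES = 3

# Synthetic stand-in for the radiomic feature matrix (not real data).
X, y = make_classification(
    n_samples=N_SCANS,
    n_features=N_FEATURES,
    n_informative=30,
    n_classes=N_CLASSES,
    random_state=0,
)

# Selection lives inside the pipeline so it is refit on each training
# fold, which keeps held-out labels from leaking into feature choice.
pipeline = make_pipeline(
    SelectKBest(score_func=f_classif, k=N_SELECTED),  # assumed selector
    RandomForestClassifier(n_estimators=200, random_state=0),
)

# Assumed validation scheme: 5-fold stratified cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
proba = cross_val_predict(pipeline, X, y, cv=cv, method="predict_proba")

# Labels are 0..N_CLASSES-1, so argmax over columns gives the class.
pred = proba.argmax(axis=1)
accuracy = accuracy_score(y, pred)
# One-vs-rest macro-averaged AUC for the multi-class case.
auc = roc_auc_score(y, proba, multi_class="ovr")

print(f"accuracy={accuracy:.3f}  AUC-ROC (OvR)={auc:.3f}")
```

Because the data are synthetic, the printed scores say nothing about the study's results; the sketch only shows how selection and classification fit together in one cross-validated pipeline.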
Results
RF demonstrated the highest accuracy, 78 % (95 % CI: 70 %–84 %), with an AUC-ROC of 94 % (95 % CI: 90 %–96 %). LightGBM, XGBoost, and CatBoost had AUC-ROC values of 95 %, 93 %, and 93 %, respectively. The DNN's AUC-ROC was 94.4 % (95 % CI: 94.1 %–94.6 %). Logistic regression had the least efficacy. For histological subtype prediction, random forest, boosting models, and DNN were superior.
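The abstract reports 95 % confidence intervals but does not say how they were computed. One common option is a percentile bootstrap over test cases, sketched below using only the Python standard library. The function name `bootstrap_accuracy_ci`, its parameters, and the toy labels are hypothetical and are not taken from the study.

```python
import random


def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Return (accuracy, lower, upper) using a percentile bootstrap.

    Hypothetical helper for illustration; the study's CI method is not
    stated in the abstract.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    rng = random.Random(seed)
    n = len(y_true)
    # 1 where the prediction matches the reference label, else 0.
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]

    # Resample cases with replacement and recompute accuracy each time.
    stats = sorted(sum(rng.choices(correct, k=n)) / n for _ in range(n_resamples))

    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(correct) / n, lower, upper


# Toy labels (not study data): 100 cases, 22 of them misclassified.
y_true = [0] * 50 + [1] * 50
y_pred = list(y_true)
for i in range(22):
    y_pred[i] = 1 - y_pred[i]

point, lower, upper = bootstrap_accuracy_ci(y_true, y_pred)
print(f"accuracy={point:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The same resampling loop can wrap any metric, including AUC-ROC, by recomputing that metric on each resampled set of cases instead of the mean of `correct`.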
Conclusions
Quantitative radiomic analysis with machine learning can accurately determine NSCLC histological subtypes. Random forest, ensemble models, and DNNs show significant promise for pre-operative NSCLC classification, which can streamline therapy decisions.
Keywords: Lung cancer; Computed tomography; Radiomics; Histopathology; Artificial intelligence; Machine learning; Classification