Source apportionment of on-site paper-based combustion residues through interpretable machine learning and HS-GC-IMS fingerprint analysis in public security

2025-11-18

Wenhui Lu, Liangliang Zhang, Zixuan Nie, Xian Wu,
Source apportionment of on-site paper-based combustion residues through interpretable machine learning and HS-GC-IMS fingerprint analysis in public security,
Microchemical Journal,
Volume 218,
2025,
115811,
ISSN 0026-265X,
https://doi.org/10.1016/j.microc.2025.115811.
(https://www.sciencedirect.com/science/article/pii/S0026265X25031595)
Abstract: Fire investigation plays a crucial role in determining the origin and cause of fire incidents, thereby influencing subsequent criminal liability assessments. The collection and analysis of trace evidence support evidence confirmation, case reconstruction, and the advancement of forensic science in fire-related cases. On-site combustion residues are difficult to work with because of their complex sources, minute quantities, and compromised integrity, which complicates trace evidence retrieval in public security operations. Combustion-derived odor traces offer a potential source of volatile-based evidence. This study used Headspace Gas Chromatography-Ion Mobility Spectrometry (HS-GC-IMS) to comprehensively profile volatile organic compounds (VOCs) in paper-based combustion residues. Tree-based machine learning algorithms were integrated to identify key VOC markers, enabling rapid source apportionment of multi-category and low-concentration volatiles. A total of 51 volatiles were qualitatively identified, and their topological distributions and fingerprints were visualized. The Categorical Boosting (CatBoost) model was identified as optimal, achieving high accuracy (96.67 %), precision (96.51 %), recall (97.22 %), and F1-score (96.72 %), with an AUC of 100 %. The SHapley Additive exPlanations (SHAP) framework was applied to interpret feature importance, enhancing model credibility and operational transparency. This study demonstrates that integrating interpretable machine learning with HS-GC-IMS can elucidate the complex relationships between diverse on-site combustion residue sources and their characteristic VOC profiles. These findings are expected to provide a scientific foundation for fire case investigations, compensation claims, and liability assessments.
Keywords: Combustion residues; Interpretable machine learning; Volatile organic components; HS-GC-IMS; Public security
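
The abstract reports accuracy, precision, recall, and F1-score for a multi-class classifier but does not state how the per-class values were averaged. As a minimal sketch, assuming macro averaging (an unweighted mean over classes), the metrics could be computed from a confusion matrix as shown below. The function name `macro_metrics` and the 3-class confusion matrix are hypothetical illustrations, not data or code from the study.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical confusion matrix for three residue classes (toy values only).
toy_confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]

if __name__ == "__main__":
    for name, value in macro_metrics(toy_confusion).items():
        print(f"{name}: {value:.4f}")
```

Under macro averaging, precision and recall can differ from accuracy even on a balanced test set, because each class contributes equally regardless of how its errors are distributed across the other classes.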