Evaluating ensemble machine learning and feature optimization for mapping peatland vegetation using UAS-derived data

2025-12-18

Alireza Hamedianfar, Liisa Maanavilja, Heikki Sutinen,
Evaluating ensemble machine learning and feature optimization for mapping peatland vegetation using UAS-derived data,
International Journal of Applied Earth Observation and Geoinformation,
Volume 144,
2025,
104901,
ISSN 1569-8432,
https://doi.org/10.1016/j.jag.2025.104901.
(https://www.sciencedirect.com/science/article/pii/S1569843225005485)
Abstract: To address biodiversity loss and restore ecosystem services, it is important to monitor ecosystem degradation and recovery. Unmanned aerial systems (UAS) provide high-resolution data for site-specific monitoring. During degradation or recovery in peatlands, vegetation composition is a crucial indicator of ecosystem condition. In this study, UAS multispectral images, vegetation indices, and digital surface models (DSM) were used to classify peatland vegetation and land cover types. The classes included hummock and lawn Sphagnum, cottongrass (Eriophorum vaginatum), evergreen dwarf shrubs, bare peat, and water. Previous research has focused on categorizing peatland vegetation, with Random Forest (RF) being the most widely used machine learning algorithm. This study evaluates the classification performance of several ensemble machine learning algorithms: RF, Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Categorical Boosting (CatBoost). Classifications were performed using both default and optimized hyperparameters. Optuna was applied to improve classifier performance by systematically exploring a wide range of hyperparameter values. Additionally, SHapley Additive exPlanations (SHAP) feature importance was used to improve the explainability of the classification algorithms and to pinpoint the critical features that drive classification outcomes. The results indicate that hyperparameter optimization raised overall accuracy from 81–86% with default hyperparameters to 86–87% with optimized hyperparameters. Although the algorithms performed similarly overall, RF differentiated lawn Sphagnum better under both default and optimized hyperparameter settings, while CatBoost achieved higher accuracies for Eriophorum vaginatum tussocks and hummock Sphagnum. The SHAP results identified the DSM and the red-edge band as the influential features. These findings emphasize the value of hyperparameter optimization and feature importance analysis in improving classification accuracy and feature interpretability for peatland vegetation classes.
Keywords: Image classification; Peatlands; Restoration monitoring; Sphagnum; Ensemble algorithms; Optuna hyperparameter optimization; SHAP feature importance
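
Illustrative note: the abstract compares classifiers by overall accuracy and by how well individual classes are separated. The sketch below shows one common way such figures can be derived from a confusion matrix. It is not the authors' code; the function names are hypothetical, the three-class subset is chosen only for brevity, and the counts are made-up toy values that do not come from the study.

```python
from typing import Dict, List


def overall_accuracy(cm: List[List[int]]) -> float:
    """Share of all samples that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_accuracy(
    cm: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """Producer's and user's accuracy per class.

    Assumed layout: rows are reference classes, columns are predicted classes.
    Producer's accuracy = correct / row total (equivalent to recall).
    User's accuracy = correct / column total (equivalent to precision).
    """
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(cm[i])
        col_total = sum(row[i] for row in cm)
        result[label] = {
            "producers": cm[i][i] / row_total if row_total else 0.0,
            "users": cm[i][i] / col_total if col_total else 0.0,
        }
    return result


# Made-up toy counts for illustration only (NOT results from the study).
labels = ["lawn Sphagnum", "hummock Sphagnum", "bare peat"]
cm = [
    [8, 2, 0],  # reference: lawn Sphagnum
    [1, 7, 2],  # reference: hummock Sphagnum
    [0, 1, 9],  # reference: bare peat
]

oa = overall_accuracy(cm)
acc = per_class_accuracy(cm, labels)

print(f"Overall accuracy: {oa:.2f}")
for name, scores in acc.items():
    print(f"{name}: producer's={scores['producers']:.2f}, user's={scores['users']:.2f}")
```

Two classifiers with nearly identical overall accuracy can still differ in their per-class values, which is the kind of difference the abstract reports between RF and CatBoost.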