Predicting wildfire occurrences in Portugal using machine learning classification models
Jorge Caiado, Mariana Marques,
Predicting wildfire occurrences in Portugal using machine learning classification models,
Ecological Informatics,
Volume 92,
2025,
103455,
ISSN 1574-9541,
https://doi.org/10.1016/j.ecoinf.2025.103455.
(https://www.sciencedirect.com/science/article/pii/S1574954125004649)
Abstract: Wildfires pose significant environmental and socio-economic challenges, particularly in fire-prone regions such as Portugal. The ability to predict wildfire occurrences is essential for improving preparedness and mitigation strategies. This study evaluates the effectiveness of three machine learning classification models (Logistic Regression, Random Forest and XGBoost) in forecasting wildfire occurrences across four Portuguese districts: Lisbon, Porto, Setúbal and Viseu. Using historical fire occurrence data and meteorological variables, the models were trained and tested on different land-use categories, including settlements, brush and agriculture. The results indicate that brush fires are the most predictable due to strong climatic influences, with models achieving F1-scores above 0.93. Settlement fires, in contrast, were more challenging to predict, likely due to human-driven variability, whereas agricultural fires exhibited intermediate predictability. To address class imbalance in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) was applied, which improved recall at the cost of lower precision. Feature importance analysis highlights the influence of long-term temporal trends, meteorological conditions and human activity on wildfire risk. These findings demonstrate the potential of machine learning models in wildfire forecasting and provide valuable insights for policymakers and fire management authorities in designing targeted prevention strategies.
Keywords: Wildfire prediction; Machine learning; Fire risk assessment; Climate and fire modeling; Land-use and wildfires; SMOTE and imbalanced data
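
Illustrative note: the abstract mentions SMOTE and a recall/precision trade-off without detail. The sketch below is not the authors' code. It shows, in plain Python, the interpolation idea behind SMOTE and how precision, recall and F1 are computed from confusion counts. The function names (smote_like, precision_recall_f1), the feature values and all counts are hypothetical and chosen only for illustration; none of them come from the paper.

```python
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea).
    This is a simplified sketch, not a full SMOTE implementation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # position along the segment from x to the neighbour
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and
    false negative counts, guarding against division by zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical minority-class samples: (temperature in C, relative humidity in %).
fire_days = [(31.0, 18.0), (34.5, 12.0), (29.0, 22.0), (36.0, 10.0)]
extra = smote_like(fire_days, n_new=6)

# Hypothetical confusion counts before and after oversampling.
before = precision_recall_f1(tp=60, fp=10, fn=40)
after = precision_recall_f1(tp=85, fp=35, fn=15)

print("synthetic samples:", extra)
print("before (P, R, F1):", before)
print("after  (P, R, F1):", after)
```

With these made-up counts, recall rises after oversampling while precision falls, which is the kind of trade-off the abstract describes. In practice one would more likely use an established implementation such as the SMOTE class in the imbalanced-learn package rather than a hand-written version.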