Using machine learning to predict child active transportation prevalence
Tate HubkaRao, Alberto Nettel-Aguirre, Marie-Soleil Cloutier, Brent E. Hagel,
Using machine learning to predict child active transportation prevalence,
Journal of Transport & Health,
Volume 45,
2025,
102197,
ISSN 2214-1405,
https://doi.org/10.1016/j.jth.2025.102197.
(https://www.sciencedirect.com/science/article/pii/S2214140525002178)
Abstract:
Background
Active school transportation (AST) can have a host of physical and mental health benefits. Unfortunately, child AST rates have declined over the last few decades. Changes to the built environment can improve AST prevalence. Due to the complexity within the road system, machine learning models may hold promise to accurately predict factors related to child AST. As such, our aim was to train and evaluate a machine learning algorithm to predict the prevalence of child AST.
Methods
Data were collected from the CHASE (CHild Active-transportation Safety and the Environment) study's geodatabase, which includes seven Canadian municipalities/regions. The proportion of enrolled students using AST at each school was assessed by observing students arriving at school in May/June of 2018 or 2019. Data were aggregated to the school catchment zone, which served as the unit of analysis. Both national and city-specific models were trained and validated. Root mean squared error was used to assess prediction accuracy. A measure of variable importance was also calculated.
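As an illustration of the accuracy metric named above, root mean squared error (RMSE) is the square root of the mean squared difference between observed and predicted values. The following is a minimal sketch only, not the authors' code: the function name `rmse` and the school-level proportions are hypothetical and are not study data.

```python
import math


def rmse(observed, predicted):
    """Return the root mean squared error of two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if len(observed) == 0:
        raise ValueError("at least one observation is required")
    # Squared difference between each observed and predicted value.
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    # Mean of the squared differences, then the square root.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical AST proportions for four schools (illustrative, not study data).
observed = [0.40, 0.55, 0.73, 0.60]
predicted = [0.45, 0.50, 0.70, 0.65]
error = rmse(observed, predicted)
print(f"RMSE: {error:.4f}")
```

Because AST prevalence is a proportion between 0 and 1, RMSE is on the same scale, so lower values indicate predictions closer to the observed proportions.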
Results
A total of 541 elementary schools were included. Median city AST prevalence ranged from 0.4 (Calgary) to 0.73 (Montreal). National and city-specific models resulted in similar prediction accuracy. Population density, Walk Score®, proportion of child population enrolled in the school, and size of residential area within each school catchment zone were frequently highly ranked in importance.
Conclusions
Population and housing density were the two most important predictors of AST prevalence. Policies that can increase population and housing density will, therefore, likely increase AST among school-aged children.
Keywords: Machine learning; Active school transportation; Children; Decision-trees; Prediction