Application of YOLOv10 Integrated with Attention Mechanism in the Senseless Monitoring of Students' Classroom Psychological State

Shaochong Yao1*
1 School of Information Engineering, Xi'an Mingde Institute of Technology, Xi'an, Shaanxi, China
* Corresponding author: Shaochong Yao, Email: yaoshaochong@163.com
International Scientific Technical and Economic Research 2026, Vol. 4, No. 2, pp. 186-205
DOI: 10.71451/ISTAER2621
Received: 7 February 2026; Revised: 15 March 2026; Accepted: 24 April 2026; Published: 8 May 2026
Abstract

This paper proposes a YOLOv10 model integrated with an attention mechanism for the senseless monitoring of students' psychological states in class, aiming at high-precision, real-time, and non-invasive psychological state recognition. The method introduces multi-layer attention modules along both the channel and spatial dimensions to strengthen the representation of key features, and combines lightweight feature enhancement with an end-to-end psychological state classification network to jointly optimize detection and state recognition. The model is validated on a large-scale real classroom dataset (561,200 images spanning multiple disciplines, lighting conditions, and occlusion levels). It achieves an mAP@0.5 of 0.873, a psychological state classification accuracy of 0.835, and an F1-score of 0.812 while sustaining real-time performance at 69 FPS. Ablation experiments show that the attention module and the feature enhancement module improve mAP by 4.4% and 5.3%, respectively, demonstrating the model's robustness in complex scenes. Deployment experiments in 50 real classrooms further verify the system's stability and long-term monitoring capability. The results show that the proposed method delivers high-precision, real-time, and deployable monitoring of students' psychological states in intelligent education scenarios, providing quantifiable data support for classroom management and teaching optimization.

Keywords
YOLOv10; Attention mechanism; Classroom psychological state; Senseless monitoring; Real-time target detection
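The channel- and spatial-dimension attention described in the abstract can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the paper's implementation: the shared two-layer MLP for channel attention follows the common squeeze-and-excitation pattern, and the spatial gate here averages the channel-wise mean and max maps in place of the small convolution a full module would use. All names (`channel_attention`, `spatial_attention`, the reduction ratio `r`) are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """Gate each channel of x (shape (C, H, W)) using pooled descriptors.

    A shared MLP (w1: reduction, w2: expansion) is applied to both the
    average- and max-pooled channel descriptors; their sum is squashed
    to per-channel weights in (0, 1).
    """
    avg = x.mean(axis=(1, 2))                      # (C,)
    mx = x.max(axis=(1, 2))                        # (C,)
    att = sigmoid(w2 @ np.maximum(w1 @ avg, 0.0)
                  + w2 @ np.maximum(w1 @ mx, 0.0))  # (C,)
    return x * att[:, None, None]

def spatial_attention(x):
    """Gate each spatial location of x (shape (C, H, W)).

    Channel-wise mean and max maps are merged into one saliency map;
    a real module would pass their concatenation through a small conv.
    """
    avg = x.mean(axis=0)                           # (H, W)
    mx = x.max(axis=0)                             # (H, W)
    att = sigmoid(0.5 * (avg + mx))                # (H, W), stand-in for a conv
    return x * att[None, :, :]

rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
w1 = rng.standard_normal((C // r, C)) * 0.1        # reduction layer weights
w2 = rng.standard_normal((C, C // r)) * 0.1        # expansion layer weights
feat = rng.standard_normal((C, H, W))

out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 4, 4) — attention preserves the feature-map shape
```

Applying the channel gate before the spatial gate, as above, mirrors the sequential ordering commonly used for this family of attention modules; the output keeps the input shape, so the module can be dropped between backbone stages without altering the surrounding architecture.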