<?xml version="1.0"?>
<!DOCTYPE ArticleSet PUBLIC "-//NLM//DTD PubMed 2.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/query/static/PubMed.dtd">
<ArticleSet>
  <Article>
    <Journal>
      <PublisherName>Sichuan Knowledgeable Intelligent Sciences</PublisherName>
      <JournalTitle>International Scientific Technical and Economic Research</JournalTitle>
      <Issn>2959-1309</Issn>
      <Volume>4</Volume>
      <Issue>2</Issue>
      <PubDate PubStatus="epublish">
        <Year>2026</Year>
        <Month>04</Month>
        <Day>08</Day>
      </PubDate>
    </Journal>
    <ArticleTitle>Research on Transformer-Based Action Sequence Modeling of Intangible Cultural Heritage Shadow Play Using Attention Mechanisms</ArticleTitle>
    <FirstPage>51</FirstPage>
    <LastPage>77</LastPage>
    <ELocationID EIdType="doi">10.71451/ISTAER2615</ELocationID>
    <Language>eng</Language>
    <AuthorList>
      <Author>
        <FirstName>Yuxiao</FirstName>
        <LastName>Liu</LastName>
        <Affiliation>Art and Design, Beijing City University, Shunyi District, Beijing, China</Affiliation>
        <Identifier Source="ORCID">0009-0003-7951-314X</Identifier>
      </Author>
      <Author>
        <FirstName>Shuolei</FirstName>
        <LastName>Feng</LastName>
        <Affiliation>Department of Information Science, Beijing City University, Shunyi District, Beijing, China</Affiliation>
        <Identifier Source="ORCID">0009-0003-8967-5134</Identifier>
      </Author>
      <Author>
        <FirstName>Mengyu</FirstName>
        <LastName>Liu</LastName>
        <Affiliation>Art and Design, Beijing City University, Shunyi District, Beijing, China</Affiliation>
        <Identifier Source="ORCID">0009-0003-0522-0280</Identifier>
      </Author>
    </AuthorList>
    <History>
      <PubDate PubStatus="received">
        <Year>2026</Year>
        <Month>04</Month>
        <Day>08</Day>
      </PubDate>
    </History>
    <Abstract>
Shadow puppet movements are characterized by long-range spatiotemporal dependencies, pronounced stylization, and complex control and transmission relationships. These characteristics pose two major challenges to digital modeling: capturing long-range dependencies and preserving artistic style. This paper proposes an improved Transformer model incorporating a multi-level attention mechanism for modeling and generating action sequences of intangible cultural heritage shadow play. The model designs three collaborative attention modules: spatial attention introduces skeletal adjacency priors to enhance structural plausibility; temporal attention captures long-range cross-frame dependencies; and style-aware attention adjusts local computations via global feature statistics to preserve genre-specific performance styles. Furthermore, an enhanced architecture that alternately stacks graph convolution and Transformer layers is adopted, and sparse and hierarchical modeling strategies reduce computational complexity from quadratic to approximately linear in sequence length. Experimental results show that the average joint position error of the proposed method in motion prediction tasks is 31.4, which is 11.8 lower than that of the standard Transformer; style loss decreases by 24.6%; and under the extreme condition of 50% missing key points, the error ratio is 1.31, significantly better than the comparison methods. The proposed model provides effective technical support for the digital preservation and intelligent inheritance of intangible cultural heritage.
</Abstract>
  </Article>
</ArticleSet>
