Ticket revenue for sports events is shaped by multiple coupled factors and changes markedly over time. To address this problem, this paper proposes a Transformer-based multimodal prediction model. The method integrates historical ticketing data, event information, user behavior, and public opinion data in a cross‑modal dynamic alignment Transformer architecture (CMDA‑Transformer), and introduces a multi‑scale time series modeling mechanism and a demand sensitivity enhancement module to improve the model's ability to capture complex temporal patterns and multi‑source information. Experiments were conducted on about 24,000 real event records, with ARIMA, XGBoost, and deep learning models (LSTM, GRU, and a standard Transformer) as baselines. The proposed method achieves an RMSE of 18.72, about 5.7% lower than that of the baseline Transformer; its MAPE is about 8.3% lower, and its coefficient of determination R² rises to 0.912. Performance remains stable across different event types and under noise interference, which supports the model's generalization ability and robustness. These findings indicate that the method can improve the accuracy of ticket revenue prediction and provide reliable support for dynamic pricing and ticket optimization.
This work was supported by the Key Research Base for Humanities and Social Sciences (Sports Social Science), Sichuan Federation of Social Sciences & Sichuan Provincial Department of Education [Grant No. TY2025306].