Cross-Border Trade Fraud Detection via Integrated Heterogeneous Graph Neural Network and XGBoost
DOI:
https://doi.org/10.71451/ISTAER2603
Keywords:
Cross-border trade fraud detection, Heterogeneous graph neural network, Integrated learning, XGBoost, Risk identification
Abstract
Cross-border trade fraud involves multiple entity types, multiple business relationships, and complex interaction structures, so it is highly heterogeneous and well concealed, which poses significant challenges for traditional risk identification methods. To address the difficulty existing methods have in balancing structural modeling capability with classification performance, this paper proposes a cross-border trade fraud detection framework that combines a heterogeneous graph neural network (HGNN) with the gradient boosting tree model XGBoost. First, the cross-border trade system is modeled as a heterogeneous graph with multiple entity types and multiple relationship types, and the HGNN learns high-order structural and semantic representations of entities in the complex trade network. Then, the graph embedding features and statistical features are fed into XGBoost to achieve high-precision fraud classification. Experimental results on a real cross-border trade dataset show that the proposed model reaches an AUC of 0.966 on the test set, which is 18.7% and 3.4% higher than using XGBoost and HGNN alone, respectively, and that it significantly improves the recall of fraud samples across a variety of typical fraud scenarios. Ablation experiments further verify the key roles of heterogeneous relationship modeling, the attention mechanism, and the integration strategy in the performance improvement. These results show that the HGNN–XGBoost integration framework offers good detection performance and engineering application potential in complex heterogeneous scenarios.
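The feature-construction step described in the abstract (graph embedding features joined with statistical features before classification) can be illustrated with a minimal sketch. It is not the paper's implementation: the toy graph, the node and relation names, the feature values, and the helper functions are all hypothetical, and a single round of unweighted, relation-wise mean aggregation stands in for the learned, attention-based HGNN.

```python
from collections import defaultdict

# Hypothetical toy heterogeneous graph: edges grouped by relation type.
edges = {
    "ships_to": [("firm_a", "firm_b"), ("firm_c", "firm_b")],
    "pays_via": [("firm_a", "bank_x"), ("firm_b", "bank_x"), ("firm_c", "bank_y")],
}

# Hypothetical initial node features (2-dimensional for readability).
node_feats = {
    "firm_a": [1.0, 0.0],
    "firm_b": [0.0, 1.0],
    "firm_c": [1.0, 1.0],
    "bank_x": [0.5, 0.5],
    "bank_y": [0.0, 0.0],
}

# Hypothetical per-firm statistical features (e.g. declared value, shipment count).
stat_feats = {
    "firm_a": [120.0, 3.0],
    "firm_b": [80.0, 5.0],
    "firm_c": [300.0, 1.0],
}


def mean_vec(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def relation_embeddings(edges, node_feats, dim=2):
    """One round of relation-wise mean aggregation over undirected neighbors.

    Each node's output is its own features followed by one mean neighbor
    vector per relation type (relations taken in sorted order).
    """
    neighbors = {rel: defaultdict(list) for rel in edges}
    for rel, pairs in edges.items():
        for src, dst in pairs:
            neighbors[rel][src].append(node_feats[dst])
            neighbors[rel][dst].append(node_feats[src])
    out = {}
    for node, feats in node_feats.items():
        emb = list(feats)
        for rel in sorted(edges):
            emb += mean_vec(neighbors[rel].get(node, []), dim)
        out[node] = emb
    return out


def build_feature_matrix(embeddings, stat_feats):
    """Concatenate graph embedding and statistical features for each firm."""
    names = sorted(stat_feats)
    rows = [embeddings[n] + stat_feats[n] for n in names]
    return names, rows


embeddings = relation_embeddings(edges, node_feats)
names, X = build_feature_matrix(embeddings, stat_feats)
```

In a full pipeline, the rows of `X` together with fraud labels would be passed to a gradient-boosted tree classifier such as `xgboost.XGBClassifier`. Keeping one aggregated vector per relation type, rather than pooling all neighbors together, is what lets the downstream trees treat trade and payment relationships differently.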
Data Availability Statement
The data that support the findings of this study are available upon request from the corresponding author, X.Z.
License
Copyright (c) 2026 International Scientific Technical and Economic Research

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).