Research on a Multi-Agent Collaborative Decision-Making Algorithm for Supply Chain Management

Multi-node collaborative decision-making in supply chain management faces three key challenges: ambiguous credit assignment, low exploration efficiency, and insufficient robustness. To address them, this paper proposes HGA-MADDPG, a multi-agent collaborative decision-making algorithm with hybrid local-global credit assignment. The algorithm uses a hierarchical graph attention mechanism to represent the state of the supply chain network topology dynamically. A local credit network quantifies the contribution of each agent's actions to sub-chain objectives, a global credit network quantifies their contribution to system-level indicators, and an adaptive fusion weight based on marginal returns dynamically balances the two. In addition, an adversarial disturbance and resilient training architecture is constructed. It models three types of disturbance (demand mutation, node failure, and transportation delay) and combines adversarial agent injection, a dynamic environment replay buffer, and a two-stage training strategy. Experiments cover a baseline scenario with a four-level supply chain and a dynamic environment driven by real data based on SCDL and WSN, with comparisons against eight baseline algorithms. The results show that HGA-MADDPG reduces total cost by 26.2%, improves the service level by 42.8%, and holds the stockout rate to 3.2%. In the extreme scenario with all three disturbances applied together, its cost deviation rate (29.6%) and recovery time (58 hours) are significantly better than those of existing methods. In a 120-node ultra-large-scale supply chain, it still achieves a cost reduction of 21.5%. Ablation experiments and scalability analysis further verify the effectiveness of each core module.
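
As an illustration of how a marginal-return-based fusion weight could balance local and global credit, the following minimal sketch blends two scalar credit signals. The abstract does not give the functional form, so the logistic weighting, the `temperature` parameter, and the names `fusion_weight` and `fused_credit` are assumptions made only for illustration, not the paper's formulation.

```python
import math


def fusion_weight(marginal_local: float, marginal_global: float,
                  temperature: float = 1.0) -> float:
    """Weight placed on the local credit, in the range [0, 1].

    Assumed form (not taken from the paper): a logistic function of the
    gap between the local and global marginal returns.
    """
    gap = (marginal_local - marginal_global) / temperature
    # Numerically stable logistic function for large positive or negative gaps.
    if gap >= 0:
        return 1.0 / (1.0 + math.exp(-gap))
    e = math.exp(gap)
    return e / (1.0 + e)


def fused_credit(local_credit: float, global_credit: float,
                 marginal_local: float, marginal_global: float,
                 temperature: float = 1.0) -> float:
    """Convex combination of local and global credit (hypothetical helper)."""
    w = fusion_weight(marginal_local, marginal_global, temperature)
    return w * local_credit + (1.0 - w) * global_credit
```

Under this assumed form, equal marginal returns give the two credit signals equal weight, and the weight shifts toward whichever signal currently has the higher marginal return. A smaller `temperature` makes that shift sharper.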
