Sinkhorn imputation-based deep reinforcement learning approach for radio frequency parameter optimizations
Shihao Jia, Keqin Zhang, Zhiwei Chen, Xiaohu Ge,
Sinkhorn imputation-based deep reinforcement learning approach for radio frequency parameter optimizations,
Journal of Information and Intelligence,
2025,
ISSN 2949-7159,
https://doi.org/10.1016/j.jiixd.2025.12.006.
(https://www.sciencedirect.com/science/article/pii/S2949715925000769)
Abstract: With the evolution of fifth-generation (5G) and upcoming sixth-generation (6G) mobile communication systems, the design of the air interface must take into account two primary objectives: flexibility and low power consumption. Under a flexible physical-layer architecture, the joint configuration of the associated communication parameters becomes increasingly complex. Traditional optimization methods often focus on a single parameter and therefore fail to capture the intricate coupling between the baseband and Radio Frequency (RF) parameters. To address this issue, power consumption minimization that accounts for multiple RF parameters is formulated as a black-box nonlinear optimization problem. To solve the formulated problem, an RF data imputation-based Deep Reinforcement Learning (DRL) technique is proposed to reduce power consumption by jointly optimizing the bandwidth, number of data streams, precoding scheme, number of RF chains, DAC resolution, and transmit power. In particular, an optimal transport imputation technique is introduced to estimate the relationship between these RF parameters and the power consumption, data rate, and adjacent channel leakage ratio, yielding a complete RF lookup table. Based on the estimated lookup table, a Deep Q-Network (DQN) is trained efficiently to output a flexible RF configuration policy. This procedure is summarized as a two-stage approach, referred to as the Sinkhorn imputation-based DRL method. Simulation results demonstrate that the proposed approaches achieve average power reductions of 60.8% and 59.3%, respectively, compared to typical 5G configurations. Moreover, relative to the RF switch-off scheme commonly used in practice, the proposed approaches still provide additional power savings of 47.7% and 46.6%, respectively.
Keywords: Radio frequency (RF) optimization; Deep reinforcement learning; Flexible radio; Low power consumption; Sinkhorn imputation
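Note: The abstract names Sinkhorn-based optimal transport as the basis of the imputation stage but gives no algorithmic detail. The sketch below illustrates only the generic Sinkhorn iteration for entropy-regularized optimal transport between two small point sets. It is not the authors' imputation procedure. The function name `sinkhorn_plan`, the regularization value `eps`, the iteration count, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn_plan(a, b, cost, eps=0.5, n_iters=500):
    """Generic Sinkhorn iteration (illustrative, not the paper's method).

    a, b  : source and target weight vectors, each summing to 1
    cost  : pairwise cost matrix of shape (len(a), len(b))
    eps   : entropic regularization strength (assumed value)
    """
    K = np.exp(-cost / eps)      # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows toward marginal a
        v = b / (K.T @ u)        # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

# Toy data: two small point clouds standing in for two batches of samples.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
y = rng.normal(size=(7, 2))

# Squared Euclidean cost, normalized to [0, 1] for numerical stability.
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(5, 1.0 / 5)
b = np.full(7, 1.0 / 7)

plan = sinkhorn_plan(a, b, cost)
sinkhorn_cost = float((plan * cost).sum())  # transport cost under the plan
```

The returned `plan` is a coupling whose row and column sums approximate `a` and `b`. In imputation schemes built on optimal transport, a cost of this kind between batches is typically what gets minimized with respect to the missing entries, although how the paper applies it to the RF lookup table is not stated in the abstract.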