TY - JOUR
T1 - Low-Bitrate High-Quality Digital Semantic Communication Based on RVQGAN
AU - Chen, Xiaojiao
AU - Wang, Jing
AU - Huang, Jingxuan
AU - Zeng, Ming
AU - Zheng, Zhong
AU - Fei, Zesong
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2025
Y1 - 2025
N2 - Digital semantic communication has attracted considerable attention owing to its potential for integration with modern digital communication systems and its demonstrated performance gains. However, despite its ability to save transmission bandwidth, digital semantic communication can degrade the performance of tasks at the receiver, particularly in low-bitrate scenarios. In this article, we propose a novel low-bitrate digital semantic communication method for speech transmission, based on a generative model, that achieves high-quality reconstructed speech at low bitrates. In particular, we first investigate a multiscale semantic codec based on a residual vector quantization generative adversarial network (RVQGAN) model, which extracts semantic information and obtains high speech reconstruction quality while transmitting at a low bitrate. We then design a channel noise suppression (CNS) module based on U-Net to alleviate the channel effect at low signal-to-noise ratio (SNR) by restoring high-quality semantic features, which improves the performance of the proposed method under challenging channel conditions. Moreover, a Transformer-based code predictor is utilized to further improve the robustness of the proposed method by accounting for both the channel impact and reconstruction quality. Finally, a three-stage training strategy is presented to ensure the effective operation of the proposed multiscale semantic codec, CNS module, and code predictor module. Experimental results demonstrate that the proposed method operating at 3 kb/s can save at least 50% of bandwidth while achieving higher speech restoration quality than the baseline method.
AB - Digital semantic communication has attracted considerable attention owing to its potential for integration with modern digital communication systems and its demonstrated performance gains. However, despite its ability to save transmission bandwidth, digital semantic communication can degrade the performance of tasks at the receiver, particularly in low-bitrate scenarios. In this article, we propose a novel low-bitrate digital semantic communication method for speech transmission, based on a generative model, that achieves high-quality reconstructed speech at low bitrates. In particular, we first investigate a multiscale semantic codec based on a residual vector quantization generative adversarial network (RVQGAN) model, which extracts semantic information and obtains high speech reconstruction quality while transmitting at a low bitrate. We then design a channel noise suppression (CNS) module based on U-Net to alleviate the channel effect at low signal-to-noise ratio (SNR) by restoring high-quality semantic features, which improves the performance of the proposed method under challenging channel conditions. Moreover, a Transformer-based code predictor is utilized to further improve the robustness of the proposed method by accounting for both the channel impact and reconstruction quality. Finally, a three-stage training strategy is presented to ensure the effective operation of the proposed multiscale semantic codec, CNS module, and code predictor module. Experimental results demonstrate that the proposed method operating at 3 kb/s can save at least 50% of bandwidth while achieving higher speech restoration quality than the baseline method.
KW - Digital semantic communication
KW - generative model
KW - low bitrate
KW - speech transmission
UR - http://www.scopus.com/pages/publications/85219654138
U2 - 10.1109/JIOT.2025.3534462
DO - 10.1109/JIOT.2025.3534462
M3 - Article
AN - SCOPUS:85219654138
SN - 2327-4662
VL - 12
SP - 13525
EP - 13537
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 10
ER -