Low-Bitrate High-Quality Digital Semantic Communication Based on RVQGAN

Xiaojiao Chen, Jing Wang, Jingxuan Huang*, Ming Zeng, Zhong Zheng, Zesong Fei

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

2 Citations (Scopus)

Abstract

Digital semantic communication has attracted considerable attention because it can be integrated with modern digital communication systems and has demonstrated significant performance gains. However, although it saves transmission bandwidth, digital semantic communication can degrade the performance of tasks at the receiver, particularly in low-bitrate scenarios. In this article, we propose a novel low-bitrate digital semantic communication method for speech transmission, based on a generative model, that achieves high-quality reconstructed speech at a low transmission bitrate. In particular, we first investigate a multiscale semantic codec based on a residual vector quantization generative adversarial network (RVQGAN) model, which extracts semantic information and attains high speech reconstruction quality while transmitting at a low bitrate. We then design a channel noise suppression (CNS) module based on U-Net that alleviates the channel effect at low signal-to-noise ratio (SNR) by restoring high-quality semantic features, improving the performance of the proposed method under challenging channel conditions. Moreover, a Transformer-based code predictor is used to further improve the robustness of the proposed method by accounting for both the channel impact and the reconstruction quality. Finally, a three-stage training strategy is presented to ensure the effective operation of the proposed multiscale semantic codec, CNS module, and code predictor module. Experimental results demonstrate that the proposed method, operating at 3 kb/s, can save at least 50% of bandwidth while achieving higher speech restoration quality than the baseline method.
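The abstract names residual vector quantization as the core of the codec but does not spell out how it works. The following is a minimal sketch of the general technique only, not the authors' implementation: each stage quantizes the residual left by the previous stage, and the decoder sums the selected codewords. The function names, the toy codebooks, and the frame rate, codebook count, and codebook size used in the bitrate helper are all illustrative assumptions; the record does not state the paper's actual configuration.

```python
import numpy as np


def rvq_encode(x, codebooks):
    """Quantize one feature vector with a stack of codebooks.

    Each stage picks the codeword nearest to the current residual,
    then subtracts it, so later stages refine earlier ones.
    Returns one integer index per stage.
    """
    residual = np.asarray(x, dtype=float).copy()
    indices = []
    for cb in codebooks:
        # Squared Euclidean distance from the residual to every codeword.
        dists = np.sum((cb - residual) ** 2, axis=1)
        idx = int(np.argmin(dists))
        indices.append(idx)
        residual = residual - cb[idx]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected codewords."""
    return sum(cb[i] for cb, i in zip(codebooks, indices))


def bitrate_bps(frames_per_second, n_codebooks, codebook_size):
    """Bits per second needed to send one index per codebook per frame."""
    return frames_per_second * n_codebooks * np.log2(codebook_size)


# Toy two-stage example (hypothetical codebooks, not learned ones).
codebooks = [
    np.array([[0.0, 0.0], [1.0, 1.0]]),
    np.array([[0.0, 0.0], [0.25, -0.25]]),
]
indices = rvq_encode([1.2, 0.8], codebooks)
reconstruction = rvq_decode(indices, codebooks)
```

Under these assumed numbers, `bitrate_bps(75, 4, 1024)` gives 3000 bits per second, which shows how a frame rate, a number of quantizer stages, and a codebook size combine into a bitrate of the order quoted in the abstract. Fewer stages lower the bitrate at the cost of a coarser reconstruction.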

Original language: English
Pages (from-to): 13525-13537
Number of pages: 13
Journal: IEEE Internet of Things Journal
Volume: 12
Issue number: 10
Publication status: Published - 2025
Externally published
