Low-Bitrate High-Quality Digital Semantic Communication Based on RVQGAN

Xiaojiao Chen, Jing Wang, Jingxuan Huang*, Ming Zeng, Zhong Zheng, Zesong Fei

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Digital semantic communication has attracted considerable attention owing to its potential for integration with modern digital communication systems and its demonstrated significant performance gains. However, although it saves transmission bandwidth, digital semantic communication can degrade task performance at the receiver, particularly in low-bitrate scenarios. In this article, we propose a novel low-bitrate digital semantic communication method for speech transmission, based on a generative model, that achieves high-quality reconstructed speech at low bitrates. In particular, we first investigate a multiscale semantic codec based on a residual vector quantization generative adversarial network (RVQGAN) model, which extracts semantic information and attains high speech reconstruction quality while transmitting at a low bitrate. We then design a channel noise suppression (CNS) module based on U-Net that alleviates the channel effect at low signal-to-noise ratio (SNR) by restoring high-quality semantic features, improving the performance of the proposed method under challenging channel conditions. Moreover, a Transformer-based code predictor is used to further improve the robustness of the proposed method by accounting for both the channel impact and reconstruction quality. Finally, a three-stage training strategy is presented to ensure the effective operation of the proposed multiscale semantic codec, CNS module, and code predictor. Experimental results demonstrate that the proposed method operating at 3 kb/s can save at least 50% of bandwidth while achieving higher speech restoration quality than the baseline method.
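
For readers unfamiliar with residual vector quantization, the idea behind the codec's quantizer can be sketched as follows. This is a minimal illustration of generic RVQ, not the authors' implementation: the function names, the toy codebooks, and the frame rate, stage count, and codebook size in the bitrate calculation are all assumptions chosen for illustration, since the abstract does not state the actual configuration.

```python
import math


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((c - v) ** 2 for c, v in zip(code, vec))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def rvq_encode(codebooks, vec):
    """Quantize vec stage by stage; each stage quantizes the residual left by the previous one."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        indices.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(codebooks, indices):
    """Reconstruct the vector by summing the selected codeword from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


def bitrate_bps(frame_rate_hz, num_stages, codebook_size):
    """Bits per second when each frame sends one index per stage."""
    return frame_rate_hz * num_stages * math.log2(codebook_size)


# Hypothetical two-stage codebooks: a coarse stage followed by a fine stage.
toy_codebooks = [
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]],
]
toy_indices = rvq_encode(toy_codebooks, [1.1, 0.0])
toy_reconstruction = rvq_decode(toy_codebooks, toy_indices)

# Assumed values only: 75 frames/s, 4 stages, 1024 codewords per stage.
example_rate = bitrate_bps(75, 4, 1024)
```

Only the per-stage indices are transmitted, so the bitrate depends on the frame rate, the number of stages, and the codebook size. The assumed values above were picked so that the arithmetic lands on the 3 kb/s operating point quoted in the abstract; the real system may reach that rate with different settings.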

Original language: English
Pages (from-to): 13525-13537
Number of pages: 13
Journal: IEEE Internet of Things Journal
Volume: 12
Issue number: 10
Publication status: Published - 2025
Externally published: Yes

Keywords

  • Digital semantic communication
  • generative model
  • low bitrate
  • speech transmission
