A Latent Diffusion Model for Heart Sound Synthesis

Yang Tan, Haojie Zhang, Yi Chang, Qingrong Wu, Kun Qian*, Bin Hu*, Bjorn W. Schuller, Yoshiharu Yamamoto

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Heart sound analysis is already widely used to develop diagnostic systems for heart conditions. However, existing heart sound databases are not sufficient to train a large number of deep learning models, and they are imbalanced: normal heart sounds consistently outnumber abnormal ones. We are therefore interested in algorithms that can generate heart sounds to augment the existing databases. We enter the field of large-scale modelling in medical synthesis by proposing a latent diffusion model for heart sound generation, which can generate highly realistic heart sounds. We further guide the synthesis process through text prompts and labels, opening up the research field of prompted heart sound synthesis. In our experiments, the proposed method achieves good results compared to existing mathematical models. Moreover, when the generated audio is used as an unlabelled dataset in semi-supervised learning, it improves the classification model. This indicates that heart sounds generated by the diffusion model are of great value.

Original language: English
Title of host publication: 2025 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1314-1319
Number of pages: 6
ISBN (Electronic): 9798331542856
Publication status: Published - 2025
Externally published: Yes
Event: 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025 - Xi'an, China
Duration: 21 Mar 2025 - 23 Mar 2025

Publication series

Name: 2025 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025

Conference

Conference: 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025
Country/Territory: China
City: Xi'an
Period: 21/03/25 - 23/03/25

Keywords

  • Data augmentation
  • Heart sound
  • Latent diffusion models
  • Semi-supervised learning
  • Sound synthesis
