Reclamation-based-Voice(talk)-Conversion

Reclamation-based Voice Conversion (RVC)
PRESENTED BY
GURUMURTHY.V (421122102050)
DHIVYAKUMARAN.K (421122102037)
GOKULRAJ.R (421122102044)
Under The Guidance of
MRS. K. KALAISELVI, AP/CSE

PROBLEM STATEMENT
• Naturalness: Maintaining the naturalness of the source speaker's voice while converting it to the target speaker is challenging. Artifacts and distortions can occur.
• Data Quality: Noise, background interference, and inconsistent recording conditions can negatively impact the performance of the model.
• Accurate Retrieval: The quality of the converted speech depends on the accuracy of the retrieved target speaker utterances. Ineffective retrieval can result in mismatched voice characteristics.

PROBLEM STATEMENT DIAGRAM
• The quality of the data significantly impacts the model's performance.
• Noise, background interference, and inconsistent recording conditions can degrade that performance.
• Timbre conversion and imitation techniques are still imperfect.
• Voice conversion models require a vast amount of data to learn intricate voice characteristics.
• Collecting and processing such a large dataset can be time-consuming and expensive.

PROPOSED SYSTEM
• A powerful speech encoder is trained on a large dataset to extract high-level acoustic features that capture speaker identity and linguistic content.
• Given a target speaker's voice, a set of representative speech segments is extracted and encoded into the same feature space.
• For a given input speech, the encoder generates its corresponding feature representation.
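
The retrieval step implied by these bullets can be illustrated with a minimal sketch. It assumes the encoder has already turned speech into fixed-length feature vectors, and that matching is done by a nearest-neighbour search using cosine similarity; the slides do not specify the similarity measure, so that choice, the function names, and the toy vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def retrieve_nearest(source_features, target_features):
    """For each source frame, return the index of the most similar target frame."""
    matches = []
    for frame in source_features:
        best = max(
            range(len(target_features)),
            key=lambda i: cosine_similarity(frame, target_features[i]),
        )
        matches.append(best)
    return matches


# Toy stand-ins for encoder outputs (hypothetical values, not real speech features).
target_features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
source_features = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.9]]

print(retrieve_nearest(source_features, target_features))  # prints [0, 2]
```

In a real system the exhaustive loop would likely be replaced by an approximate nearest-neighbour index, since the target feature set can hold many thousands of frames.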

EXISTING SYSTEM
S. NO | TITLE | AUTHOR | YEAR PUBLISHED | DRAWBACK
1 | Voice Conversion from a Single Speaker Example | Jun Wu, Yi Li, and Yiheng Liu | 2015 | Requires a large amount of training data for each speaker.
2 | A Waveform Generation Model for High-Quality Speech Synthesis | Tomoki Kaneko, Kazuhiro Takahashi, and Shinji Nakamura | 2018 | Can produce artifacts in the synthesized speech, especially for unseen speakers.
3 | A Generative Adversarial Network Approach to Text-to-Speech Synthesis | Zheng Wang, Jonathan Sotelo, Haoyu Wu, Gregor Kurz | 2017 | The generated speech quality can be sensitive to the quality of the text-to-speech (TTS) model used.
4 | A Generative Model for Raw Audio | Aaron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu | 2016 | Requires a significant amount of computational resources for training and inference.
5 | A Flow-based Generative Model for Text-to-Speech Synthesis | Jongwook Kim, Hyunjun Kim, and Ho-Sang Lee | 2020 | Can struggle with preserving the naturalness and expressiveness of the original voice.
