More fixes for the Spark BiCodec module #137

lucasnewman · 2025-05-12T04:57:41Z

Several issues across the modules prevented tokenization and reconstruction from working:

The torch version uses a mel spectrogram directly, not a log-scaled one.
Fixes in the ECAPA_TDNN encoder, Perceiver resampler, and factorized vector quantizer for residuals / norms.
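
To illustrate why the mel spectrogram scaling matters, here is a minimal sketch in plain Python. It is not the code from this PR: `log_compress`, the `eps` clamp, and the frame values are hypothetical and only show how far log-scaled features drift from the linear magnitudes an encoder trained on linear mel input would expect.

```python
import math


def log_compress(mel_frame, eps=1e-5):
    """Log-scale one frame of mel magnitudes, clamping at eps to avoid log(0)."""
    return [math.log(max(value, eps)) for value in mel_frame]


# Hypothetical mel magnitudes for a single frame (illustrative values only).
linear_frame = [0.0, 0.02, 1.5, 40.0]
logged_frame = log_compress(linear_frame)

# Linear mel magnitudes are non-negative, while the log-scaled frame spans
# negative and positive values, so the two are not interchangeable inputs.
print("linear range:", min(linear_frame), max(linear_frame))
print("log range:", min(logged_frame), max(logged_frame))
```

Feeding one representation to a module trained on the other does not raise an error, which is why this kind of mismatch tends to surface only as wrong tokens or poor reconstructions.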

With these changes, semantic and global tokenization and detokenization match the torch version.
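
A parity check of this kind can be sketched as follows. This is an assumption about how one might compare the two implementations, not the procedure used in this PR; `first_mismatch` and the token values are hypothetical, and the token sequences are assumed to have already been converted to plain Python lists of integers.

```python
def first_mismatch(tokens_a, tokens_b):
    """Return the index of the first differing token, or None if both match.

    A length difference counts as a mismatch at the end of the shorter list.
    """
    for index, (a, b) in enumerate(zip(tokens_a, tokens_b)):
        if a != b:
            return index
    if len(tokens_a) != len(tokens_b):
        return min(len(tokens_a), len(tokens_b))
    return None


# Hypothetical token ids standing in for the outputs of the two implementations.
reference_tokens = [812, 44, 1907, 3]
ported_tokens = [812, 44, 1907, 3]

mismatch = first_mismatch(reference_tokens, ported_tokens)
print("tokens match" if mismatch is None else f"first mismatch at index {mismatch}")
```

Reporting the first differing index rather than a plain boolean helps narrow down which frame, and therefore which module, starts to diverge.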

mlx_audio/tts/models/spark/modules/residual.py

Blaizzy

Thanks @lucasnewman, this fixes the voice cloning ✅

LGTM!

lucasnewman and others added 10 commits May 11, 2025 21:50

More fixes for the Spark BiCodec module.

007bed9

Formatting.

ee81fb2

remove unused

6b38585

remove casting

8c2f1d0

set gender none when voice cloning

324a455

support path and mx.array as ref_audio

e348b50

fix conv padding error

0241a20

add model type method

284b831

add normalize audio for SparkTTS

f4b52b0

add clear cache to avoid memory issues

2fb8a59

Blaizzy reviewed May 12, 2025

mlx_audio/tts/models/spark/modules/residual.py (outdated, resolved)

Update mlx_audio/tts/models/spark/modules/residual.py

7f32434

Blaizzy approved these changes May 12, 2025

Blaizzy merged commit bd54740 into Blaizzy:main May 12, 2025

Blaizzy linked an issue May 13, 2025 that may be closed by this pull request

SparkTTS Voice cloning (Wav2vec) #119

Closed

3 tasks

lucasnewman deleted the more-spark-fixes branch May 18, 2025 19:02
