Add voice matching support for Dia #93

lucasnewman · 2025-04-25T17:01:26Z

This allows for consistent voices across generations by passing a reference audio clip and its transcript. Using the example prompt from the Dia repo:

python -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Testing, 1, 2 3. Is this thing on? [S2] Yeah, it's working. (laughs) [S1] OK, phew. (laughs)" --ref_audio example_prompt.mp3 --ref_text "[S1] Dia is an open weights text to dialogue model. [S2] You get full control over scripts and voices. " --sample_rate 44100 --play
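
For scripted use, the same invocation can be assembled as an argument list, which avoids shell-quoting problems with the `[S1]`/`[S2]` tags and parentheses in the text. This is only a rough sketch: the helper name `build_generate_command` and its defaults are hypothetical, and it relies solely on the flags shown in the command above rather than on any Python API of the package.

```python
import shlex
import sys


def build_generate_command(text, ref_audio, ref_text,
                           model="mlx-community/Dia-1.6B",
                           sample_rate=44100, play=True):
    """Assemble the CLI invocation shown above as an argument list.

    Hypothetical helper: it only mirrors the flags from the example command.
    """
    cmd = [
        sys.executable, "-m", "mlx_audio.tts.generate",
        "--model", model,
        "--text", text,
        "--ref_audio", ref_audio,    # reference clip whose voices are matched
        "--ref_text", ref_text,      # transcript of the reference clip
        "--sample_rate", str(sample_rate),
    ]
    if play:
        cmd.append("--play")
    return cmd


cmd = build_generate_command(
    text="[S1] Testing, 1, 2 3. Is this thing on? [S2] Yeah, it's working. (laughs)",
    ref_audio="example_prompt.mp3",
    ref_text="[S1] Dia is an open weights text to dialogue model. "
             "[S2] You get full control over scripts and voices.",
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` to actually run the generation, assuming the package and the reference audio file are available locally.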

lucasnewman · 2025-04-25T17:25:52Z

mlx_audio/codec/tests/test_descript.py

        self.assertEqual(latents.shape, (1, 96, 250))

        y = model.decode(z).squeeze(-1)
-        self.assertEqual(y.shape, (1, 79_992))


The test changes are unrelated to voice matching, but a change in MLX's FFT implementation broke the existing codec tests, so the expected values need to be updated for the tests to pass.

lucasnewman added 2 commits April 25, 2025 09:57

Add voice matching support for Dia.

eb34b65

Fix codec test failures from MLX update.

91e1fc0

Merge branch 'main' into dia-voice-matching

88ec555

Blaizzy merged commit 77aaefa into Blaizzy:main Apr 26, 2025
1 check passed

lucasnewman deleted the dia-voice-matching branch May 2, 2025 22:15
