Commit f068e80

Fix SparkTTS Detokenize (TTS) (#132)
* fix voice cloning and TTS detokenize
* bump version
* format
1 parent c685a8e commit f068e80

2 files changed: +4 -4 lines changed

mlx_audio/tts/models/spark/modules/speaker/speaker_encoder.py

Lines changed: 3 additions & 3 deletions
@@ -98,9 +98,9 @@ def tokenize(self, mels: mx.array) -> mx.array:
         return indices
 
     def detokenize(self, indices: mx.array) -> mx.array:
-        zq = self.quantizer.get_output_from_indices(
-            indices.transpose(0, 3, 1, 2)
-        ).transpose(0, 3, 1, 2)
+        zq = self.quantizer.get_output_from_indices(indices.swapaxes(-1, -2)).swapaxes(
+            -1, -2
+        )
         x = zq.reshape(zq.shape[0], -1)
         d_vector = self.project(x)
         return d_vector
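
The commit message does not say why the transposes were replaced, so the following is only a shape-bookkeeping sketch of how the two calls differ, not a statement of the author's reasoning. It tracks shapes with plain tuples instead of real `mx.array` objects and ignores the quantizer call that sits between the two axis operations. The helper names `transpose_shape` and `swapaxes_shape` and the example shapes are hypothetical, introduced only for illustration.

```python
def transpose_shape(shape, axes):
    """Shape after a transpose with an explicit axis permutation."""
    if sorted(axes) != list(range(len(shape))):
        raise ValueError(f"axes {axes} do not match a {len(shape)}-D shape")
    return tuple(shape[a] for a in axes)


def swapaxes_shape(shape, a, b):
    """Shape after swapping two axes; negative indices count from the end."""
    out = list(shape)
    out[a], out[b] = out[b], out[a]
    return tuple(out)


# Hypothetical shapes, chosen only for illustration.
shape_4d = (1, 2, 3, 4)
shape_3d = (1, 2, 3)

# Applying the (0, 3, 1, 2) permutation twice does not restore the layout.
once = transpose_shape(shape_4d, (0, 3, 1, 2))
twice = transpose_shape(once, (0, 3, 1, 2))

# Swapping the last two axes twice is a round trip for any rank >= 2.
round_trip_4d = swapaxes_shape(swapaxes_shape(shape_4d, -1, -2), -1, -2)
round_trip_3d = swapaxes_shape(swapaxes_shape(shape_3d, -1, -2), -1, -2)

# The 4-axis permutation cannot be applied to a 3-D shape at all.
try:
    transpose_shape(shape_3d, (0, 3, 1, 2))
    rejected_3d = False
except ValueError:
    rejected_3d = True
```

Two properties follow from the sketch: `swapaxes(-1, -2)` is its own inverse and does not depend on the number of leading axes, whereas `transpose(0, 3, 1, 2)` requires exactly four axes and is not undone by applying it a second time.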

mlx_audio/version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.2.0"
+__version__ = "0.2.1"
