Closed
Labels: bug (Something isn't working)
Description
Checks
- This template is only for bug reports; usage problems should go under 'Help Wanted'.
- I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
- I have searched for existing issues, including closed ones, and couldn't find a solution.
- I am using English to submit this issue to facilitate community communication.
Environment Details
Python 3.10, CUDA 12.8, Ubuntu 22.04
Steps to Reproduce
- pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
- clone the F5-TTS repository and cd into it
- pip install -e .
- f5-tts_finetune-gradio --share
✔️ Expected Behavior
Transcripts should be generated successfully for the dataset audio.
❌ Actual Behavior
Warning:
Using chunk_length_s is very experimental with seq2seq models. The results will not necessarily be entirely accurate and will have caveats. More information: huggingface/transformers#20104. Ignore this warning with pipeline(..., ignore_warning=True). To use Whisper for long-form transcription, use rather the model's generate method directly as the model relies on it's own chunking mechanism (cf. Whisper original paper, section 3.8. Long-form Transcription).
Error:
transcribe complete samples : 0
path : /teamspace/studios/this_studio/F5-TTS/src/f5_tts/../../data/test_telugu_pinyin/wavs
error files : 1 (there is only 1 file in the dataset)
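
For context, the chunking warning above comes from the transformers automatic-speech-recognition pipeline. Below is a minimal, hypothetical sketch of transcribing a single wav with that pipeline while silencing the warning as the message suggests; the checkpoint name, wav path, and language setting are assumptions for illustration, not the actual code used by f5-tts_finetune-gradio.

```python
# Hypothetical sketch, not the actual F5-TTS transcription code.
# Checkpoint, wav path, and language below are placeholder assumptions.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v3",   # assumed Whisper checkpoint
    torch_dtype=torch.float16,
    device="cuda:0",
)

result = asr(
    "data/test_telugu_pinyin/wavs/example.wav",  # placeholder wav path
    chunk_length_s=30,            # chunked decoding is what triggers the warning
    ignore_warning=True,          # suppresses it, as the warning text suggests
    generate_kwargs={"language": "telugu", "task": "transcribe"},
)
print(result["text"])
```

Running the single wav in isolation like this may surface the underlying exception behind "error files : 1", since the gradio log only reports the count.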