
Bug Report: File Path Resolution in TTS Generation with Chatterbox via OpenAI API #542

@rohan-sircar


Hi, I encountered an issue with the Chatterbox OpenAI API integration: it complained that it could not find the voice files.

So I took a stab at fixing it and was successful. My fix just modifies the selection value in the React UI testing page to prepend the correct relative path, and it works fine. Let me know if this works for you or if you would like to use a different approach.

Below is an AI-assisted bug report that gives an overview:


Issue Summary

When attempting to generate TTS audio using Chatterbox via the OpenAI API, the system fails to locate the reference audio file Sloane.wav due to incorrect path resolution.

Reproduction Steps

  1. Navigate to: http://react-ui/tools/api_config
  2. Click the chatterbox_default preset
  3. Select a voice file from the checklist, e.g. Sloane
  4. Observe the configuration value: "audio_prompt_path": "Sloane.wav" in the text area
  5. Attempt to run a test TTS generation with Chatterbox

Error Details

The system produces the following error stack trace:

Traceback (most recent call last):
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/gradio/queueing.py", line 624, in process_events
    response = await route_utils.call_process_api(
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
    output = await app.get_blocks().process_api(
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/gradio/blocks.py", line 2015, in process_api
    result = await self.call_function(
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/gradio/blocks.py", line 1562, in call_function
    prediction = await anyio.to_thread.run_sync(  # type: ignore
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/gradio/utils.py", line 865, in wrapper
    response = f(*args, **kwargs)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_kokoro_tts_api/main.py", line 151, in test_api_with_open_ai
    result = preset_adapter(request, text)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_kokoro_tts_api/api.py", line 189, in preset_adapter
    audio_result = generic_tts_adapter(text, params, model)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_kokoro_tts_api/api.py", line 177, in generic_tts_adapter
    return chatterbox_adapter(text, params)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_kokoro_tts_api/api.py", line 232, in wrapper
    return func(*args, **kwargs)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_kokoro_tts_api/api.py", line 258, in chatterbox_adapter
    return tts(text, **params)
  File "/opt/nvme/1/TTS-WebUI/installer_files/env/lib/python3.10/site-packages/extension_chatterbox/api.py", line 301, in tts
    raise gr.Error(f"Error: {e}")
gradio.exceptions.Error: "Error: [Errno 2] No such file or directory: 'Sloane.wav'"
{
  type: 'status',
  endpoint: '/open_ai_api_test_voice_preset',
  fn_index: 303,
  time: 2025-07-07T06:00:04.232Z,
  original_msg: undefined,
  queue: true,
  message: "Error: [Errno 2] No such file or directory: 'Sloane.wav'",
  visible: true,
  duration: 10,
  stage: 'error',
  code: undefined,
  success: false
}

Key error message:

{
  "message": "Error: [Errno 2] No such file or directory: 'Sloane.wav'"
}

Root Cause Analysis

The error occurs because:

  1. The application attempts to load Sloane.wav from the current working directory
  2. The actual voice files are stored in the voices/chatterbox directory
  3. The React UI config generator doesn't prepend the correct path to voice file selections
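The resolution behaviour behind points 1 and 2 can be illustrated with a small sketch. This is a simplified model of how a relative path is joined onto the working directory, not the actual loading code in extension_chatterbox; the function name resolveFromCwd and the working directory below are hypothetical.

```typescript
// Simplified POSIX-style resolution: absolute paths are kept as-is,
// relative paths are joined onto the working directory.
function resolveFromCwd(cwd: string, filePath: string): string {
  if (filePath.charAt(0) === "/") {
    return filePath;
  }
  // Strip trailing slashes from cwd so the join never doubles a separator.
  return `${cwd.replace(/\/+$/, "")}/${filePath}`;
}

// Hypothetical working directory, used only for illustration.
const cwd = "/path/to/TTS-WebUI";

// The bare file name points at the project root, where no voice file exists.
console.log(resolveFromCwd(cwd, "Sloane.wav"));
// With the directory prefix, the path points into the voices folder.
console.log(resolveFromCwd(cwd, "voices/chatterbox/Sloane.wav"));
```

Under this model the bare name maps to /path/to/TTS-WebUI/Sloane.wav, which matches the "No such file or directory" error if the files live under voices/chatterbox.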

Proposed Solution

Update the React UI config generator code to prepend the correct base path voices/chatterbox/ to voice file selections, so that the generated configuration uses the correct relative path.
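A minimal sketch of what such a change could look like in the UI code. The names toVoicePath, withVoice, CHATTERBOX_VOICE_DIR and the ChatterboxParams shape are assumptions made for illustration and are not taken from the TTS-WebUI source; only the voices/chatterbox directory and the audio_prompt_path key come from this report.

```typescript
// Base directory as reported in this issue; the constant name is hypothetical.
const CHATTERBOX_VOICE_DIR = "voices/chatterbox";

// Prepend the voice directory to a bare file name picked in the UI.
// Empty values and values that already contain a path separator are
// returned unchanged, so applying the helper twice does not double the prefix.
function toVoicePath(selection: string, baseDir: string = CHATTERBOX_VOICE_DIR): string {
  if (selection === "" || /[\\/]/.test(selection)) {
    return selection;
  }
  return `${baseDir}/${selection}`;
}

// Hypothetical shape of the preset params shown in the text area.
interface ChatterboxParams {
  audio_prompt_path?: string;
  [key: string]: unknown;
}

// Return a copy of the params with the selected voice written as a relative path.
function withVoice(params: ChatterboxParams, selection: string): ChatterboxParams {
  return { ...params, audio_prompt_path: toVoicePath(selection) };
}

const updated = withVoice({ exaggeration: 0.5, cfg_weight: 0.5 }, "Sloane.wav");
console.log(JSON.stringify(updated));
```

Leaving already-prefixed values alone keeps manually edited configs working. Resolving the path on the server side instead would be an alternative design, which this sketch does not cover.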

Configuration Changes

Before:

"audio_prompt_path": "Sloane.wav"

After:

"audio_prompt_path": "voices/chatterbox/Sloane.wav"

Verification

After implementing the fix:

  1. The TTS generation works successfully
  2. The system correctly loads the reference audio file from:
    'audio_prompt_path': 'voices/chatterbox/Sloane.wav'
    
  3. Output shows successful generation:
    Using chatterbox with params: ('Hello, this is a test of the voice synthesis.', {'exaggeration': 0.5, 'cfg_weight': 0.5, 'temperature': 0.8, 'model_name': 'just_a_placeholder', 'device': 'auto', 'dtype': 'bfloat16', 'audio_prompt_path': 'voices/chatterbox/Sloane.wav'}), {}
    Using device: cuda
    Using cached model 'Chatterbox on cuda with torch.bfloat16' in namespace 'chatterbox'.
    Generating chunk: Hello, this is a test of the voice synthesis.
    Estimated token count: 66
    Sampling:  12%|█▏        | 120/1000 [00:02<00:19, 45.02it/s]
    
