fix(openai): Fix and improve OpenAI audio handling #511

aimeos · 2025-07-23T08:28:25Z

Uploading audio files to OpenAI doesn't work at all because the Laravel HTTP facade is used incorrectly for multipart requests. This PR fixes the upload problem and improves handling of the different response formats the whisper-1 API returns, depending on the provider-specific parameters sent.
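To illustrate the response-format point, here is a minimal sketch, assuming the transcription endpoint returns a JSON object with a `text` key for `json` and `verbose_json`, and a plain-text body for `text`, `srt` and `vtt`. The helper name `extractTranscript` is hypothetical and is not part of Prism or of this PR.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Prism): pull the transcript out of a
 * transcription response body for a given `response_format`.
 */
function extractTranscript(string $body, string $responseFormat = 'json'): string
{
    // Assumption: `json` and `verbose_json` wrap the transcript in a JSON object.
    if (in_array($responseFormat, ['json', 'verbose_json'], true)) {
        $data = json_decode($body, true, 512, JSON_THROW_ON_ERROR);

        return is_array($data) ? (string) ($data['text'] ?? '') : '';
    }

    // Assumption: `text`, `srt` and `vtt` are returned as the raw body.
    return $body;
}
```

A caller that requests `vtt` would then get the subtitle text back unchanged, while the default `json` case is decoded first.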

sixlive · 2025-07-24T18:39:23Z

@pushpak1300 can you take a look at this?

kinsta · 2025-07-25T13:35:19Z

Preview deployments for prism ⚡️

Status	Branch preview	Commit preview
✅ Ready	Visit preview	Visit preview

Commit: 5efb9f2f73418ead19347486e7fb11fd942b4d62

Deployment ID: d8821fe4-2419-4f4b-8263-0307c74262f6

Static site name: prism-97nz9

pushpak1300 · 2025-07-26T16:28:04Z

@aimeos Hey, can you provide the code that reproduces the issue?

With the code below I was able to generate transcriptions:

        $audioFile = Audio::fromLocalPath('audio.mp3');

        $response = Prism::audio()
            ->using('openai', 'whisper-1')
            ->withInput($audioFile)
            ->asText();

        dd($response);

The result I am getting:

Output:
  ================
  Prism\Prism\Audio\TextResponse {#2415
    +text: "Hello, world."
    +usage: Prism\Prism\ValueObjects\Usage {#2412
      +promptTokens: 0
      +completionTokens: 0
      +cacheWriteInputTokens: null
      +cacheReadInputTokens: null
      +thoughtTokens: null
    }
    +additionalContent: array:2 [
      "text" => "Hello, world."
      "usage" => array:2 [
        "type" => "duration"
        "seconds" => 1
      ]
    ]
  } // /Users/pushpakchhajed/Projects/prism/tests/Providers/OpenAI/AudioTest.php:156

pushpak1300 · 2025-07-26T16:35:54Z

I noticed another issue where usage is not shown correctly. Here is the fix for that: #515

aimeos · 2025-07-26T16:39:11Z

I use this code, which is almost the same, but it doesn't work with Prism 0.80:

        $prism = Prism::audio()->using( config( 'cms.ai.audio', 'openai' ), config( 'cms.ai.audio-model', 'whisper-1' ) );
        $file = Audio::fromBase64( base64_encode( $upload->getContent() ), $upload->getMimeType() );

        $response = $prism->withInput( $file )
            ->withProviderOptions([
                'response_format' => 'vtt',
            ])->asText();

        return $response->text;

Which Laravel version do you use? Maybe there's a difference, because the original code passes the parameters and the file in a specific structure. The code in the PR is much more robust.

pushpak1300 · 2025-07-26T16:58:05Z

I am using this on Laravel 12.20 with the Prism main branch.

Which Laravel version are you using?

aimeos · 2025-07-26T21:17:47Z

I'm using v12.21.0.
The Laravel docs don't describe the previous implementation as a supported approach, and I've found at least one reference stating that passing the file and the key/value pairs that way does not work.

aimeos · 2025-08-01T06:20:48Z

@pushpak1300 @sixlive
Audio upload to the OpenAI server is definitely not working in any installation. The OpenAI server always responds:

  Sending to model (whisper-1) failed: HTTP request returned status code 400:
  {
  "error": {
  "message": "[{'type': 'value_error', 'loc': ('body', 'file'), 'msg': \"Value error, Expected UploadFi (truncated...)

The tests aren't doing a real request, so they don't cover the problem of a wrong multipart HTTP request.
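To make the multipart point concrete, here is a minimal sketch of what a multipart/form-data body looks like on the wire, written in plain PHP so that it does not depend on Laravel or Prism. The helper `buildMultipartBody`, the boundary string and the fake audio bytes are all assumptions for illustration; this is not the code from the PR. The sketch only shows that the `file` part differs from ordinary fields by carrying a `filename` and its own `Content-Type`, which is the kind of detail a mocked test cannot catch.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper for illustration only: build a multipart/form-data
 * body with plain key/value fields plus one file part.
 *
 * @param array<string, string> $fields
 */
function buildMultipartBody(
    array $fields,
    string $fileField,
    string $filename,
    string $mimeType,
    string $contents,
    string $boundary
): string {
    $body = '';

    // Ordinary fields: only a name, then the value.
    foreach ($fields as $name => $value) {
        $body .= "--{$boundary}\r\n";
        $body .= "Content-Disposition: form-data; name=\"{$name}\"\r\n\r\n";
        $body .= "{$value}\r\n";
    }

    // The file part additionally carries a filename and a Content-Type.
    $body .= "--{$boundary}\r\n";
    $body .= "Content-Disposition: form-data; name=\"{$fileField}\"; filename=\"{$filename}\"\r\n";
    $body .= "Content-Type: {$mimeType}\r\n\r\n";
    $body .= "{$contents}\r\n";

    // Closing delimiter ends the body.
    return $body . "--{$boundary}--\r\n";
}

// Placeholder values; real audio bytes would replace 'fake-audio-bytes'.
$body = buildMultipartBody(
    ['model' => 'whisper-1', 'response_format' => 'vtt'],
    'file',
    'audio.mp3',
    'audio/mpeg',
    'fake-audio-bytes',
    'example-boundary'
);
```

In a Laravel application the HTTP client's documented `Http::attach()` method is the usual way to produce such a file part; a test that inspects the recorded request for it would cover more than one that only fakes the response.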

sixlive · 2025-08-01T23:21:56Z

Closing in favor of #535

aimeos added 5 commits July 23, 2025 10:21

Fixed audio upload

ea727e2

Added strict_types declaration again

c464e37

Delete src/Providers/OpenAI/Maps/SpeechToTextRequestMapper.php

2bf8e93

Delete src/Providers/OpenAI/Maps/TextToSpeechRequestMapper.php

8b60198

Removed unused "use" statements

3f865d6

sixlive closed this Aug 1, 2025
