aimeos
Contributor

@aimeos aimeos commented Jul 23, 2025

Uploading audio files to OpenAI doesn't work at all because the Laravel HTTP facade is used incorrectly for multipart requests. This PR fixes the upload problem and improves the handling of the different response formats returned by the whisper-1 API, which depend on the provider-specific parameters sent.

@sixlive
Contributor

sixlive commented Jul 24, 2025

@pushpak1300 can you take a look at this?


kinsta bot commented Jul 25, 2025

Preview deployments for prism ⚡️

Status Branch preview Commit preview
✅ Ready Visit preview Visit preview

Commit: 5efb9f2f73418ead19347486e7fb11fd942b4d62

Deployment ID: d8821fe4-2419-4f4b-8263-0307c74262f6

Static site name: prism-97nz9

@pushpak1300
Contributor

pushpak1300 commented Jul 26, 2025

@aimeos Hey, can you provide the code that reproduces the issue?

With the code below I was able to generate the transcriptions:

    $audioFile = Audio::fromLocalPath('audio.mp3');

    $response = Prism::audio()
        ->using('openai', 'whisper-1')
        ->withInput($audioFile)
        ->asText();

    dd($response);

The result I am getting:

Output:
  ================
  Prism\Prism\Audio\TextResponse {#2415
    +text: "Hello, world."
    +usage: Prism\Prism\ValueObjects\Usage {#2412
      +promptTokens: 0
      +completionTokens: 0
      +cacheWriteInputTokens: null
      +cacheReadInputTokens: null
      +thoughtTokens: null
    }
    +additionalContent: array:2 [
      "text" => "Hello, world."
      "usage" => array:2 [
        "type" => "duration"
        "seconds" => 1
      ]
    ]
  } // /Users/pushpakchhajed/Projects/prism/tests/Providers/OpenAI/AudioTest.php:156

@pushpak1300
Contributor

I noticed another issue where usage is not shown correctly; #515 is the fix for that.

@aimeos
Contributor Author

aimeos commented Jul 26, 2025

I use this code, which is almost the same, but it doesn't work with Prism 0.80:

        $prism = Prism::audio()->using( config( 'cms.ai.audio', 'openai' ), config( 'cms.ai.audio-model', 'whisper-1' ) );
        $file = Audio::fromBase64( base64_encode( $upload->getContent() ), $upload->getMimeType() );

        $response = $prism->withInput( $file )
            ->withProviderOptions([
                'response_format' => 'vtt',
            ])->asText();

        return $response->text;

Which Laravel version do you use? Maybe there's a difference between versions, because the original code passes the parameters and the file in one specific structure. The code in the PR is much more robust.

@pushpak1300
Contributor

I am using this on Laravel 12.20 with the Prism main branch.

Which Laravel version are you using?

@aimeos
Contributor Author

aimeos commented Jul 26, 2025

I'm using v12.21.0.
The Laravel docs don't say that the previous implementation works, and I've found at least one reference stating that passing the file and the key/value pairs that way does not work.

@aimeos
Contributor Author

aimeos commented Aug 1, 2025

@pushpak1300 @sixlive
Audio upload to the OpenAI server is definitely not working in any installation. The OpenAI server always responds:

  Sending to model (whisper-1) failed: HTTP request returned status code 400:
  {
  "error": {
  "message": "[{'type': 'value_error', 'loc': ('body', 'file'), 'msg': \"Value error, Expected UploadFi (truncated...)

The tests aren't doing a real request, so they don't cover the problem of a wrong multipart HTTP request.
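
To make the multipart point concrete, here is a minimal, framework-free sketch of the request body shape a file upload endpoint generally expects: each plain parameter is its own part, and the file part carries a `filename` and a `Content-Type`. The helper name `buildMultipartBody`, the boundary string, and the fake audio bytes are assumptions for illustration only; this is not the code from this PR or from Prism. In Laravel itself, the documented way to send a file is `Http::attach()`.

```php
<?php
declare(strict_types=1);

/**
 * Build a multipart/form-data body with plain fields and one file part.
 * Hypothetical helper for illustration; not part of Prism or Laravel.
 */
function buildMultipartBody(
    array $fields,
    string $fileField,
    string $filename,
    string $mimeType,
    string $contents,
    string $boundary
): string {
    $body = '';

    // Plain key/value parameters: one part each, without a filename.
    foreach ($fields as $name => $value) {
        $body .= "--{$boundary}\r\n";
        $body .= "Content-Disposition: form-data; name=\"{$name}\"\r\n\r\n";
        $body .= "{$value}\r\n";
    }

    // File part: the filename and Content-Type mark it as an upload.
    $body .= "--{$boundary}\r\n";
    $body .= "Content-Disposition: form-data; name=\"{$fileField}\"; filename=\"{$filename}\"\r\n";
    $body .= "Content-Type: {$mimeType}\r\n\r\n";
    $body .= "{$contents}\r\n";

    // Closing delimiter ends the multipart body.
    $body .= "--{$boundary}--\r\n";

    return $body;
}

// Assumed example values; the audio bytes are a placeholder string.
$body = buildMultipartBody(
    ['model' => 'whisper-1', 'response_format' => 'vtt'],
    'file',
    'audio.mp3',
    'audio/mpeg',
    'FAKE-AUDIO-BYTES',
    'example-boundary'
);

echo $body;
```

A test that only fakes the HTTP layer never checks this structure, so a request with a malformed file part can still pass while the real API rejects it.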

@sixlive
Contributor

sixlive commented Aug 1, 2025

Closing in favor of #535

@sixlive sixlive closed this Aug 1, 2025

3 participants