afirstenberg
Contributor

  • Adds support for the preview TTS models
    • Adds a "speechConfig" configuration parameter, which accepts either Google's structure or a simplified version of it.
    • Results are stored in a "media" message content block, which includes the data and its MIME type.
    • Data is in PCM format (as provided by Google). No conversion to WAV or any other format is done.
  • Tests
    • Single speaker
    • Multiple speakers
    • Streaming
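
Since the audio data is left as raw PCM, callers who want a playable file have to add a container themselves. Below is a minimal sketch of wrapping PCM bytes in a 44-byte WAV header. The function name `pcmToWav` is hypothetical and not part of this PR or of LangChain, and the defaults (24 kHz, mono, 16-bit little-endian) are an assumption about the TTS output that should be checked against the MIME type reported in the "media" content.

```typescript
// Hypothetical helper (not part of LangChain): wrap raw PCM in a minimal RIFF/WAVE header.
// Assumed defaults: 24 kHz, mono, 16-bit little-endian samples. Verify against the
// MIME type returned with the media content before relying on them.
function pcmToWav(
  pcm: Uint8Array,
  sampleRate = 24000,
  channels = 1,
  bitsPerSample = 16
): Uint8Array {
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign; // bytes per second of audio
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);

  // Write an ASCII tag such as "RIFF" at the given byte offset.
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk for PCM
  view.setUint16(20, 1, true); // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeAscii(36, "data");
  view.setUint32(40, pcm.length, true); // size of the sample data
  out.set(pcm, 44); // copy the samples after the header
  return out;
}
```

If the data arrives as a base64 string, it would need to be decoded to bytes first (for example with `Buffer.from(data, "base64")` in Node) before being passed to a helper like this.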

Fixes #8260

Enable a simplified version of the speech configuration.
Convert speech output to a media content.
Initial testing.

vercel bot commented Jun 11, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview Jun 13, 2025 1:03am
1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchainjs-api-refs ⬜️ Ignored (Inspect) Jun 13, 2025 1:03am

@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. auto:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Jun 11, 2025
@dosubot dosubot bot added the lgtm PRs that are ready to be merged as-is label Jun 13, 2025
@hntrl hntrl merged commit c5e68a0 into langchain-ai:main Jun 13, 2025
27 checks passed
@Shivansh-Verma7719

What is the relevant LangChain docs section on this? I can't seem to find it.

@afirstenberg
Contributor Author

What is the relevant LangChain docs section on this? I can't seem to find it.

Yeah, I'm behind in documenting things in the docs themselves (partly because I also can't find the most relevant section). I should have a blog post about this out by end of month.

@Shivansh-Verma7719

OK, thanks. How do you advise I proceed to implement this? If you have any sample implementation code, that would be great.

@afirstenberg
Contributor Author

OK, thanks. How do you advise I proceed to implement this? If you have any sample implementation code, that would be great.

Sorry this took so long @Shivansh-Verma7719

Some documentation and examples at

@Shivansh-Verma7719

Shivansh-Verma7719 commented Jun 29, 2025

No problem. Has this been implemented in langchain (https://github.com/langchain-ai/langchain)? I couldn't find any PRs or open issues; if not, I'll put up a PR for the Python package.

Thank you for the sample code

@afirstenberg
Contributor Author

I don't know the status of the Python side of things.

Development

Successfully merging this pull request may close these issues.

Gemini: support TTS models and configuration

3 participants