🖇️ feat: Send Attachments Directly to Providers #9088

dustinhealy · 2025-08-16T02:47:01Z

Summary

This pull request adds support for sending attachments directly to LLM providers through the new multimodal 'Upload to Provider' button in the AttachFileMenu component. The supported endpoints are Anthropic, OpenAI, AzureOpenAI, and Google, along with Agents that use those endpoints.

All four providers now support PDF attachments in addition to their existing image support. Google conversations additionally support video and audio attachments when used with compatible multimodal models. To make this possible, new encoding and formatting utilities for these file types have been introduced, and the agent client has been updated to categorize, process, and validate all supported attachments. Two new SVGs have also been added to display attached video and audio files in chat.
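To make the "categorize and validate" step concrete, here is a minimal sketch of how attachments could be sorted by kind per provider. It is not the PR's code: `Provider`, `AttachmentKind`, `kindOf`, `SUPPORTED`, and `categorize` are hypothetical names, and the support matrix simply mirrors the summary above (images and PDFs everywhere, video and audio only for Google).

```typescript
// Hypothetical names throughout; this sketch is not taken from the PR.
type Provider = 'anthropic' | 'openAI' | 'azureOpenAI' | 'google';
type AttachmentKind = 'image' | 'document' | 'video' | 'audio';

interface Attachment {
  filename: string;
  mimeType: string;
}

/** Map a MIME type to an attachment kind, or null if it is not recognized. */
function kindOf(mimeType: string): AttachmentKind | null {
  if (mimeType.startsWith('image/')) return 'image';
  if (mimeType === 'application/pdf') return 'document';
  if (mimeType.startsWith('video/')) return 'video';
  if (mimeType.startsWith('audio/')) return 'audio';
  return null;
}

/** Kinds each provider accepts, following the summary above. */
const SUPPORTED: Record<Provider, ReadonlySet<AttachmentKind>> = {
  anthropic: new Set<AttachmentKind>(['image', 'document']),
  openAI: new Set<AttachmentKind>(['image', 'document']),
  azureOpenAI: new Set<AttachmentKind>(['image', 'document']),
  google: new Set<AttachmentKind>(['image', 'document', 'video', 'audio']),
};

/** Split files into per-kind buckets the provider accepts, plus a rejected list. */
function categorize(provider: Provider, files: Attachment[]) {
  const accepted: Record<AttachmentKind, Attachment[]> = {
    image: [],
    document: [],
    video: [],
    audio: [],
  };
  const rejected: Attachment[] = [];
  for (const file of files) {
    const kind = kindOf(file.mimeType);
    if (kind !== null && SUPPORTED[provider].has(kind)) {
      accepted[kind].push(file);
    } else {
      rejected.push(file);
    }
  }
  return { accepted, rejected };
}
```

Under these assumptions, an `audio/mpeg` file sent to `'anthropic'` lands in `rejected`, while the same file sent to `'google'` lands in `accepted.audio`.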

A small companion change in PR #14 in the agents repo adds support for the new audio and video file types in Google conversations.

For more specific information about the changes made, the constituent PRs adding support for each endpoint can be referenced individually:

Change Type

New feature (non-breaking change which adds functionality)
This change requires a documentation update

Checklist

My code adheres to this project's style guidelines
I have performed a self-review of my own code
I have made pertinent documentation changes
My changes do not introduce new warnings
Local unit tests pass with my changes
A pull request for updating the documentation has been submitted.

* feat: implement Anthropic native PDF support with document preservation
  - Add comprehensive debug logging throughout PDF processing pipeline
  - Refactor attachment processing to separate image and document handling
  - Create distinct addImageURLs(), addDocuments(), and processAttachments() methods
  - Fix critical bugs in stream handling and parameter passing
  - Add streamToBuffer utility for proper stream-to-buffer conversion
  - Remove api/agents submodule from repository

  🤖 Generated with [Claude Code](https://claude.ai/code)
  Co-Authored-By: Claude <[email protected]>
* chore: remove out of scope formatting changes
* fix: stop duplication of file in chat on end of response stream
* chore: bring back file search and ocr options
* chore: localize upload to provider string in file menu
* refactor: change createMenuItems args to fit new pattern introduced by anthropic-native-pdf-support
* feat: add cache point for pdfs processed by anthropic endpoint since they are unlikely to change and should benefit from caching
* feat: combine Upload Image into Upload to Provider since they both perform direct upload and change provider upload icon to reflect multimodal upload
* feat: add citations support according to docs
* refactor: remove redundant 'document' check since documents are handled properly by formatMessage in the agents repo now
* refactor: change upload logic so anthropic endpoint isn't exempted from normal upload path using Agents for consistency with the rest of the upload logic
* fix: include width and height in return from uploadLocalFile so images are correctly identified when going through an AgentUpload in addImageURLs
* chore: remove client specific handling since the direct provider stuff is handled by the agent client
* feat: handle documents in AgentClient so no need for change to agents repo
* chore: removed unused changes
* chore: remove auto generated comments from OG commit
* feat: add logic for agents to use direct to provider uploads if supported (currently just anthropic)
* fix: reintroduce role check to fix render error because of undefined value for Content Part
* fix: actually fix render bug by using proper isCreatedByUser check and making sure our mutation of formattedMessage.content is consistent

---------
Co-authored-by: Andres Restrepo <[email protected]>
Co-authored-by: Claude <[email protected]>
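The cache point and citations commits above can be illustrated with a sketch of the content block Anthropic's Messages API accepts for a base64-encoded PDF. The field names follow Anthropic's public documentation; the helper `toAnthropicDocument` and its options are hypothetical, and the PR itself goes through the agent client, whose intermediate message shape may differ.

```typescript
// Shape of a base64 PDF content block as documented for Anthropic's Messages API.
interface AnthropicDocumentBlock {
  type: 'document';
  source: { type: 'base64'; media_type: 'application/pdf'; data: string };
  cache_control?: { type: 'ephemeral' };
  citations?: { enabled: boolean };
}

// Hypothetical helper (not from the PR): wraps an already base64-encoded PDF.
function toAnthropicDocument(
  base64Pdf: string,
  opts: { cache?: boolean; citations?: boolean } = {},
): AnthropicDocumentBlock {
  const block: AnthropicDocumentBlock = {
    type: 'document',
    source: { type: 'base64', media_type: 'application/pdf', data: base64Pdf },
  };
  // A PDF is unlikely to change between turns, so mark it as a cache breakpoint.
  if (opts.cache) block.cache_control = { type: 'ephemeral' };
  // Ask the model to return citations that point back into the document.
  if (opts.citations) block.citations = { enabled: true };
  return block;
}
```

Keeping both features opt-in means the same helper can serve callers that want neither caching nor citations.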

* refactor: change references from direct upload to direct attach to better reflect functionality since we are just using base64 encoding strategy now rather than Files/File API for sending our attachments directly to the provider, the upload nomenclature no longer makes sense. direct_attach better describes the different methods of sending attachments to providers anyways even if we later introduce direct upload support
* feat: add upload to provider option for openai (and agent) ui
* chore: move anthropic pdf validator over to packages/api
* feat: simple pdf validation according to openai docs
* feat: add provider agnostic validatePdf logic to start handling multiple endpoints
* feat: add handling for openai specific documentPart formatting
* refactor: move require statement to proper place at top of file
* chore: add in openAI endpoint for the rest of the document handling logic
* feat: add direct attach support for azureOpenAI endpoint and agents
* feat: add pdf validation for azureOpenAI endpoint
* refactor: unify all the endpoint checks with isDocumentSupportedEndpoint
* refactor: consolidate Upload to Provider vs Upload image logic for clarity
* refactor: remove anthropic from anthropic_multimodal fileType since we support multiple providers now
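As a rough illustration of the provider-agnostic validation and the unified endpoint check named in the commits above, the sketch below reuses the names `validatePdf` and `isDocumentSupportedEndpoint`, but the signatures, the endpoint strings, the result shape, and the 32 MB default limit are all assumptions made for this example. Real limits vary by provider and should be taken from each provider's documentation.

```typescript
// Endpoint identifiers are assumed here; the PR's actual constants may differ.
const DOCUMENT_ENDPOINTS: ReadonlySet<string> = new Set([
  'anthropic',
  'openAI',
  'azureOpenAI',
  'google',
]);

/** Single place to ask whether an endpoint can receive documents directly. */
function isDocumentSupportedEndpoint(endpoint: string): boolean {
  return DOCUMENT_ENDPOINTS.has(endpoint);
}

interface PdfValidationResult {
  isValid: boolean;
  error?: string;
}

const MB = 1024 * 1024;
// "%PDF-" as bytes: every PDF file starts with this marker.
const PDF_HEADER = [0x25, 0x50, 0x44, 0x46, 0x2d];

/**
 * Minimal checks: supported endpoint, PDF magic bytes, and a size cap.
 * maxBytes defaults to a placeholder value, not a documented provider limit.
 */
function validatePdf(
  bytes: Uint8Array,
  endpoint: string,
  maxBytes: number = 32 * MB,
): PdfValidationResult {
  if (!isDocumentSupportedEndpoint(endpoint)) {
    return { isValid: false, error: `Endpoint ${endpoint} does not support documents` };
  }
  const hasHeader =
    bytes.length >= PDF_HEADER.length && PDF_HEADER.every((b, i) => bytes[i] === b);
  if (!hasHeader) {
    return { isValid: false, error: 'File does not start with a PDF header' };
  }
  if (bytes.length > maxBytes) {
    return { isValid: false, error: `PDF exceeds the ${maxBytes} byte limit` };
  }
  return { isValid: true };
}
```

Routing every caller through one `isDocumentSupportedEndpoint` check means adding a provider later only touches the set, not each call site.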

* feat: add validation for google PDFs and add google endpoint as a document supporting endpoint
* feat: add proper pdf formatting for google endpoints (requires PR #14 in agents)
* feat: add multimodal support for google endpoint attachments
* feat: add audio file svg
* fix: refactor attachments logic so multi-attachment messages work properly
* feat: add video file svg
* fix: allows for followup questions of uploaded multimodal attachments
* fix: remove incorrect final message filtering that was breaking Attachment component rendering


* refactor: move audio encode over to TS
* refactor: audio encoding now functional in LC again
* refactor: move video encode over to TS
* refactor: move document encode over to TS
* refactor: video encoding now functional in LC again
* refactor: document encoding now functional in LC again
* fix: extend file type options in AttachFileMenu to include 'google_multimodal' and update dependency array to include agent?.provider
* feat: only accept pdfs if responses api is enabled for openai convos
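The last two commits above concern which file types the picker offers per provider. A minimal sketch of that idea follows, assuming a hypothetical `acceptFor` helper that builds the value for an HTML file input's `accept` attribute. The helper name, the provider identifiers, and the exact MIME patterns are assumptions, and whether the Responses API condition also applies to AzureOpenAI is not stated in the source, so this sketch applies it to OpenAI only.

```typescript
// Hypothetical helper; not the AttachFileMenu implementation from the PR.
type PickerProvider = 'anthropic' | 'openAI' | 'azureOpenAI' | 'google';

/** Build an `accept` string for a file input based on provider capabilities. */
function acceptFor(provider: PickerProvider, useResponsesApi = false): string {
  const types: string[] = ['image/*'];
  // Per the commit above, OpenAI conversations accept PDFs only when the
  // Responses API is enabled. Other providers are assumed to accept PDFs always.
  if (provider !== 'openAI' || useResponsesApi) {
    types.push('application/pdf');
  }
  // Video and audio are offered only for Google, matching the PR summary.
  if (provider === 'google') {
    types.push('video/*', 'audio/*');
  }
  return types.join(',');
}
```

The returned string could be passed to an `<input type="file" accept="...">` element so unsupported files are filtered out before upload.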

dustinhealy changed the title ~~WIP: Direct Provider Uploads~~ WIP: Send Attachments Directly to Providers Aug 16, 2025

dustinhealy force-pushed the feat/direct-provider-upload-v2 branch from e7412bd to 2c09ca4 Compare August 17, 2025 07:30

dustinhealy and others added 4 commits August 18, 2025 05:50

fix: manually rename 'documents' to 'Documents' in git since it wasn't picked up due to case insensitivity in dir name

08ddf5e

dustinhealy force-pushed the feat/direct-provider-upload-v2 branch from c9173f2 to 08ddf5e Compare August 18, 2025 12:51

dustinhealy added 2 commits August 18, 2025 06:06

chore: update unittest since mp3 filetypes are now partially supported

18866e3

fix: add logic so filepicker for a google agent has proper filetype filtering

2b5065e

dustinhealy changed the title ~~WIP: Send Attachments Directly to Providers~~ 🖇️ feat: Send Attachments Directly to Providers Aug 18, 2025

dustinhealy marked this pull request as ready for review August 18, 2025 14:01
