feat: add multimodal UIMessage support #230
base: main
📝 Walkthrough
This pull request extends UIMessage to support multimodal content types (images, audio, video, documents) by adding new ContentPart interfaces and updating message conversion logic to preserve multimodal content structure through ModelMessage and UIMessage transformations.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed
@coderabbitai review
✅ Actions performed: Review triggered.
 * Convert ContentPart array to MessagePart array
 * Preserves all multimodal content (text, image, audio, video, document)
 */
function contentPartsToMessageParts(
I'm a bit confused: don't these two types match identically? From what I see, all you're doing is copying the old data into the new one?
I might've gotten carried away here and overcomplicated things.
I initially thought ContentPart (used in ModelMessage.content) and MessagePart (used in UIMessage.parts) were separate type systems for model and ui that needed their own definitions.
Pushed changes to simplify and resolve this.
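For illustration, one way such a simplification could look. This is a sketch only: the part shapes below, including the UI-only `ToolCallPart`, are assumptions and not the definitions in this PR. If `MessagePart` is declared as a superset of `ContentPart`, the conversion needs no per-field copying.

```typescript
// Hypothetical, simplified shapes; the real @tanstack/ai types may differ.
interface TextPart {
  type: "text";
  content: string;
}

interface ImagePart {
  type: "image";
  source: { type: "url" | "data"; value: string };
}

// Parts that can appear in ModelMessage.content.
type ContentPart = TextPart | ImagePart;

// A UI-only part, invented here to show why the two unions may still differ.
interface ToolCallPart {
  type: "tool-call";
  id: string;
  name: string;
}

// MessagePart reuses ContentPart instead of redefining each member.
type MessagePart = ContentPart | ToolCallPart;

// Because every ContentPart is already a valid MessagePart, a shallow copy
// of the array is enough; the compiler checks the assignment structurally.
function contentPartsToMessageParts(parts: ContentPart[]): MessagePart[] {
  return [...parts];
}
```

With this layout, adding a new media type means extending `ContentPart` once, and the UI union picks it up automatically.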
I would also like to send media messages from the client. I need this feature.
🎯 Changes
When calling append() with a ModelMessage containing multimodal content (images, audio, files), the content was stripped during the ModelMessage → UIMessage conversion because modelMessageToUIMessage() only extracted text via getTextContent(). In addition, the parts of a message didn't include multimodal parts, making it impossible to build chat UIs that preserve and display multimodal content.

Added new message part types and updated the conversion functions to preserve multimodal content during round-trips:

New Types (@tanstack/ai and @tanstack/ai-client):
- ImageMessagePart - preserves image data with source and optional metadata
- AudioMessagePart - preserves audio data
- VideoMessagePart - preserves video data (NOT TESTED)
- DocumentMessagePart - preserves document data (e.g., PDFs) (NOT TESTED)

Updated Conversion Functions:
- modelMessageToUIMessage() - now converts ContentPart[] to corresponding MessagePart[] instead of discarding non-text parts
- uiMessageToModelMessages() - now builds ContentPart[] when multimodal parts are present, preserving part ordering

Example:
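A minimal sketch of the round-trip described above. It uses simplified, hypothetical type shapes and stand-in function names (`toUIMessage`, `toModelMessage`) rather than the actual @tanstack/ai definitions, so treat it as an illustration of the idea and not as the PR's implementation.

```typescript
// Hypothetical, simplified shapes; the real library types may differ.
type Source = { type: "url" | "data"; value: string };

type ContentPart =
  | { type: "text"; content: string }
  | { type: "image"; source: Source }
  | { type: "audio"; source: Source };

// Assumption for this sketch: UI parts share the shape of content parts.
type MessagePart = ContentPart;

interface ModelMessage {
  role: "user" | "assistant";
  content: string | ContentPart[];
}

interface UIMessage {
  id: string;
  role: "user" | "assistant";
  parts: MessagePart[];
}

// Stand-in for modelMessageToUIMessage(): keeps every part, in order,
// instead of extracting only the text.
function toUIMessage(message: ModelMessage, id: string): UIMessage {
  const parts: MessagePart[] =
    typeof message.content === "string"
      ? [{ type: "text", content: message.content }]
      : message.content.map((part) => ({ ...part }));
  return { id, role: message.role, parts };
}

// Stand-in for the reverse direction: emits a plain string when only text
// is present, and a ContentPart[] (same ordering) when any media part exists.
function toModelMessage(message: UIMessage): ModelMessage {
  const hasMedia = message.parts.some((part) => part.type !== "text");
  if (!hasMedia) {
    const text = message.parts
      .map((part) => (part.type === "text" ? part.content : ""))
      .join("");
    return { role: message.role, content: text };
  }
  return { role: message.role, content: message.parts.map((part) => ({ ...part })) };
}

// Usage: a text + image message survives the round-trip unchanged.
const imageMessage: ModelMessage = {
  role: "user",
  content: [
    { type: "text", content: "What is in this picture?" },
    { type: "image", source: { type: "url", value: "https://example.com/cat.png" } },
  ],
};
const roundTripped = toModelMessage(toUIMessage(imageMessage, "msg-1"));
```

Returning a plain string for text-only messages keeps the existing behavior for callers that never send media, while the array form is used only when it is needed.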
Demo
Images:
https://github.com/user-attachments/assets/5f62ab32-9f11-44f7-bfc0-87d00678e265
Audio:
https://github.com/user-attachments/assets/bbbdc2f9-f8d7-4d74-99c2-23d15a3278a3
Closes #200
Note
This contribution touches core message handling. Let me know if the approach doesn't align with the project's vision; I am happy to iterate on it :)
This PR is not ready to be merged because:
✅ Checklist
pnpm run test:pr.
🚀 Release Impact
Summary by CodeRabbit
Release Notes
New Features
Tests