feat(genai): Support multimodal file inputs and display_name in function responses #1834

Open
Saurav Gupta (Saurav-Gupta-9741) wants to merge 6 commits into langchain-ai:main from Saurav-Gupta-9741:fix-multimodal-function-responses

Conversation

@Saurav-Gupta-9741

Description

This PR resolves the issue where multimodal files were disconnected from the FunctionResponse and stripped of their metadata. It updates _convert_tool_message_to_parts to accurately parse file, media, and image_url blocks from a ToolMessage and maps them into FunctionResponsePart objects.

By using FunctionResponseFileData and FunctionResponseBlob from the google-genai SDK, the change keeps each file attached to the function response that produced it and preserves its display_name, so the Gemini API can distinguish between multiple files generated by a single tool call.
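
To make the mapping concrete, here is a minimal sketch of the idea, not the code in this PR. The three dataclasses are hypothetical stand-ins named after the SDK types mentioned above, with assumed fields. `block_to_part`, `split_tool_content`, and the block shapes (`file_uri`, `data`, `mime_type`, `display_name` keys) are also assumptions made for illustration. For simplicity it treats an `image_url` as a URI reference and expects inline `data` to already be bytes.

```python
from dataclasses import dataclass
from typing import Any, Optional


# Hypothetical stand-ins named after the google-genai types in this PR.
# The real classes live in the SDK; the fields below are assumptions.
@dataclass
class FunctionResponseFileData:
    file_uri: str
    mime_type: Optional[str] = None
    display_name: Optional[str] = None


@dataclass
class FunctionResponseBlob:
    data: bytes
    mime_type: Optional[str] = None
    display_name: Optional[str] = None


@dataclass
class FunctionResponsePart:
    file_data: Optional[FunctionResponseFileData] = None
    inline_data: Optional[FunctionResponseBlob] = None


def block_to_part(block: dict[str, Any]) -> Optional[FunctionResponsePart]:
    """Map one media-like content block to a part; return None otherwise."""
    kind = block.get("type")
    name = block.get("display_name")
    mime = block.get("mime_type")
    if kind == "image_url":
        # Assumed shape: {"type": "image_url", "image_url": {"url": ...}}
        url = block["image_url"]["url"]
        return FunctionResponsePart(
            file_data=FunctionResponseFileData(url, mime, name)
        )
    if kind in ("file", "media"):
        if "file_uri" in block:
            return FunctionResponsePart(
                file_data=FunctionResponseFileData(block["file_uri"], mime, name)
            )
        if "data" in block:
            return FunctionResponsePart(
                inline_data=FunctionResponseBlob(block["data"], mime, name)
            )
    return None


def split_tool_content(content: list[Any]) -> tuple[str, list[FunctionResponsePart]]:
    """Collect text and media parts so the media stays with the response."""
    texts: list[str] = []
    parts: list[FunctionResponsePart] = []
    for block in content:
        if isinstance(block, str):
            texts.append(block)
        elif block.get("type") == "text":
            texts.append(block["text"])
        else:
            part = block_to_part(block)
            if part is not None:
                parts.append(part)
    return "\n".join(texts), parts


# Two files from a single tool call, told apart by display_name.
content = [
    {"type": "text", "text": "Generated 2 charts."},
    {"type": "media", "file_uri": "gs://bucket/a.png",
     "mime_type": "image/png", "display_name": "chart_a"},
    {"type": "media", "data": b"\x89PNG",
     "mime_type": "image/png", "display_name": "chart_b"},
]
text, parts = split_tool_content(content)
names = [(p.file_data or p.inline_data).display_name for p in parts]
print(text, names)
```

Because each part carries its own display_name, the two files remain distinguishable even though they come from the same tool call and share a MIME type.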

Testing

  • Updated test_convert_tool_message_to_parts_list_content_with_media to assert the correct bundled Part structure.
  • Added test_convert_tool_message_to_parts_with_display_name to explicitly verify display_name metadata preservation.
  • Verified that all 230 pytest unit tests pass.

Sir, if anything is missing, please let me know and I will fix it to meet your expectations. Thank you so much for looking into it!

@Saurav-Gupta-9741

Author

Respected Sir Bagatur (@baskaryan) / Sir Harrison Chase (@hwchase17),

I have fully resolved the issue regarding the missing display_name metadata in multimodal function responses.

The conversion now parses interleaved media, file, and image_url blocks directly into FunctionResponsePart and FunctionResponseFileData objects, as the google-genai SDK expects.

I have also included a comprehensive unit test to verify this behavior, and all strict typing (mypy) and formatting (ruff) checks have successfully passed without any workarounds.

Thank you very much, Sir, for this wonderful opportunity to contribute to the LangChain ecosystem. It is an honor to help improve this library. Please let me know if there are any further adjustments you would like me to make, and I will be happy to implement them immediately!

Looking forward to your response.

@Saurav-Gupta-9741

Author

Hi maintainers, it looks like the langchain-google-genai-us integration test hit the 60-minute timeout limit on Google Cloud Build. Could someone with write access please re-run the failed jobs when you get a chance? The other 12 checks, including the unit tests, have passed. Thank you!

@Saurav-Gupta-9741

Author

Respected Mason Daugherty (@mdrxy), Bagatur (@baskaryan), and Eugene Yurtsev (@eyurtsev),

I hope you are having a great week! I wanted to gently bump this PR regarding support for multimodal file inputs and display_name within function responses for the Google GenAI integration.

The code is fully complete and all 12 core unit checks passed successfully. However, it looks like the langchain-google-genai-us (llm-integration-tests) job hit the 60-minute Google Cloud Build timeout limit (likely an infrastructure flake). Could someone with write access please re-run the failed jobs when you have a chance?

I know you all manage an incredible volume of work, so I have the utmost respect for your time. Whenever you have the bandwidth to review, I am highly available to make any structural adjustments you recommend to ensure this aligns perfectly with your vision for the package.

Looking forward to your guidance, and thank you for all your hard work!

