fix(genai): resolve single aggregated embedding bug for gemini-embedding models #1817

Open
Sai Teja Bandaru (saitejabandaru-in) wants to merge 6 commits into langchain-ai:main from saitejabandaru-in:feature-fix-gemini-embedding-batch

@saitejabandaru-in

Description

This pull request resolves a critical integration issue (#37728): calling GoogleGenerativeAIEmbeddings.embed_documents (or aembed_documents) with gemini-embedding-2 always returns a list containing exactly one vector, regardless of how many documents are passed.

Root Cause

Unlike traditional text embeddings (e.g., text-embedding-004), multimodal gemini-embedding models (such as gemini-embedding-2) treat list inputs in standard embed_content calls as parts of a single aggregated multimodal document (designed for cross-modal retrieval, combining text/images/video/etc.). Consequently, they return exactly one merged vector for the entire batch.

Solution

  1. Parallelized Individual Embedding: Checks if the target model is a gemini-embedding model. If so, it embeds each document individually in parallel to prevent aggregation:
    • Synchronous path uses a standard ThreadPoolExecutor for concurrent network requests.
    • Asynchronous path uses asyncio.gather for non-blocking concurrent awaits.
  2. Backward Compatibility: Retains the existing prepared batching for models outside the gemini-embedding family (such as text-embedding-004), so those models keep the network efficiency of batched requests.
  3. Comprehensive Unit Tests:
    • Updates legacy tests to use text-embedding-004 to preserve test coverage of the traditional batching logic.
    • Adds new test_embed_documents_gemini_embedding_2 and test_aembed_documents_gemini_embedding_2 unit tests targeting gemini-embedding-2-preview to verify correct multi-call dispatching and output reconstruction.
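
The dispatch described in the Solution list can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: `needs_per_document_calls`, `embed_each_sync`, `embed_each_async`, and the `embed_one` / `aembed_one` callables are hypothetical names standing in for whatever single-document request the integration actually makes, and the `max_workers` default is an arbitrary assumption.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Awaitable, Callable, List

Vector = List[float]


def needs_per_document_calls(model: str) -> bool:
    """Hypothetical check: True for models in the gemini-embedding family."""
    # Model names may carry a "models/" prefix; strip it before matching.
    return model.split("/")[-1].startswith("gemini-embedding")


def embed_each_sync(
    texts: List[str],
    embed_one: Callable[[str], Vector],
    max_workers: int = 8,  # assumed default, not taken from the PR
) -> List[Vector]:
    """Embed every text with its own request, preserving input order."""
    if not texts:
        return []
    with ThreadPoolExecutor(max_workers=min(max_workers, len(texts))) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(embed_one, texts))


async def embed_each_async(
    texts: List[str],
    aembed_one: Callable[[str], Awaitable[Vector]],
) -> List[Vector]:
    """Async counterpart: one awaitable per text, awaited concurrently."""
    # asyncio.gather returns results in the order the awaitables were passed.
    return list(await asyncio.gather(*(aembed_one(t) for t in texts)))
```

Both helpers return one vector per input, in input order, which is what the Embeddings interface expects from embed_documents. A real implementation would likely also need to consider rate limits and error handling for individual failed requests, which this sketch leaves out.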

…ing models

Unlike traditional text embeddings (e.g. text-embedding-004), multimodal gemini-embedding models (such as gemini-embedding-2) treat list inputs in embed_content as parts of a single aggregated multimodal document, returning exactly one vector regardless of how many strings are passed.

This change checks if the target model is a gemini-embedding model, and if so, runs individual embeds in parallel using a ThreadPoolExecutor (sync path) and asyncio.gather (async path) to correctly return a distinct embedding for each document in the input list, aligning with the LangChain Embeddings interface spec.
Changes standard unit test MODEL_NAME to text-embedding-004 to maintain coverage for standard list batching. Adds dedicated sync and async tests targeting gemini-embedding-2-preview to verify the new parallel ThreadPoolExecutor and asyncio.gather execution paths and ensure regression safety.
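
A test along the lines described in this commit message might look like the sketch below. It is a hedged illustration, not the tests added in the PR: `embed_documents` here is a hypothetical stand-in for the per-document path, and `fake_embed_one` replaces the real client call so nothing touches the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def embed_documents(
    texts: List[str], embed_one: Callable[[str], List[float]]
) -> List[List[float]]:
    """Hypothetical stand-in for the per-document path under test."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(embed_one, texts))


def test_one_call_and_one_vector_per_document() -> None:
    texts = ["alpha", "beta", "gamma"]
    calls: List[str] = []

    def fake_embed_one(text: str) -> List[float]:
        # Record each request; list.append is atomic in CPython.
        calls.append(text)
        # Derive the vector from the input so output order is checkable.
        return [float(len(text))]

    vectors = embed_documents(texts, fake_embed_one)

    # One request per document, rather than one aggregated request.
    assert len(calls) == len(texts)
    # Requests may run in any order, so compare them order-insensitively.
    assert sorted(calls) == sorted(texts)
    # The reconstructed output must follow the input order.
    assert vectors == [[5.0], [4.0], [5.0]]
```

The key assertions are the call count (multi-call dispatching) and the ordered comparison of the result (output reconstruction).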