embed_content with gemini-embedding-2* silently returns 1 embedding instead of N #2523

@niyazmft

Description

client.models.embed_content(model=..., contents=[...]) silently returns a single embedding (len(result.embeddings) == 1) regardless of how many items are passed in contents, when the model is gemini-embedding-2-preview or gemini-embedding-2.

This raises a ValueError from zip(..., strict=True) in downstream consumers, because the number of returned embeddings does not match the number of inputs.

gemini-embedding-001 does not exhibit this behavior — it correctly returns one embedding per input.

Reproduction

from google import genai
from google.genai import types

client = genai.Client(...)

texts = [
    "What is the meaning of life?",
    "How does gravity work?",
    "What is machine learning?",
]

result = client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=texts,
)

print(f"Inputs: {len(texts)}, Embeddings returned: {len(result.embeddings)}")
# Expected: Inputs: 3, Embeddings returned: 3
# Actual:   Inputs: 3, Embeddings returned: 1

Root Cause

gemini-embedding-2* treats embed_content(contents=[list_of_strings]) as parts of a single document rather than a batch. The proper batched API for these models is asyncBatchEmbedContent, but the SDK's embed_content method maps to embedContent, which does not batch correctly for the -2* family.

Workaround

Wrapping each string in types.Content(parts=[types.Part(text=s)]) produces correct results:

result = client.models.embed_content(
    model="gemini-embedding-2-preview",
    contents=[types.Content(parts=[types.Part(text=s)]) for s in texts],
)

Downstream Impact

This has been hit by at least 3 projects:

  • graphiti (getzep): #1467 → fixed via #1474 (batch_size=1)
  • pydantic/pydantic-ai: #4872 → fixed via #4873 (Content wrapper)
  • plastic-labs/honcho: currently fixing with the same Content wrapper workaround
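A rough sketch of the one-input-per-request approach, in the spirit of the batch_size=1 fix listed above. This is not the code from that fix: `embed_one_at_a_time` is a hypothetical helper, and it assumes that a single string passed as `contents` yields exactly one embedding, which this issue does not verify.

```python
def embed_one_at_a_time(client, texts, model):
    """Issue one embed_content call per text and collect one vector per input.

    `client` is expected to be a google-genai client. The helper name and
    structure are illustrative only.
    """
    vectors = []
    for text in texts:
        result = client.models.embed_content(model=model, contents=text)
        # Assumption: a single string input produces exactly one embedding.
        vectors.append(result.embeddings[0].values)
    return vectors
```

This trades one request for N requests, so it may be slower and count more heavily against rate limits than the Content wrapper workaround.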

Expected Behavior

embed_content(contents=[...]) should either:

  1. Return the correct number of embeddings for gemini-embedding-2* (one per input), or
  2. Raise an error clearly indicating that batching via contents list is not supported for this model, pointing users to the correct API.
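Until the SDK does one of these, a caller-side guard can approximate option 2. The sketch below is an assumption about how such a check might look, not proposed SDK code: `EmbeddingCountMismatch` and `embed_checked` are hypothetical names, and `contents` is assumed to be a list.

```python
class EmbeddingCountMismatch(ValueError):
    """Raised when the number of embeddings differs from the number of inputs."""


def embed_checked(client, contents, model):
    """Call embed_content and fail loudly if the counts disagree.

    `client` is expected to be a google-genai client; `contents` must be a list.
    """
    result = client.models.embed_content(model=model, contents=contents)
    got = len(result.embeddings)
    expected = len(contents)
    if got != expected:
        raise EmbeddingCountMismatch(
            f"{model} returned {got} embedding(s) for {expected} input(s); "
            "a list passed as contents may have been treated as one document"
        )
    return result
```

Raising at the call site keeps the failure next to its cause, instead of surfacing later as an unrelated zip error.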

Environment

  • google-genai version: latest (>=1.71.0)
  • Models: gemini-embedding-2-preview, gemini-embedding-2
