
fix(graphiti_core): default list fields in response models to prevent ValidationError on empty LLM output #1416

Open
kromanow94 wants to merge 2 commits into getzep:main from kromanow94:fix/default-empty-list-response-models

Conversation

@kromanow94

Summary

Several Pydantic response models raise a ValidationError when the LLM returns an empty dict ({}) as structured output, because their list fields are declared as required (Field(...)) with no default value.

LLM providers that don't enforce required fields server-side (Anthropic, Gemini) can return {} when there's nothing to extract. This causes response_model(**{}) to fail, and retries don't help since the model makes the same decision.

This PR changes the affected list fields from Field(...) to Field(default_factory=list) so that empty LLM output is treated as "nothing found" instead of crashing.
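A minimal standalone reproduction of the failure mode (the field type is simplified to `list[str]` for brevity; the real ExtractedEntities holds ExtractedEntity objects):

```python
from pydantic import BaseModel, Field, ValidationError

# Simplified stand-in for the affected models: list field required, no default.
class ExtractedEntities(BaseModel):
    extracted_entities: list[str] = Field(..., description='List of extracted entities')

# An empty tool payload from the LLM fails validation because the
# required field is missing.
try:
    ExtractedEntities(**{})
except ValidationError as exc:
    print(exc.error_count())  # 1
```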

Type of Change

  • Bug fix
  • New feature
  • Performance improvement
  • Documentation/Tests

Bug Details

Error logs:

graphiti_core.llm_client.anthropic_client - WARNING - Retrying after error (attempt 1/2):
1 validation error for ExtractedEntities
extracted_entities
  Field required [type=missing, input_value={}, input_type=dict]
graphiti_core.llm_client.anthropic_client - WARNING - Retrying after error (attempt 1/2):
1 validation error for ExtractedEdges
edges
  Field required [type=missing, input_value={}, input_type=dict]

After 2 retries, the error propagates and the episode fails to process.

Call path (Anthropic):

add_episode()
  -> llm_client.generate_response(response_model=ExtractedEntities)
    -> _generate_response() returns tool_args = {}        # content_item.input is {}
    -> generate_response() calls response_model(**{})     # ExtractedEntities(**{}) -> ValidationError
    -> retry appends error to messages -> model returns {} again -> same error
    -> after max_retries, episode fails
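The call path above can be sketched as a loop; function and variable names here are illustrative, not the actual graphiti_core implementation:

```python
from pydantic import BaseModel, Field, ValidationError

class ExtractedEntities(BaseModel):
    extracted_entities: list[str] = Field(...)  # required, no default

def generate_response(tool_args: dict, max_retries: int = 2) -> ExtractedEntities:
    """Illustrative retry loop: validating an empty payload fails
    identically on every attempt, so retrying cannot recover."""
    last_error = None
    for attempt in range(max_retries):
        try:
            return ExtractedEntities(**tool_args)
        except ValidationError as exc:
            last_error = exc  # error is fed back, but the model returns {} again
    raise last_error  # after max_retries, the episode fails

generate_response({'extracted_entities': ['Alice']})  # succeeds
# generate_response({})  # raises ValidationError after 2 attempts
```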

Affected clients:

| Client | Validates with Pydantic? | Affected? |
| --- | --- | --- |
| Anthropic (anthropic_client.py) | Yes — response_model(**response) at line 403 | Yes |
| Gemini (gemini_client.py) | Yes — response_model.model_validate(...) at line 329 | Yes |
| OpenAI (openai_base_client.py) | No — trusts structured output (constrained decoding) | No |
| Groq (inherits OpenAI base) | No | No |

OpenAI's structured output feature enforces the schema via constrained decoding, guaranteeing valid output. Anthropic and Gemini do not enforce required fields server-side.

Changes

6 fields across 4 files, all Field(...) -> Field(default_factory=list):

| File | Model | Field |
| --- | --- | --- |
| prompts/extract_nodes.py | ExtractedEntities | extracted_entities |
| prompts/extract_nodes.py | SummarizedEntities | summaries |
| prompts/extract_edges.py | ExtractedEdges | edges |
| prompts/dedupe_edges.py | EdgeDuplicate | duplicate_facts |
| prompts/dedupe_edges.py | EdgeDuplicate | contradicted_facts |
| prompts/dedupe_nodes.py | NodeResolutions | entity_resolutions |

Before / After:

# Before (crashes on empty input)
class ExtractedEntities(BaseModel):
    extracted_entities: list[ExtractedEntity] = Field(..., description='List of extracted entities')

# After (gracefully handles empty input)
class ExtractedEntities(BaseModel):
    extracted_entities: list[ExtractedEntity] = Field(default_factory=list, description='List of extracted entities')

Why this is safe:

  • An empty list is the correct semantic for "nothing found" - downstream code already handles empty lists (no nodes/edges/duplicates created).
  • The tool schema still includes the field with its description, so the model is guided to populate it when it has results.
  • Behavior is identical when the model returns populated fields like {"extracted_entities": [...]}.

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • All existing tests pass

Added tests/prompts/test_response_models.py with 10 tests covering all 5 affected models:

  • 5 empty-input tests — verify Model(**{}) produces valid instances with empty lists (the regression scenario)
  • 5 populated-input tests — verify normal operation is unchanged
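The two test shapes could look roughly like this (a sketch; the actual test file imports the real models from graphiti_core.prompts rather than defining them inline):

```python
from pydantic import BaseModel, Field

# Inline stand-in for an affected model after the fix.
class ExtractedEntities(BaseModel):
    extracted_entities: list[str] = Field(default_factory=list)

def test_empty_input_yields_empty_list():
    # The regression scenario: {} from the LLM must not raise.
    model = ExtractedEntities(**{})
    assert model.extracted_entities == []

def test_populated_input_unchanged():
    # Normal operation: populated fields pass through untouched.
    model = ExtractedEntities(**{'extracted_entities': ['Alice']})
    assert model.extracted_entities == ['Alice']
```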

Breaking Changes

  • This PR contains breaking changes

Checklist

  • Code follows project style guidelines (make lint passes)
  • Self-review completed
  • Documentation updated where necessary
  • No secrets or sensitive information committed

Environment

  • graphiti-core: 0.28.2
  • LLM providers confirmed affected: Anthropic (Claude), Gemini
  • pydantic: 2.12+

…r on empty LLM output

When Anthropic or Gemini models return empty tool input `{}` (e.g. when
no entities or edges are found), Pydantic validation fails because list
fields are defined as required with no default. This changes them to
default to empty lists, which is the correct semantic for "nothing found"
and is already handled by downstream code.

Affected models: ExtractedEntities, SummarizedEntities, ExtractedEdges,
EdgeDuplicate, NodeResolutions.
Covers all 5 affected models with both empty input (the bug scenario)
and populated input (normal operation) to prevent regressions.
@kromanow94
Author

I have read the CLA Document and I hereby sign the CLA

@danielchalef
Member

danielchalef commented Apr 14, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

danielchalef added a commit that referenced this pull request Apr 14, 2026
