Speed up Python GraphBinary deserialization by kirill-stepanishin · Pull Request #3493 · apache/tinkerpop

kirill-stepanishin · 2026-06-29T22:44:55Z

The GraphBinary reader built a DataType enum member from the type byte for every object it decoded. This per-object enum construction significantly slows deserialization of large result sets.

The reader now builds a {type code: deserializer} lookup table once up front and dispatches on the raw integer instead, avoiding per-object enum construction. Behavior is unchanged: an unknown type code still raises ValueError("... is not a valid DataType").
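As a rough illustration of the approach (not the actual code in graphbinaryV4.py), the sketch below builds an int-keyed table once and dispatches on the raw type byte. The `DataType` subset, the `read_*` helpers, `DESERIALIZER_BY_TYPE_CODE`, and `read_object` are hypothetical names made up for this example, and the wire format is simplified (no value-flag byte).

```python
import struct
from enum import Enum


class DataType(Enum):
    # Illustrative subset only; the real enum has many more members.
    INT = 0x01
    LONG = 0x02
    STRING = 0x03


def read_int(buf, pos):
    # 4-byte big-endian signed integer
    return struct.unpack_from(">i", buf, pos)[0], pos + 4


def read_long(buf, pos):
    # 8-byte big-endian signed integer
    return struct.unpack_from(">q", buf, pos)[0], pos + 8


def read_string(buf, pos):
    # Length-prefixed UTF-8 string
    length, pos = read_int(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length


# Built once at import time, keyed by the raw integer type code.
DESERIALIZER_BY_TYPE_CODE = {
    DataType.INT.value: read_int,
    DataType.LONG.value: read_long,
    DataType.STRING.value: read_string,
}


def read_object(buf, pos=0):
    """Decode one object; returns (value, next_position)."""
    type_code = buf[pos]  # indexing bytes yields a plain int
    reader = DESERIALIZER_BY_TYPE_CODE.get(type_code)
    if reader is None:
        # Only the failure path constructs the enum, which raises
        # ValueError("<code> is not a valid DataType") for unknown codes.
        DataType(type_code)
        raise ValueError(f"no deserializer registered for type code {type_code}")
    return reader(buf, pos + 1)
```

The hot path is a single dict lookup on an int; the enum is touched only when the lookup misses, which is how a sketch like this can keep the original error message for unknown type codes.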

Performance

Benchmarked on two cross-region EC2 instances (server in US-EAST-2, client in US-WEST-2) to capture realistic network latency, against the Modern graph over GraphBinary V4 on Python 3.11. Each query was run with and without this change, alternating back to back across 3 sweeps, reporting the median.

| Query | Before | After | Change |
| --- | --- | --- | --- |
| `g.V().repeat(both()).times(12)` (~200k results) | 7.97 s | 5.85 s | 26% faster |
| `g.V()` (6 results) | 0.107 s | 0.109 s | no change |

The improvement is significant on large result sets, where per-object deserialization cost dominates, and scales with the number of objects returned.
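For anyone who wants to reproduce the alternating, median-of-sweeps measurement described above, a minimal harness could look like the following. This is a sketch under assumptions, not the script used for the numbers in the table: `alternating_medians` and the variant names are hypothetical, and each callable stands in for running the query against one build of the client.

```python
import statistics
import time


def alternating_medians(variants, sweeps=3, timer=time.perf_counter):
    """Time each variant once per sweep, back to back, and return medians.

    variants: dict mapping a label to a zero-argument callable.
    """
    samples = {name: [] for name in variants}
    for _ in range(sweeps):
        # Alternate variants inside each sweep so drift in network or
        # server load affects both roughly equally.
        for name, fn in variants.items():
            start = timer()
            fn()
            samples[name].append(timer() - start)
    return {name: statistics.median(times) for name, times in samples.items()}
```

Usage would be along the lines of `alternating_medians({"before": run_with_old_reader, "after": run_with_new_reader})`, where both callables are placeholders for submitting the same traversal and draining its results.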

Assisted-by: Claude Code:claude-opus-4-8

codecov-commenter · 2026-06-29T22:59:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.11%. Comparing base (a28cd1f) to head (2cfc742).
⚠️ Report is 167 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master    #3493      +/-   ##
============================================
- Coverage     76.35%   76.11%   -0.25%     
- Complexity    13424    13861     +437     
============================================
  Files          1012     1030      +18     
  Lines         60341    62712    +2371     
  Branches       7075     7338     +263     
============================================
+ Hits          46076    47731    +1655     
- Misses        11548    12018     +470     
- Partials       2717     2963     +246


kenhuuu · 2026-06-30T16:32:47Z

VOTE +1

Cole-Greer

VOTE +1

Add int-keyed deserializer dispatch to reader

3ebd857

Assisted-by: Claude Code:claude-opus-4-8

kenhuuu reviewed Jun 30, 2026

Comment thread gremlin-python/src/main/python/gremlin_python/structure/io/graphbinaryV4.py Outdated

Rename dispatch table to deserializer by type code

2cfc742

Cole-Greer approved these changes Jun 30, 2026

kirill-stepanishin wants to merge 2 commits into apache:master from kirill-stepanishin:python-graphbinary-int-dispatch
