Skip to content

fix info-header string bound check in THeaderTransport::readString#3610

Closed
dxbjavid wants to merge 2 commits into
apache:masterfrom
dxbjavid:header-string-bound
Closed

fix info-header string bound check in THeaderTransport::readString#3610
dxbjavid wants to merge 2 commits into
apache:masterfrom
dxbjavid:header-string-bound

Conversation

@dxbjavid

@dxbjavid dxbjavid commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

The info-header string length is bounded against the header section before ptr moves past the length varint, so the remaining count is overstated by the varint width and a negative length is never rejected. A regression test is added to ThrifttReadCheckTests.

  • Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
  • If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
  • Did you squash your changes to a single commit? (not required, but preferred)
  • Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
  • If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

@mergeable mergeable Bot added the c++ Pull requests that update C++ code label Jun 30, 2026
@Jens-G

Jens-G commented Jun 30, 2026

Copy link
Copy Markdown
Member

Pls read your email. Now.

@Jens-G

Jens-G commented Jun 30, 2026

Copy link
Copy Markdown
Member

Code review

Found 3 issues (adversarially verified with failing tests):

  1. Exploit description in test inline comment (AGENTS.md §6)

Lines 345–347 of the new test describe the byte-level overread path in terms that name the exact exploitation mechanism. AGENTS.md §6 requires neutral functional language in inline comments; the description of what the old code allowed to happen should be rephrased in terms of the invariant being enforced rather than the failure mode.

// Header-format frame whose info-header key length (4) is checked before the
// length varint itself is consumed, so the string read used to start one byte
// later and run a byte past the end of the receive buffer. The header section

Verified: text absent from base commit d68305a73, present in PR code.

  1. strLen < 0 branch has no test

The new guard catches negative strLen values, but no test in the PR exercises this path. A 5-byte varint with bit 31 set (e.g. 0x80 0x80 0x80 0x80 0x08) produces a negative int32_t from readVarint32 and reaches this branch; without the guard, str.assign(ptr, (size_t)(-2147483648)) propagates a std::length_error rather than a TTransportException.

ptr += bytes;
if (strLen < 0 || strLen > headerBoundary - ptr) {

Verified: Docker test confirmed the path is reachable and behaves differently on base vs. PR code.

  1. ptr is advanced before the bounds check, contradicting the "Advances ptr on success" doc comment

Moving ptr += bytes to before the if means ptr (a non-const reference) is left pointing past the varint bytes when readString throws. The Doxygen comment documents that the function advances ptr only on success. Current callers in readHeaderFormat do not catch readString exceptions, so there is no immediate runtime effect, but the documented contract no longer holds and would mislead any subclass that catches the exception and then inspects ptr. Either the doc comment or the advance placement should be updated.

// size_t conversion in assign() below cannot run off the end of the buffer.
ptr += bytes;
if (strLen < 0 || strLen > headerBoundary - ptr) {
throw TTransportException(TTransportException::CORRUPTED_DATA,
"Info header length exceeds header size");
}

Verified: Docker two-state proof via subclass harness — assertion that ptr is unchanged after exception fails on PR code, passes on base.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

Client: cpp

when reading the key/value info headers of a THeader frame, readString reads
the length varint and then bounds it against the bytes left in the header
section. the comparison uses ptr before it is moved past the varint, so the
remaining count is overstated by the width of the length field, and a negative
length (a varint with the high bit set) is not rejected at all. with a header
section sized to fill the receive buffer, either case lets a wire-supplied
length exceed the header bytes that are actually present.

bound the length against the position that follows the varint, reject a
negative length, and only advance ptr once those checks pass so the documented
advance-on-success behaviour still holds. regression tests covering both the
oversized and the negative length are added to ThrifttReadCheckTests.
@dxbjavid dxbjavid force-pushed the header-string-bound branch from 5c190a7 to 050a1ce Compare July 1, 2026 14:18
@dxbjavid

dxbjavid commented Jul 1, 2026

Copy link
Copy Markdown
Contributor Author

all three addressed in the latest push.

  1. reworded the test comment to describe the invariant (the key length has to fit within the header bytes that remain after the length varint) rather than the failure mode. did the same for the commit body.

  2. added test_theadertransport_info_header_string_negative_length. it feeds a 5-byte key-length varint with the top bit set (0x80 0x80 0x80 0x80 0x08, decoding to INT32_MIN) and asserts a TTransportException, where the old code would let it reach assign and raise length_error.

  3. fair point on the contract. rather than loosen the doc comment i kept the advance-on-success behaviour instead: the bounds check now runs against a local pointer past the varint and ptr is only moved once the checks pass, so a caller that catches the exception still sees ptr unchanged.

also read your email, thanks.

Client: cpp

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
@Jens-G Jens-G closed this in f961cdb Jul 2, 2026
@dxbjavid

dxbjavid commented Jul 3, 2026

Copy link
Copy Markdown
Contributor Author

Thank you @Jens-G. Appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

c++ Pull requests that update C++ code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants