roachtest/operations: add hold-connection operation #171302

Open
Dev-Kyle wants to merge 1 commit into cockroachdb:master from Dev-Kyle:drt

Dev-Kyle commented Jun 1, 2026

Contributor

Summary

Adds a new DRT operation, hold-connection, that opens a single SQL session and keeps it alive for 30 days, periodically running queries with varied fingerprints. The intent is to exercise long-lived session state under steady-state DRT load.

The op is a small piece of orchestration: pin a single physical connection via pool.Conn(ctx), run a keepalive query every 5 minutes from a 7-entry rotation, and emit a heartbeat status line every hour so the long run is observable rather than silent.

Notes for reviewers

  • The 30-day (720h) timeout is 7.5x the longest existing timeout in pkg/cmd/roachtest/operations/ (the prior max is 96h on backup_restore). Please flag any DRT scheduler concerns with an op that holds a slot for this long.
  • The keepalive query set is intentionally small and hardcoded; the goal is fingerprint variety with minimal mechanism. Open to feedback on whether a different set better exercises the cache surface.
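For a sense of scale, the numbers in this description work out as below. The constant names are made up for this sketch; only the durations (30 days, 5 minutes, 1 hour, 96h) come from the description.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical constant names; the values are the ones quoted in this PR.
const (
	holdDuration      = 30 * 24 * time.Hour
	keepAliveInterval = 5 * time.Minute
	heartbeatInterval = time.Hour
	priorMaxTimeout   = 96 * time.Hour // backup_restore
)

func main() {
	// Ratio of the new timeout to the previous maximum.
	fmt.Println(holdDuration.Hours() / priorMaxTimeout.Hours()) // 7.5
	// Approximate number of keepalive queries over one full run.
	fmt.Println(int(holdDuration / keepAliveInterval)) // 8640
	// Approximate number of heartbeat status lines over one full run.
	fmt.Println(int(holdDuration / heartbeatInterval)) // 720
}
```

So one full run issues on the order of 8,640 keepalive queries and 720 heartbeat lines on a single session.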

Resolves: #122155
Epic: none

Dev-Kyle requested a review from a team as a code owner June 1, 2026 16:14
Dev-Kyle requested review from cpj2195 and williamchoe3 and removed request for a team June 1, 2026 16:14
blathers-crl Bot commented Jun 1, 2026

Detected infrastructure failure (matched: self-hosted runner lost communication with the server). Automatically rerunning failed jobs. (run link)

cockroach-teamcity added the X-perf-gain label (Microbenchmarks CI: Added if a performance gain is detected) Jun 1, 2026

The hold-connection operation opens a single SQL session against the
cluster and holds it open for an extended duration (30 days),
periodically running a small rotation of distinct queries to keep the
session alive on the server.

The operation pins one underlying connection via pool.Conn(ctx) rather
than relying on the *gosql.DB pool, ensuring the held session is the
same physical connection for its entire lifetime. A keepalive ticker
(5m) sends queries with varied fingerprints so the session accumulates
cache state over time; a separate heartbeat ticker (1h) emits an
operator-visible status line with remaining duration and queries
executed so the long run is observable rather than silent.

The 30-day duration is intentionally longer than any existing
operation timeout in this package (the prior maximum was 96h on
backup_restore). Reviewers should flag any DRT scheduler concerns
with an operation that holds a slot for this long.

Resolves: cockroachdb#122155
Epic: none

Release note: None

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Labels

X-perf-gain Microbenchmarks CI: Added if a performance gain is detected

Development

Successfully merging this pull request may close these issues.

drt: consider having long-running sessions with some simple workload