Better nar restorer, and ref scanning #1068

Merged
Ericson2314 merged 2 commits into main from scanner on Jun 6, 2026

Conversation

@Ericson2314

Contributor

No description provided.

Ericson2314 and others added 2 commits June 6, 2026 15:52
Callers no longer pay the `spawn_blocking` + `SyncIoBridge` penalty
per NAR entry. The `Sink`-based API is replaced with a simpler
`restore` method that consumes a stream of `NarEvent`s directly.

Co-authored-by: Amaan Qureshi <git@amaanq.com>

Add a new `harmonia-store-ref-scan` crate with a streaming
`RefScanSink` (Boyer-Moore-style window over the nix-base32 alphabet)
for post-build reference discovery without a second disk walk.

Co-authored-by: Amaan Qureshi <git@amaanq.com>
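
The window-with-skipping idea in the commit message above can be sketched as follows. This is a minimal, std-only illustration and not the crate's actual `RefScanSink`: the `RefScanner` type and its `new`, `feed`, and `found` methods are hypothetical names, and it assumes store path hashes are 32 characters drawn from the nix-base32 alphabet. The window is checked right to left, so an invalid byte at offset `j` lets the scanner jump `j + 1` bytes forward, and a short tail is carried between chunks so that a hash split across two chunks is still found.

```rust
use std::collections::HashSet;

/// Length of a store path hash in nix-base32 characters.
const HASH_LEN: usize = 32;

/// The nix-base32 alphabet (it omits 'e', 'o', 'u' and 't').
const ALPHABET: &[u8; 32] = b"0123456789abcdfghijklmnpqrsvwxyz";

fn is_base32(b: u8) -> bool {
    ALPHABET.contains(&b)
}

/// Hypothetical streaming scanner; the names here are assumptions,
/// not the API of `harmonia-store-ref-scan`.
struct RefScanner {
    candidates: HashSet<Vec<u8>>,
    found: HashSet<Vec<u8>>,
    /// Up to HASH_LEN - 1 trailing bytes carried over from earlier chunks.
    tail: Vec<u8>,
}

impl RefScanner {
    fn new<I, S>(hashes: I) -> Self
    where
        I: IntoIterator<Item = S>,
        S: AsRef<[u8]>,
    {
        RefScanner {
            candidates: hashes.into_iter().map(|h| h.as_ref().to_vec()).collect(),
            found: HashSet::new(),
            tail: Vec::new(),
        }
    }

    /// Scan one chunk of the stream.
    fn feed(&mut self, chunk: &[u8]) {
        // Prepend the carried tail so matches straddling chunks are seen.
        // (Copying the chunk keeps the sketch short; a real sink would avoid it.)
        let mut buf = std::mem::take(&mut self.tail);
        buf.extend_from_slice(chunk);

        let mut i = 0;
        while i + HASH_LEN <= buf.len() {
            // Look for the right-most byte in the window that cannot be
            // part of a hash.
            match (0..HASH_LEN).rev().find(|&j| !is_base32(buf[i + j])) {
                // Every window containing that byte is invalid: skip past it.
                Some(j) => i += j + 1,
                // All 32 bytes are in the alphabet: compare against candidates.
                None => {
                    let window = &buf[i..i + HASH_LEN];
                    if self.candidates.contains(window) {
                        self.found.insert(window.to_vec());
                    }
                    i += 1;
                }
            }
        }

        // Keep the last HASH_LEN - 1 bytes for the next chunk.
        let keep = buf.len().min(HASH_LEN - 1);
        self.tail = buf[buf.len() - keep..].to_vec();
    }

    /// Hashes seen so far, sorted for stable output.
    fn found(&self) -> Vec<String> {
        let mut v: Vec<String> = self
            .found
            .iter()
            .map(|h| String::from_utf8_lossy(h).into_owned())
            .collect();
        v.sort();
        v
    }
}
```

A caller would construct the scanner with the hashes of the candidate references (typically the build inputs), call `feed` on each chunk as the NAR bytes stream past, and read `found` at the end. That is what makes a second walk over the output on disk unnecessary.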
@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

This PR introduces harmonia-store-ref-scan, a new streaming crate that detects Nix store path references embedded in byte streams via a constant-time window scanner with Boyer–Moore skipping. The crate is wired into the workspace and dependency graph. Concurrently, the PR refactors NAR restoration in harmonia-file-nar from blocking I/O with background spawn_blocking tasks to fully async operations using tokio::fs, simplifies the NarRestorer struct, removes the JoinError variant, and relaxes trait bounds from 'static to Unpin. Copyright years are updated to 2026, and an unused tempfile dependency is removed.
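
To make the "restorer consumes a stream of events" shape concrete, here is a deliberately simplified sketch. It is synchronous and std-only, so it does not show the `tokio::fs` side of the change, and everything in it is an assumption for illustration: the `NarEvent` variants, their fields, the `check_name` helper, and the `restore` signature are hypothetical rather than the `harmonia-file-nar` API. Symlinks and the executable bit are left out, and the root of the archive is assumed to be a directory.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical event type; the real `NarEvent` may look quite different.
enum NarEvent {
    StartDirectory { name: String },
    EndDirectory,
    File { name: String, contents: Vec<u8> },
}

/// Reject entry names that could escape the directory being restored.
fn check_name(name: &str) -> io::Result<&str> {
    if name.is_empty() || name == "." || name == ".." || name.contains('/') || name.contains('\0') {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "invalid entry name"));
    }
    Ok(name)
}

/// Restore a tree under `root` by consuming events one at a time.
fn restore<I>(root: &Path, events: I) -> io::Result<()>
where
    I: IntoIterator<Item = NarEvent>,
{
    fs::create_dir_all(root)?;
    let mut cur = root.to_path_buf();
    let mut depth = 0usize;

    for event in events {
        match event {
            NarEvent::StartDirectory { name } => {
                cur.push(check_name(&name)?);
                fs::create_dir(&cur)?;
                depth += 1;
            }
            NarEvent::EndDirectory => {
                if depth == 0 {
                    return Err(io::Error::new(
                        io::ErrorKind::InvalidData,
                        "EndDirectory without a matching StartDirectory",
                    ));
                }
                cur.pop();
                depth -= 1;
            }
            NarEvent::File { name, contents } => {
                fs::write(cur.join(check_name(&name)?), &contents)?;
            }
        }
    }

    if depth != 0 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "unclosed directory"));
    }
    Ok(())
}
```

The point of the shape is that the restorer holds only a current path and a depth counter between events, with no intermediate sink object. In the async version described in the walkthrough, each filesystem call would presumably be awaited instead of run on a blocking thread.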

Possibly related PRs

  • nix-community/harmonia#1042: Adjusts STORE_PURE store-crate categorization in scripts/dependency-diagram.py to control how store crates are grouped in generated Mermaid diagrams.

Note: Chuck Norris doesn't refactor code—the code refactors itself out of respect when Chuck Norris enters the repository. In this case, async I/O bows before his async-first sensibilities, and streaming scanners simply know their windows are valid because Chuck Norris validates them with a glance.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | No pull request description was provided by the author, making it impossible to evaluate relevance to the changeset. | Consider adding a description explaining the purpose and impact of the NAR restorer improvements and ref scanning implementation. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Better nar restorer, and ref scanning' directly corresponds to the two main changes: async NAR restoration refactoring and the new reference scanning functionality. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@codecov

codecov Bot commented Jun 6, 2026


Codecov Report

❌ Patch coverage is 92.45283% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.27%. Comparing base (daa3fbe) to head (e4d58f1).
⚠️ Report is 4 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| harmonia-file-nar/src/archive/restorer.rs | 79.24% | 4 Missing and 7 partials ⚠️ |
| harmonia-store-ref-scan/src/lib.rs | 96.85% | 3 Missing and 2 partials ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1068      +/-   ##
==========================================
+ Coverage   64.97%   65.27%   +0.30%     
==========================================
  Files         149      150       +1     
  Lines       17006    17138     +132     
  Branches    17006    17138     +132     
==========================================
+ Hits        11049    11187     +138     
+ Misses       5304     5295       -9     
- Partials      653      656       +3     

| Flag | Coverage Δ |
|---|---|
| aarch64-darwin | 65.40% <93.77%> (+0.31%) ⬆️ |
| aarch64-linux | 64.75% <92.45%> (+0.31%) ⬆️ |
| x86_64-linux | 64.75% <92.45%> (+0.31%) ⬆️ |

@Ericson2314 Ericson2314 added this pull request to the merge queue Jun 6, 2026
Merged via the queue into main with commit e3f461d Jun 6, 2026
5 of 6 checks passed
@Ericson2314 Ericson2314 deleted the scanner branch June 6, 2026 20:02
@github-actions

github-actions Bot commented Jun 6, 2026

Contributor

🐰 Bencher Report

Branch: scanner
Testbed: github-actions

⚠️ WARNING: No Threshold found!

Without a Threshold, no Alerts will ever be generated.

Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds flag.

| Benchmark | Latency, milliseconds (ms) | Threshold |
|---|---|---|
| closure/download | 2,786.10 ms | ⚠️ NO THRESHOLD |
| http/concurrent_16_identity | 206.52 ms | ⚠️ NO THRESHOLD |
| http/concurrent_16_zstd | 1,148.70 ms | ⚠️ NO THRESHOLD |
| http/concurrent_4_identity | 228.72 ms | ⚠️ NO THRESHOLD |
| http/concurrent_4_zstd | 1,314.40 ms | ⚠️ NO THRESHOLD |
| http/sequential_identity | 502.01 ms | ⚠️ NO THRESHOLD |
| http/sequential_zstd | 2,469.50 ms | ⚠️ NO THRESHOLD |
| narinfo/all | 6.19 ms | ⚠️ NO THRESHOLD |

🐰 View full continuous benchmarking report in Bencher

@github-actions

github-actions Bot commented Jun 7, 2026

Contributor

🐰 Bencher Report

Branch: main
Testbed: github-actions

| Benchmark | Latency result, ms (Result Δ%) | Baseline, ms | Lower Boundary, ms (Limit %) | Upper Boundary, ms (Limit %) |
|---|---|---|---|---|
| closure/download | 2,627.70 (-0.68%) | 2,645.56 | 2,513.27 (95.65%) | 2,777.85 (94.59%) |
| http/concurrent_16_identity | 205.81 (+2.66%) | 200.48 | 186.98 (90.85%) | 213.99 (96.18%) |
| http/concurrent_16_zstd | 1,137.00 (+0.42%) | 1,132.22 | 1,074.87 (94.54%) | 1,189.57 (95.58%) |
| http/concurrent_4_identity | 224.65 (+2.75%) | 218.64 | 203.12 (90.42%) | 234.16 (95.94%) |
| http/concurrent_4_zstd | 1,289.80 (+0.05%) | 1,289.11 | 1,218.33 (94.46%) | 1,359.90 (94.85%) |
| http/sequential_identity | 454.70 (-1.61%) | 462.13 | 420.47 (92.47%) | 503.78 (90.26%) |
| http/sequential_zstd | 2,408.90 (+0.01%) | 2,408.74 | 2,276.64 (94.51%) | 2,540.84 (94.81%) |
| narinfo/all | 7.10 (+11.81%) | 6.35 | 5.59 (78.76%) | 7.11 (99.88%) |

🐰 View full continuous benchmarking report in Bencher

@xokdvium

xokdvium commented Jun 7, 2026


@Ericson2314, shouldn't this stuff ideally use cap-std? (As a long-term goal)

@Ericson2314

Contributor Author

@xokdvium yes, see bytecodealliance/cap-std#414 and #1031.

@Ericson2314

Contributor Author

This is work from @Mic92 that I am basically taking as is. I hope it can help with rio dedup.
