Fix: detect mutating keywords inside CTEs in read-only mode#292
Fix: detect mutating keywords inside CTEs in read-only mode#292tianzhou wants to merge 12 commits into
Conversation
The read-only SQL check only examined the first keyword, so queries like `WITH updated AS (UPDATE ...) SELECT ...` bypassed the restriction. Now scans the full statement (after stripping comments and strings) for DML/DDL keywords. Closes #271 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
REPLACE followed by ( is a string function call (e.g. SELECT REPLACE(...)), not a mutating REPLACE INTO statement. Use negative lookahead to avoid false positives. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Only scan for mutating keywords when first keyword is WITH (fixes false rejection of SHOW CREATE TABLE in MySQL/MariaDB) - Add SELECT ... INTO detection for Postgres/MySQL/SQL Server - EXPLAIN with DML is allowed since it doesn't execute the statement - Add 6 more tests for SHOW CREATE, EXPLAIN, and SELECT INTO Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n CTEs - WITH statements now also check for SELECT ... INTO - EXPLAIN ANALYZE with DML is rejected (Postgres executes the statement) - EXPLAIN without ANALYZE remains allowed with DML (plan only) - Fixed regex word boundary for parenthesized EXPLAIN (ANALYZE) form - Added tests for REPLACE() inside CTEs, WITH...SELECT INTO, EXPLAIN ANALYZE DELETE, EXPLAIN (ANALYZE) DELETE Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Reuse isReadOnlySQL recursively on the statement after EXPLAIN ANALYZE, so SELECT INTO and other write-capable forms are also caught. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strip optional VERBOSE keyword after ANALYZE so the recursive isReadOnlySQL check receives the actual inner statement. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
REPLACE INTO is a MySQL/MariaDB-specific statement. In Postgres/SQLite/ SQL Server, REPLACE is only a function, so it should not trigger the mutating keyword check. This avoids false rejections when 'replace' appears as a CTE name or alias in non-MySQL dialects. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove ANALYZE from allowed first keywords since it updates statistics (writes to system catalogs) in Postgres/MySQL/SQLite - EXPLAIN (ANALYZE false/off/0) does not execute, so allow it with DML - Add MySQL/MariaDB REPLACE() function tests to ensure no false positives Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous regex `replace(?!\s*\()` rejected any standalone word `replace` that wasn't a function call, causing false positives for identifiers named `replace` (e.g. CTE aliases). Now only matches `REPLACE [LOW_PRIORITY|DELAYED] INTO` which is the actual MySQL REPLACE statement syntax. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SQLite also supports REPLACE INTO, so the mutating keyword scan needs to detect it for SQLite connectors, not just MySQL/MariaDB. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Merge mutatingPatternSQLite into shared mutatingPatternWithReplace - Replace getMutatingPattern() with Record<ConnectorType, RegExp> lookup - Use match(/\S+/) instead of split(/\s+/)[0] to avoid full array alloc - Use exec() + slice() instead of test() + replace() to avoid double regex - Extract checkReadOnly() to skip re-running stripCommentsAndStrings on recursive EXPLAIN ANALYZE calls - Unify selectIntoPattern branches for select/with Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR tightens DBHub’s read-only SQL enforcement by preventing data-modifying statements from being hidden inside otherwise “allowed” top-level statements (notably WITH ... CTEs and EXPLAIN ANALYZE ...).
Changes:
- Extend
isReadOnlySQL()to scanWITHstatements for mutating keywords (and blockSELECT ... INTOcases). - Add dialect-specific detection for
REPLACE INTOin MySQL/MariaDB/SQLite. - Add recursive validation for
EXPLAIN ANALYZEand expand unit tests for the new read-only checks.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/utils/allowed-keywords.ts |
Adds deeper read-only validation for CTEs / SELECT INTO / EXPLAIN ANALYZE, plus mutating-keyword scanning and MySQL-family REPLACE handling. |
src/utils/__tests__/allowed-keywords.test.ts |
Adds extensive coverage for CTE bypasses, REPLACE handling, SELECT INTO, and EXPLAIN ANALYZE edge cases. |
| * Extended pattern for dialects that support REPLACE INTO (MySQL/MariaDB/SQLite). | ||
| * Only matches `REPLACE INTO` (with optional LOW_PRIORITY/DELAYED), not | ||
| * REPLACE() function calls or identifiers named `replace`. | ||
| */ | ||
| const mutatingPatternWithReplace = new RegExp( | ||
| `\\b(?:${mutatingKeywords.join("|")}|replace\\s+(?:(?:low_priority|delayed)\\s+)?into)\\b`, |
There was a problem hiding this comment.
mutatingPatternWithReplace only matches REPLACE ... INTO, but in MySQL/MariaDB the INTO keyword is optional for REPLACE statements (e.g. REPLACE tbl_name SET ...). As a result, a mutating statement can still bypass readonly mode when wrapped in a WITH statement.
Adjust the dialect pattern to also detect REPLACE statements without INTO (while still avoiding false positives on REPLACE() function calls, e.g. by rejecting replace when it’s not immediately followed by ().
| * Extended pattern for dialects that support REPLACE INTO (MySQL/MariaDB/SQLite). | |
| * Only matches `REPLACE INTO` (with optional LOW_PRIORITY/DELAYED), not | |
| * REPLACE() function calls or identifiers named `replace`. | |
| */ | |
| const mutatingPatternWithReplace = new RegExp( | |
| `\\b(?:${mutatingKeywords.join("|")}|replace\\s+(?:(?:low_priority|delayed)\\s+)?into)\\b`, | |
| * Extended pattern for dialects that support REPLACE (MySQL/MariaDB/SQLite). | |
| * Matches REPLACE statements (including those without INTO) but not REPLACE() | |
| * function calls, by rejecting `replace` when it is immediately followed by `(`. | |
| */ | |
| const mutatingPatternWithReplace = new RegExp( | |
| `\\b(?:${mutatingKeywords.join("|")}|replace(?!\\s*\\()))\\b`, |
There was a problem hiding this comment.
Fixed in 87f75fe. Changed pattern to replace(?!\s*\() to catch all REPLACE statement forms (including REPLACE tbl SET ... without INTO). Accepts the minor false positive on identifiers named replace in MySQL/MariaDB/SQLite as a security tradeoff.
| it("should reject REPLACE INTO as a mutating statement", () => { | ||
| const sql = "REPLACE INTO users (id, name) VALUES (1, 'test')"; | ||
| expect(isReadOnlySQL(sql, "mysql")).toBe(false); | ||
| }); | ||
|
|
||
| it("should allow a CTE named 'replace' in Postgres", () => { | ||
| const sql = "WITH replace AS (SELECT 1) SELECT * FROM replace"; | ||
| expect(isReadOnlySQL(sql, "postgres")).toBe(true); | ||
| }); | ||
|
|
||
| it("should allow a CTE named 'replace' in MySQL", () => { | ||
| const sql = "WITH replace AS (SELECT 1) SELECT * FROM replace"; | ||
| expect(isReadOnlySQL(sql, "mysql")).toBe(true); | ||
| }); | ||
|
|
||
| it("should reject REPLACE (non-function) inside WITH in MySQL", () => { | ||
| const sql = "WITH cte AS (SELECT 1) REPLACE INTO users VALUES (1, 'test')"; | ||
| expect(isReadOnlySQL(sql, "mysql")).toBe(false); | ||
| }); |
There was a problem hiding this comment.
There’s test coverage for REPLACE INTO ..., but no test that asserts readonly rejection for the valid MySQL/MariaDB form REPLACE tbl_name SET ... (where INTO is optional). Adding such a test would prevent regressions and would have caught the current bypass in the pattern matcher.
There was a problem hiding this comment.
Added in 87f75fe. Added tests for REPLACE tbl SET ... (both standalone and inside WITH).
MySQL/MariaDB allow REPLACE without INTO (e.g. REPLACE tbl SET ...). Changed pattern from matching only REPLACE INTO to matching any REPLACE not followed by ( to catch all REPLACE statement forms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
WITH cte AS (UPDATE/INSERT/DELETE ...) SELECT ...to bypass restrictionsREPLACE INTOstatement form for MySQL/MariaDB/SQLite without false-positiving onREPLACE()function calls or identifiersChanges
src/utils/allowed-keywords.ts:REPLACE INTOdetection for MySQL/MariaDB/SQLite via per-dialect pattern mapSELECT ... INTOdetection for both SELECT and WITH statementsEXPLAIN ANALYZEvalidation — recursively checks the inner statementEXPLAIN (ANALYZE false/off)andEXPLAIN ANALYZE VERBOSEcorrectlyANALYZEfrom allowed keywords (it updates statistics)checkReadOnly()to avoid re-runningstripCommentsAndStringson recursive callssrc/utils/__tests__/allowed-keywords.test.ts: Added 40+ test cases covering CTE bypass, REPLACE handling, SELECT INTO, EXPLAIN ANALYZE, and edge casesCloses #271
Test plan
🤖 Generated with Claude Code