Commit 1f12ed1

gakonst and ampcode-com authored
fix: security hardening for SQL query API (#81)
* fix: replace string-based SQL injection in inject_block_filter with AST manipulation

  The previous implementation used string position matching (finding WHERE,
  ORDER BY, LIMIT keywords) and format! interpolation to splice block_num
  filters into user-provided SQL. This was vulnerable to structural SQL
  injection where crafted queries could exploit the naive keyword matching
  (e.g. WHERE inside string literals, UNION bypasses).

  Replace with sqlparser AST parsing and manipulation:
  - Parse user SQL into AST, requiring a single simple SELECT statement
  - Determine filter column from the FROM table (num for blocks, block_num for others)
  - Safely AND the block filter into the existing WHERE clause (or add one)
  - Serialize modified AST back to SQL

  Also:
  - Reject UNION/INTERSECT/set operations in live mode (ambiguous filtering)
  - Return Result<String, ApiError> instead of String for proper error handling
  - Add Display impl for ApiError
  - Add tests for UNION rejection, non-SELECT rejection, and WHERE keyword in string literals

  Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67
  Co-authored-by: Amp <amp@ampcode.com>

* fix: add table allowlist to query validator and read-only API role

  Replace the blocklist-only approach with a table allowlist so API users can
  only query: blocks, txs, logs, receipts, token_holders, token_balances, and
  CTE-defined tables. Previously, users could query sync_state (internal),
  pg_tables (schema enumeration), or any other table accessible to the tidx
  DB user.
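To illustrate why the keyword-splicing approach was structurally unsafe, here is a minimal, hypothetical reconstruction of that pattern (simplified for illustration; not the project's actual code). A WHERE that appears only inside a string literal fools the position search, so the injected filter lands in the middle of the literal and corrupts the query:

```rust
// Hypothetical reconstruction of the old string-splicing approach,
// simplified for illustration; not the project's actual code.
fn naive_inject(sql: &str, block_num: u64) -> String {
    let upper = sql.to_uppercase();
    if let Some(pos) = upper.find("WHERE") {
        // Splice "block_num = N AND" right after the first "WHERE" found,
        // wherever it happens to be -- even inside a string literal.
        let insert = pos + "WHERE".len();
        format!("{} block_num = {} AND{}", &sql[..insert], block_num, &sql[insert..])
    } else {
        // No WHERE found: append one at the end.
        format!("{sql} WHERE block_num = {block_num}")
    }
}

fn main() {
    // The real WHERE: the filter lands in the right place.
    println!("{}", naive_inject("SELECT * FROM logs WHERE topic0 = 'x'", 7));
    // A WHERE inside a string literal: the filter is spliced into the
    // middle of the literal, producing invalid SQL.
    println!("{}", naive_inject("SELECT 'contains WHERE keyword' FROM logs", 7));
}
```

Parsing to an AST, mutating the `selection` field, and re-serializing sidesteps this whole class of bug, because the parser knows which WHERE is a keyword and which is data.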
  Changes:
  - Allowlist in validator: only permitted tables + CTE-defined names pass
  - Block dblink function family (cross-database access)
  - Add db/api_role.sql migration creating a tidx_api read-only role with
    SELECT-only grants on indexed tables (defense-in-depth)
  - Thread CTE names through all validate_* functions
  - 6 new tests: sync_state rejected, pg_tables rejected, unknown table
    rejected, CTE tables allowed, dblink blocked, analytics tables allowed

  Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67
  Co-authored-by: Amp <amp@ampcode.com>

* fix: close DoS, privilege escalation, and file read bypass vectors

  Block three categories of attacks:

  1. DoS via resource exhaustion:
     - Reject WITH RECURSIVE (endless loop CTEs)
     - Block generate_series() (billion-row generation)
     - Block SELECT INTO (object creation)

  2. Privilege escalation / validator bypass:
     - Validate expressions inside VALUES rows (previously VALUES(pg_sleep(10))
       bypassed the entire function blocklist)
     - Reject TABLE statement (TABLE pg_shadow bypassed table allowlist)
     - Validate GROUP BY, HAVING, JOIN ON expressions (could hide function calls)
     - Walk IsNull/IsNotNull/IsTrue/IsFalse/Like expressions recursively

  3. File read hardening:
     - Block lo_get/lo_open/lo_close/loread/lo_creat/lo_create/lo_unlink/lo_put
     - Block pg_file_read/pg_file_write/pg_file_rename/pg_file_unlink/pg_logdir_ls
     - VALUES bypass closure prevents pg_read_file via VALUES(...)

  11 new tests covering all vectors.

  Amp-Thread-ID: https://ampcode.com/threads/T-019c3272-f632-763c-8078-504a90852a67
  Co-authored-by: Amp <amp@ampcode.com>

* fix: replace function blocklist with allowlist and harden API role

  Switch from a function blocklist (reject known-bad) to an allowlist (permit
  known-good only). This eliminates the risk of missing dangerous functions as
  PostgreSQL adds new ones.
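The table-allowlist rule described above can be sketched as follows. This is a hypothetical, simplified stand-in (the real validator walks sqlparser's AST and threads CTE names through its validate_* functions), but it shows the core check: a table name passes only if it is an indexed table or a CTE defined in the query itself.

```rust
use std::collections::HashSet;

// Indexed tables API users may query (from the commit message).
const ALLOWED_TABLES: &[&str] = &[
    "blocks", "txs", "logs", "receipts", "token_holders", "token_balances",
];

// A table reference is valid only if it is allowlisted or defined as a CTE
// in the same query. Everything else (sync_state, pg_tables, ...) is rejected.
fn table_allowed(name: &str, cte_names: &HashSet<String>) -> bool {
    let lower = name.to_lowercase();
    ALLOWED_TABLES.contains(&lower.as_str()) || cte_names.contains(&lower)
}

fn main() {
    let mut ctes = HashSet::new();
    ctes.insert("recent".to_string()); // e.g. WITH recent AS (SELECT ...)

    assert!(table_allowed("blocks", &ctes));
    assert!(table_allowed("recent", &ctes));      // CTE-defined name passes
    assert!(!table_allowed("sync_state", &ctes)); // internal table rejected
    assert!(!table_allowed("pg_tables", &ctes));  // schema enumeration blocked
    println!("allowlist checks passed");
}
```

An allowlist fails closed: a table added to the database later is unreachable through the API until someone deliberately adds it, which is the safer default for a public query surface.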
  Validator changes:
  - ALLOWED_FUNCTIONS allowlist: ABI helpers, aggregates, scalars, string,
    numeric, time, window functions, and type casting
  - Reject ALL table functions (FROM func(...)) unconditionally
  - Reject unsupported TableFactor variants (catch-all _ => Err)
  - Remove is_dangerous_function() and is_dangerous_table_function()

  API role hardening (db/api_role.sql):
  - Deny-by-default: REVOKE ALL on tables, sequences, and functions before
    granting specific access
  - Add token_holders and token_balances to SELECT grants
  - CONNECTION LIMIT 64
  - statement_timeout = 30s, work_mem = 64MB, temp_file_limit = 256MB

  5 new tests, all 147 lib tests passing.

  Amp-Thread-ID: https://ampcode.com/threads/T-019c499a-f07e-73a9-9526-6c18fd511372
  Co-authored-by: Amp <amp@ampcode.com>

* fix: reject-by-default expression validation, LIMIT/depth/size caps

  Switch validate_expr to reject-by-default: only explicitly allowed expression
  types are permitted (identifiers, literals, binary/unary ops, CASE, CAST,
  BETWEEN, IN, LIKE, subqueries, SQL builtins like EXTRACT, SUBSTRING, TRIM,
  etc). Unknown expression variants are rejected.

  Query structure hardening:
  - Reject FOR UPDATE/SHARE locking clauses
  - Validate ORDER BY expressions through validate_expr
  - Validate LIMIT/OFFSET: must be numeric literals, capped at 10,000
  - Reject LIMIT BY (ClickHouse-specific)
  - Subquery depth limit: max 4 levels of nesting
  - Query size limit: max 64KB
  - Validate window function OVER clause (PARTITION BY, ORDER BY)

  Service layer:
  - Replace string-based LIMIT detection (contains("LIMIT")) with AST-based
    detection via append_limit_if_missing()
  - Use HARD_LIMIT_MAX constant (10,000) across validator, service, and API
  - API param clamping uses HARD_LIMIT_MAX instead of hardcoded 100,000

  15 new tests, all 162 lib tests passing.
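The reject-by-default idea from the last two messages can be sketched with a toy expression type. This is a hypothetical, simplified model (the real code matches on sqlparser's much larger Expr enum): known-safe variants recurse into their children, function names go through an allowlist, and any variant the validator does not explicitly handle is an error, so new expression kinds cannot slip through silently.

```rust
// Hypothetical, simplified stand-in for the SQL expression AST; the real
// validator walks sqlparser's Expr enum.
enum Expr {
    Ident(String),
    Number(i64),
    BinaryOp(Box<Expr>, Box<Expr>),
    Function(String, Vec<Expr>),
    Unknown, // stands in for any variant the validator does not recognize
}

// Illustrative subset of an ALLOWED_FUNCTIONS list.
const ALLOWED_FUNCTIONS: &[&str] = &["count", "sum", "avg", "min", "max", "abi_uint"];

fn validate_expr(e: &Expr) -> Result<(), String> {
    match e {
        Expr::Ident(_) | Expr::Number(_) => Ok(()),
        Expr::BinaryOp(l, r) => {
            validate_expr(l)?;
            validate_expr(r)
        }
        Expr::Function(name, args) => {
            // Allowlist, not blocklist: unknown function names are rejected.
            if !ALLOWED_FUNCTIONS.contains(&name.to_lowercase().as_str()) {
                return Err(format!("function not allowed: {name}"));
            }
            // Recurse into arguments so nothing hides inside a safe call.
            args.iter().try_for_each(validate_expr)
        }
        // Reject-by-default: any variant not explicitly handled is an error.
        Expr::Unknown => Err("unsupported expression type".to_string()),
    }
}

fn main() {
    let ok = Expr::Function("count".into(), vec![Expr::Ident("block_num".into())]);
    assert!(validate_expr(&ok).is_ok());

    // pg_sleep nested inside an allowed function is still caught, because
    // validation recurses into arguments -- the VALUES/FILTER bypasses fixed
    // in this PR were exactly missing recursion points like this.
    let sneaky = Expr::Function(
        "sum".into(),
        vec![Expr::Function("pg_sleep".into(), vec![Expr::Number(10)])],
    );
    assert!(validate_expr(&sneaky).is_err());
    assert!(validate_expr(&Expr::Unknown).is_err());
    println!("reject-by-default checks passed");
}
```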
  Amp-Thread-ID: https://ampcode.com/threads/T-019c499a-f07e-73a9-9526-6c18fd511372
  Co-authored-by: Amp <amp@ampcode.com>

* fix: close FILTER clause bypass, LIMIT NULL, FETCH, negative limit

  Fixes found during oracle review:
  - Validate function FILTER (WHERE ...) clause: previously COUNT(*) FILTER
    (WHERE pg_sleep(1) IS NOT NULL) bypassed the function allowlist entirely
  - Validate WITHIN GROUP (ORDER BY ...) expressions
  - Reject LIMIT NULL (effectively means no limit, bypasses cap)
  - Reject negative LIMIT/OFFSET values
  - Reject FETCH clause (FETCH FIRST N ROWS ONLY bypasses LIMIT cap)

  4 new tests, all 166 lib tests passing.

  Amp-Thread-ID: https://ampcode.com/threads/T-019c499a-f07e-73a9-9526-6c18fd511372
  Co-authored-by: Amp <amp@ampcode.com>

---------

Co-authored-by: Amp <amp@ampcode.com>
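The LIMIT rules scattered across the last two messages (numeric literal only, non-negative, capped at 10,000, LIMIT NULL rejected) can be collected into one small check. This is a hypothetical sketch with illustrative names, not the project's actual API:

```rust
// Cap from the commit messages: LIMIT/OFFSET values above this are rejected.
const HARD_LIMIT_MAX: i64 = 10_000;

// Hypothetical sketch of the LIMIT validation rules. `raw` is the textual
// LIMIT value, or None when the query says LIMIT NULL.
fn validate_limit(raw: Option<&str>) -> Result<i64, String> {
    // LIMIT NULL means "no limit" in PostgreSQL, which would bypass the cap.
    let s = raw.ok_or_else(|| "LIMIT NULL is not allowed".to_string())?;
    // Must be a plain numeric literal -- expressions are not evaluated.
    let n: i64 = s
        .parse()
        .map_err(|_| "LIMIT must be a numeric literal".to_string())?;
    if n < 0 {
        return Err("negative LIMIT is not allowed".to_string());
    }
    if n > HARD_LIMIT_MAX {
        return Err(format!("LIMIT exceeds the cap of {HARD_LIMIT_MAX}"));
    }
    Ok(n)
}

fn main() {
    assert_eq!(validate_limit(Some("100")), Ok(100));
    assert!(validate_limit(Some("20000")).is_err()); // above the 10,000 cap
    assert!(validate_limit(Some("-1")).is_err());    // negative value
    assert!(validate_limit(None).is_err());          // LIMIT NULL
    println!("limit checks passed");
}
```

Note that capping LIMIT alone is not enough, which is why the PR also rejects the FETCH clause: `FETCH FIRST N ROWS ONLY` is an alternate spelling of the same row cap and would otherwise sidestep this check.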
1 parent daf8c45 commit 1f12ed1

File tree

7 files changed: +952 additions, -229 deletions

7 files changed

+952
-229
lines changed

db/api_role.sql

Lines changed: 35 additions & 0 deletions

@@ -0,0 +1,35 @@
+-- Read-only API role for query connections.
+-- Defense-in-depth: even if the query validator is bypassed,
+-- this role cannot modify data, execute arbitrary functions,
+-- or exhaust server resources.
+DO $$
+BEGIN
+    IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'tidx_api') THEN
+        CREATE ROLE tidx_api WITH LOGIN PASSWORD 'tidx_api' NOSUPERUSER NOCREATEDB NOCREATEROLE;
+    END IF;
+END $$;
+
+-- Revoke all privileges first (deny-by-default)
+REVOKE ALL ON ALL TABLES IN SCHEMA public FROM tidx_api;
+REVOKE ALL ON ALL SEQUENCES IN SCHEMA public FROM tidx_api;
+REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA public FROM tidx_api;
+
+-- Grant read-only access to indexed tables only
+GRANT USAGE ON SCHEMA public TO tidx_api;
+GRANT SELECT ON blocks, txs, logs, receipts, token_holders, token_balances TO tidx_api;
+
+-- Grant execute only on ABI decode helper functions
+GRANT EXECUTE ON FUNCTION abi_uint(bytea) TO tidx_api;
+GRANT EXECUTE ON FUNCTION abi_int(bytea) TO tidx_api;
+GRANT EXECUTE ON FUNCTION abi_address(bytea) TO tidx_api;
+GRANT EXECUTE ON FUNCTION abi_bool(bytea) TO tidx_api;
+GRANT EXECUTE ON FUNCTION abi_bytes(bytea, int) TO tidx_api;
+GRANT EXECUTE ON FUNCTION abi_string(bytea, int) TO tidx_api;
+GRANT EXECUTE ON FUNCTION format_address(bytea) TO tidx_api;
+GRANT EXECUTE ON FUNCTION format_uint(bytea) TO tidx_api;
+
+-- Resource limits (prevent DoS)
+ALTER ROLE tidx_api CONNECTION LIMIT 64;
+ALTER ROLE tidx_api SET statement_timeout = '30s';
+ALTER ROLE tidx_api SET work_mem = '64MB';
+ALTER ROLE tidx_api SET temp_file_limit = '256MB';

src/api/mod.rs

Lines changed: 102 additions & 45 deletions

@@ -257,7 +257,7 @@ fn default_timeout() -> u64 {
     5000
 }
 fn default_limit() -> i64 {
-    10000
+    crate::query::HARD_LIMIT_MAX
 }

 #[derive(Serialize)]
@@ -299,7 +299,7 @@ async fn handle_query_once(

     let options = QueryOptions {
         timeout_ms: params.timeout_ms.clamp(100, 30000),
-        limit: params.limit.clamp(1, 100000),
+        limit: params.limit.clamp(1, crate::query::HARD_LIMIT_MAX),
     };

     // Route to appropriate engine
@@ -397,7 +397,7 @@ async fn handle_query_live(
     let signature = params.signature;
     let options = QueryOptions {
         timeout_ms: params.timeout_ms.clamp(100, 30000),
-        limit: params.limit.clamp(1, 100000),
+        limit: params.limit.clamp(1, crate::query::HARD_LIMIT_MAX),
     };

     // Detect if this is an OLAP query (aggregations, etc.)
@@ -477,7 +477,16 @@
     } else {
         let catch_up_start = last_block_num + 1;
         for block_num in catch_up_start..=end {
-            let filtered_sql = inject_block_filter(&sql, block_num);
+            let filtered_sql = match inject_block_filter(&sql, block_num) {
+                Ok(s) => s,
+                Err(e) => {
+                    yield Ok(SseEvent::default()
+                        .event("error")
+                        .json_data(serde_json::json!({ "ok": false, "error": e.to_string() }))
+                        .unwrap());
+                    return;
+                }
+            };
            match crate::service::execute_query_postgres(&pool, &filtered_sql, signature.as_deref(), &options).await {
                Ok(result) => {
                    yield Ok(SseEvent::default()
@@ -516,50 +525,85 @@
 /// Inject a block number filter into SQL query for live streaming.
 /// Transforms queries to only return data for the specific block.
 /// Uses 'num' for blocks table, 'block_num' for txs/logs tables.
+///
+/// Uses sqlparser AST manipulation to safely add the filter condition,
+/// avoiding SQL injection risks from string-based splicing.
 #[doc(hidden)]
-pub fn inject_block_filter(sql: &str, block_num: u64) -> String {
-    let sql_upper = sql.to_uppercase();
-
-    // Determine column name based on table being queried
-    let col = if sql_upper.contains("FROM BLOCKS") || sql_upper.contains("FROM \"BLOCKS\"") {
-        "num"
-    } else {
-        "block_num"
+pub fn inject_block_filter(sql: &str, block_num: u64) -> Result<String, ApiError> {
+    use sqlparser::ast::{
+        BinaryOperator, Expr, Ident, SetExpr, Statement, Value,
     };
-
-    // Find WHERE clause position
-    if let Some(where_pos) = sql_upper.find("WHERE") {
-        // Insert after WHERE
-        let insert_pos = where_pos + 5;
-        format!(
-            "{} {} = {} AND {}",
-            &sql[..insert_pos],
-            col,
-            block_num,
-            &sql[insert_pos..]
-        )
-    } else if let Some(order_pos) = sql_upper.find("ORDER BY") {
-        // Insert WHERE before ORDER BY
-        format!(
-            "{} WHERE {} = {} {}",
-            &sql[..order_pos],
-            col,
-            block_num,
-            &sql[order_pos..]
-        )
-    } else if let Some(limit_pos) = sql_upper.find("LIMIT") {
-        // Insert WHERE before LIMIT
-        format!(
-            "{} WHERE {} = {} {}",
-            &sql[..limit_pos],
-            col,
-            block_num,
-            &sql[limit_pos..]
-        )
-    } else {
-        // Append WHERE at end
-        format!("{sql} WHERE {col} = {block_num}")
+    use sqlparser::dialect::GenericDialect;
+    use sqlparser::parser::Parser;
+
+    let dialect = GenericDialect {};
+    let mut statements = Parser::parse_sql(&dialect, sql)
+        .map_err(|e| ApiError::BadRequest(format!("SQL parse error: {e}")))?;
+
+    if statements.len() != 1 {
+        return Err(ApiError::BadRequest(
+            "Live mode requires exactly one SQL statement".to_string(),
+        ));
     }
+
+    let stmt = &mut statements[0];
+    let query = match stmt {
+        Statement::Query(q) => q,
+        _ => {
+            return Err(ApiError::BadRequest(
+                "Live mode requires a SELECT query".to_string(),
+            ))
+        }
+    };
+
+    let select = match query.body.as_mut() {
+        SetExpr::Select(s) => s,
+        _ => {
+            return Err(ApiError::BadRequest(
+                "Live mode requires a simple SELECT query (UNION/INTERSECT not supported)"
+                    .to_string(),
+            ))
+        }
+    };
+
+    let table_name: String = select
+        .from
+        .first()
+        .and_then(|twj| match &twj.relation {
+            sqlparser::ast::TableFactor::Table { name, .. } => {
+                name.0.last().and_then(|part| part.as_ident()).map(|ident| ident.value.to_lowercase())
+            }
+            _ => None,
+        })
+        .ok_or_else(|| {
+            ApiError::BadRequest(
+                "Live mode requires a query with a FROM table clause".to_string(),
+            )
+        })?;
+
+    let col_name = if table_name == "blocks" { "num" } else { "block_num" };
+
+    let col_expr = Expr::CompoundIdentifier(vec![
+        Ident::new(&table_name),
+        Ident::new(col_name),
+    ]);
+
+    let block_filter = Expr::BinaryOp {
+        left: Box::new(col_expr),
+        op: BinaryOperator::Eq,
+        right: Box::new(Expr::Value(Value::Number(block_num.to_string(), false).into())),
+    };
+
+    select.selection = Some(match select.selection.take() {
+        Some(existing) => Expr::BinaryOp {
+            left: Box::new(Expr::Nested(Box::new(existing))),
+            op: BinaryOperator::And,
+            right: Box::new(block_filter),
+        },
+        None => block_filter,
+    });
+
+    Ok(stmt.to_string())
 }

 /// Rewrite analytics table references to include chain-specific database prefix.
@@ -599,6 +643,19 @@ pub enum ApiError {
     NotFound(String),
 }

+impl std::fmt::Display for ApiError {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        match self {
+            ApiError::BadRequest(msg) => write!(f, "{msg}"),
+            ApiError::Timeout => write!(f, "Query timeout"),
+            ApiError::QueryError(msg) => write!(f, "{msg}"),
+            ApiError::Internal(msg) => write!(f, "{msg}"),
+            ApiError::Forbidden(msg) => write!(f, "{msg}"),
+            ApiError::NotFound(msg) => write!(f, "{msg}"),
+        }
+    }
+}
+
 impl IntoResponse for ApiError {
     fn into_response(self) -> axum::response::Response {
         let (status, message) = match self {

src/db/schema.rs

Lines changed: 3 additions & 0 deletions

@@ -39,6 +39,9 @@ pub async fn run_migrations(pool: &Pool) -> Result<()> {
     // Load any optional extensions
     conn.batch_execute(include_str!("../../db/extensions.sql")).await?;

+    // Create read-only API role with SELECT-only access to indexed tables
+    conn.batch_execute(include_str!("../../db/api_role.sql")).await?;
+
     Ok(())
 }

src/query/mod.rs

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ pub use parser::{
     extract_order_by_columns, AbiParam, AbiType, EventSignature,
 };
 pub use router::{route_query, QueryEngine};
-pub use validator::validate_query;
+pub use validator::{validate_query, HARD_LIMIT_MAX};

 use regex_lite::Regex;
 use std::sync::LazyLock;
