Cherry-pick improvements from PR #2 (marcredhat)

- s1_client: configurable PowerQuery timeout via SDL_PQ_TIMEOUT env var
  (default 600s, previously hardcoded to 120s) with separate connect/read
  timeouts via httpx.Timeout; retry on ReadTimeout, configured via
  SDL_PQ_TIMEOUT_RETRIES; error messages now include a query snippet,
  and non-JSON responses are parsed
- ingest: fix simulate-filter SDL syntax (== → =, drop leading | on base
  expression, surface PowerQuery error field, cleaner empty-filter fallback)
- docker-compose: pass SDL_PQ_TIMEOUT and SDL_PQ_TIMEOUT_RETRIES through
  to backend container with sensible defaults

Not taken from PR #2:
- .gitignore parsers/* change — would untrack the 7 committed parser files
- s1_client/quality/coverage changes already present in main from prior work

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
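
The s1_client item above describes an env-configurable timeout plus retries on read timeouts. A minimal sketch of that pattern follows, not the code from this commit: the helper names `load_pq_settings` and `call_with_timeout_retries` are hypothetical, the retry default of 1 is an assumption (the message states only the 600s timeout default), and the built-in `TimeoutError` stands in for `httpx.ReadTimeout` so the example runs with the standard library alone.

```python
import os
import time
from typing import Callable, Mapping, Tuple, Type, TypeVar

T = TypeVar("T")


def load_pq_settings(env: Mapping[str, str] = os.environ) -> Tuple[float, int]:
    """Read the PowerQuery timeout (seconds) and retry count from the environment."""
    # 600s default is taken from the commit message.
    timeout = float(env.get("SDL_PQ_TIMEOUT", "600"))
    # Default of 1 retry is an assumption; the commit message does not state it.
    retries = int(env.get("SDL_PQ_TIMEOUT_RETRIES", "1"))
    return timeout, max(retries, 0)


def call_with_timeout_retries(
    fn: Callable[[], T],
    retries: int,
    timeout_exc: Type[BaseException] = TimeoutError,
    backoff_s: float = 0.0,
) -> T:
    """Call fn, retrying up to `retries` extra times when it raises timeout_exc."""
    attempt = 0
    while True:
        try:
            return fn()
        except timeout_exc:
            if attempt >= retries:
                raise  # retries exhausted: surface the timeout to the caller
            attempt += 1
            time.sleep(backoff_s)
```

In a real client the loaded timeout would presumably be passed to `httpx.Timeout` as the read timeout, and `timeout_exc` would be `httpx.ReadTimeout`. Only timeouts are retried here; any other exception propagates on the first attempt.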
Mick
2026-05-22 10:11:42 -04:00
parent c5a4f796a0
commit 2c40bf81ee
3 changed files with 43 additions and 12 deletions
+13 -6
@@ -90,21 +90,28 @@ async def simulate_filter(rule: FilterRule):
     """Estimate how many events and GB would be eliminated by an exclusion filter."""
     from_dt, to_dt = _date_range(rule.days)
+    # Build Scalyr filter expression clauses (uses = not ==, SDL syntax)
     clauses = []
     if rule.source:
-        clauses.append(f"dataSource.name=='{rule.source}'")
+        clauses.append(f"dataSource.name = '{rule.source}'")
     if rule.event_type:
-        clauses.append(f"event.type=='{rule.event_type}'")
+        clauses.append(f"event.type = '{rule.event_type}'")
     if clauses:
-        filter_expr = " and ".join(clauses)
-        query = f"| filter {filter_expr} | group events=count()"
+        filter_expr = " ".join(clauses)
+        query = f"{filter_expr} | group events=count()"
     else:
-        query = "| group events=count()"
+        query = "dataSource.name != '' | group events=count()"
     try:
         result = await s1_client.run_powerquery(query, from_dt, to_dt)
-        events = (result.get("events") or [{}])[0].get("events", 0) if isinstance(result.get("events"), list) else 0
+        err = result.get("error") if isinstance(result, dict) else None
+        if err:
+            raise HTTPException(502, f"PowerQuery error: {err}")
+        rows = result.get("events") or []
+        events = rows[0].get("events", 0) if rows else 0
+    except HTTPException:
+        raise
     except Exception as e:
         raise HTTPException(502, f"PowerQuery error: {e}")
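
The query construction and result handling in this hunk can be pulled out into pure functions for unit testing. The following is a sketch under stated assumptions, not code from the commit: `build_simulate_query`, `extract_event_count`, and `PowerQueryError` are hypothetical names, and `PowerQueryError` stands in for the `HTTPException(502, ...)` raised by the endpoint so that the example needs no web framework.

```python
from typing import Any, Optional


class PowerQueryError(RuntimeError):
    """Stand-in for the HTTPException(502, ...) raised in the endpoint."""


def build_simulate_query(source: Optional[str], event_type: Optional[str]) -> str:
    """Build the PowerQuery string the way the updated hunk does."""
    clauses = []
    if source:
        clauses.append(f"dataSource.name = '{source}'")
    if event_type:
        clauses.append(f"event.type = '{event_type}'")
    # Clauses are joined with a space, as in the hunk; with no clauses the
    # hunk falls back to a filter on a non-empty dataSource.name.
    base = " ".join(clauses) if clauses else "dataSource.name != ''"
    return f"{base} | group events=count()"


def extract_event_count(result: Any) -> int:
    """Return the event count, surfacing the response's error field if set."""
    if not isinstance(result, dict):
        # The endpoint would hit its generic `except Exception` branch here.
        raise PowerQueryError(f"PowerQuery error: unexpected response {result!r}")
    err = result.get("error")
    if err:
        raise PowerQueryError(f"PowerQuery error: {err}")
    rows = result.get("events") or []
    return rows[0].get("events", 0) if rows else 0
```

As in the hunk, `source` and `event_type` are interpolated into the query without escaping, so a value containing a single quote would produce a malformed expression; whether that matters depends on how `FilterRule` validates its fields, which this page does not show.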