Adds an asyncio background task that re-runs the heavy Ingest Dashboard
queries every ~4 min (just under the 5 min TTL) so the in-process cache
is always populated. First user hit on any dashboard widget then returns
from cache (single-digit ms) instead of waiting 30-60s for SDL.
Components:
- backend/services/prewarmer.py: standalone module, opt-in via
INGEST_PREWARM=1; configurable windows via INGEST_PREWARM_HOURS /
INGEST_PREWARM_DAYS / INGEST_PREWARM_DAILY_VOLUME_DAYS and interval
via INGEST_PREWARM_INTERVAL_SECONDS. Logs through the uvicorn logger
so cycles are visible in 'docker logs'.
- backend/main.py: spawn the task on FastAPI startup.
- docker-compose.yml: forward INGEST_PREWARM* env vars to the
backend service (default off).
Measured on Purple AI tenant (INGEMeasured on Purple AI tenant (INGEMeasured on Purple fMeasured on Purple AI tenant (INGEMeasured on Purple AI tenant (INGEMeasured on (INGEST_PREWARM=0) so non-opt-in
users see no behaviour change.
Dashboard reloads on multi-day windows could take 30-60s and sometimes
returned HTTP 502 ('internal Scalyr error') when the SDL window was
expressed in days. Two-part fix:
1. In-process async TTL cache (services/async_cache.py)
- 5 min TTL on top-sources, by-event-type, daily-volume.
- Single-flight lock per cache key (no thundering herd).
- Optional ?nocache=1 query param to force a refresh.
- New endpoints: GET /api/ingest/cache-stats, DELETE /api/ingest/cache.
2. Normalise days -> hours upstream of the PowerQuery
- SDL is unstable on day-scale windows for large group-by counts on
this tenant but stable on the equivalent hour-scale window.
- top-sources?days=1 used to 502; now works.
Measured on Purple AI tenant:
top-sources?days=7 cold 55.7s -> warm 13ms (~4300x)
t t t t t t t t t -> 4ms (cold) / 1.4ms (warm)
- s1_client: configurable PowerQuery timeout via SDL_PQ_TIMEOUT env var
(default 600s, was hardcoded 120s) with separate connect/read timeouts
via httpx.Timeout; retry on ReadTimeout via SDL_PQ_TIMEOUT_RETRIES;
better error messages include query snippet and parse non-JSON responses
- ingest: fix simulate-filter SDL syntax (== → =, drop leading | on base
expression, surface PowerQuery error field, cleaner empty-filter fallback)
- docker-compose: pass SDL_PQ_TIMEOUT and SDL_PQ_TIMEOUT_RETRIES through
to backend container with sensible defaults
Not taken from PR #2:
- .gitignore parsers/* change — would untrack the 7 committed parser files
- s1_client/quality/coverage changes already present in main from prior work
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Key changes:
- Unlabelled event banner: shows count only after Sample Events is clicked; uses broad SDL filter expression; time window synced to sync-days dropdown
- Parser Quality: new "Attributes Missing" subsection listing all parsers without dataSource.name regardless of event volume
- Coverage map: filter buttons (All / Complete Parser / Attributes Missing); stat card renamed to "Incomplete Parser"; stub count excluded from sync when no active sources
- Sync All button: runs SDL parser sync → library sync → live sources sync in sequence
- Reset now clears ActiveSource table and resets unlabelled count cache
- run_powerquery: configurable max_count param (default 1000, 50M for count queries)
- _DS_NAME_RE: supports both quoted and unquoted dataSource.name keys in parser files
- Full modern UI redesign: slate palette, gradient cards, ring borders, pill nav, colored stat accents
- Updated 7 tracked parser files synced from SDL
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fetch detection library rules from platform-rules API at startup (falls
back to extracted.json); adds Sync Detection Library button for refresh
- Parser column simplified to ✓ Parsed / ✗ Not Parsed
- Detection counts now use library rules only (exclude custom STAR rules)
- Add close-match suggestions for dataSource.name mismatches (e.g. CloudTrail
→ AWS CloudTrail, Microsoft 365 Collaboration → Microsoft O365)
- Exclude SentinelOne Ranger AD from coverage map (native S1 source)
- Add success feedback banners to Load SDL Parsers and Sync Library buttons
- Remove rule_counts.json manual override; extracted.json is source of truth
- Remove Load Detections button; rules auto-import on backend startup
- Add get_account_id() and get_platform_rules() to s1_client
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>