Commit Graph

45 Commits

Author SHA1 Message Date
marc 1c36bac9e8 Add tools/sync-upstream.sh: safe upstream-sync workflow
Wraps the recurring 'fetch upstream, rebase, verify invariants, push'
workflow into a single command with safety nets:

- creates a tag snapshot before mutating the branch
- aborts on dirty tree
- rebases by default (--merge creates a merge commit instead)
- after sync, rebuilds the backend container and verifies 5 fork-only
  invariants are still met (parser dropdown filtered, mitre_pct <= 100,
  cache endpoints present, /sample-unlabelled present, prewarmer task
  scheduled when opted in)
- exits non-zero with the recovery command if invariants regress
- optional --dry-run / --no-rebuild / --no-push for ad-hoc inspection
2026-05-22 21:36:42 +02:00
marc 7d19c57a5d Ingest Dashboard: optional background cache pre-warmer
Adds an asyncio background task that re-runs the heavy Ingest Dashboard
queries every ~4 min (just under the 5 min TTL) so the in-process cache
is always populated. First user hit on any dashboard widget then returns
from cache (single-digit ms) instead of waiting 30-60s for SDL.

Components:
  - backend/services/prewarmer.py: standalone module, opt-in via
    INGEST_PREWARM=1; configurable windows via INGEST_PREWARM_HOURS /
    INGEST_PREWARM_DAYS / INGEST_PREWARM_DAILY_VOLUME_DAYS and interval
    via INGEST_PREWARM_INTERVAL_SECONDS. Logs through the uvicorn logger
    so cycles are visible in 'docker logs'.
  - backend/main.py: spawn the task on FastAPI startup.
  - docker-compose.yml: forward INGEST_PREWARM* env vars to the
    backend service (default off).

Observed on a busy tenant with INGEST_PREWARM=1, default windows:
  top-sources?days=7 first hit after restart: ~39s -> ~8ms (cache warm).

Defaults to disabled (INGEST_PREWARM=0) so existing users see no
behaviour change.
2026-05-22 21:36:42 +02:00
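
The pre-warmer in 7d19c57a5d could look roughly like the sketch below. It is only an illustration of the opt-in background loop, not the contents of backend/services/prewarmer.py: the names prewarm_enabled, prewarm_interval, prewarm_loop and the refresh callback are hypothetical, and the 240 s fallback is an assumption taken from "every ~4 min".

```python
import asyncio
import os
from typing import Awaitable, Callable, Optional


def prewarm_enabled() -> bool:
    # Opt-in switch from the commit message; anything but "1" leaves it off.
    return os.environ.get("INGEST_PREWARM", "0") == "1"


def prewarm_interval() -> float:
    # 240 s is an assumed default, chosen to match "every ~4 min".
    return float(os.environ.get("INGEST_PREWARM_INTERVAL_SECONDS", "240"))


async def prewarm_loop(
    refresh: Callable[[], Awaitable[None]],
    interval_seconds: float,
    max_cycles: Optional[int] = None,
) -> int:
    """Call refresh() repeatedly; max_cycles only exists to keep the sketch testable."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            await refresh()  # hypothetical: re-runs the heavy dashboard queries
        except Exception as exc:  # one failed cycle must not end the loop
            print(f"prewarm cycle failed: {exc!r}")
        cycles += 1
        await asyncio.sleep(interval_seconds)
    return cycles
```

A startup hook would then schedule it only when opted in, for example with asyncio.create_task(prewarm_loop(refresh, prewarm_interval())) guarded by prewarm_enabled().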
marc bfff0eeec0 Ingest Dashboard: 5min TTL cache + days->hours normalisation
Dashboard reloads on multi-day windows could take 30-60s and sometimes
returned HTTP 502 ("internal Scalyr error") when the SDL window was
expressed in days. Two-part fix:

1. In-process async TTL cache (services/async_cache.py)
   - 5 min TTL on top-sources, by-event-type, daily-volume.
   - Single-flight lock per cache key (no thundering herd).
   - Optional ?nocache=1 query param to force a refresh.
   - New endpoints: GET /api/ingest/cache-stats, DELETE /api/ingest/cache.

2. Normalise days -> hours upstream of the PowerQuery
   - SDL is unstable on day-scale windows for large group-by counts on
     busy tenants but stable on the equivalent hour-scale window.
   - top-sources?days=1 used to 502; now works.

Observed timings on a busy tenant:
  top-sources?days=7  cold ~55s -> warm ~13ms (~4300x)
  top-sources?days=1  was 502   -> ~4ms (cold) / ~1.4ms (warm)
2026-05-22 21:36:42 +02:00
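
The single-flight TTL cache described in bfff0eeec0 can be sketched as follows. This is a minimal illustration under assumptions, not the code in services/async_cache.py: the class name AsyncTTLCache, the get_or_load method and the key format are hypothetical.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Dict, Tuple


class AsyncTTLCache:
    """In-process cache with a TTL and one lock per key (single-flight)."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._entries: Dict[str, Tuple[float, Any]] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    async def get_or_load(
        self,
        key: str,
        loader: Callable[[], Awaitable[Any]],
        nocache: bool = False,
    ) -> Any:
        # Concurrent callers for the same key queue on the same lock, so
        # only the first one runs the expensive loader.
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:
            entry = self._entries.get(key)
            fresh = entry is not None and time.monotonic() - entry[0] < self._ttl
            if fresh and not nocache:
                return entry[1]
            value = await loader()
            self._entries[key] = (time.monotonic(), value)
            return value

    def clear(self) -> None:
        # Roughly what a DELETE /api/ingest/cache handler might call.
        self._entries.clear()
```

With this shape, five simultaneous requests for the same key trigger one loader call, and a request carrying nocache=True forces a second one.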
marc 8c4298ca2a Health Score: cap MITRE Coverage at 100% by canonicalising tactics
STAR rules sometimes label tactics with non-canonical names
(observed: "Stealth", "Defense Impairment") which were counted as
distinct tactics on top of the 14 canonical ATT&CK Enterprise ones,
producing percentages > 100% (e.g. 15/14 = 107.1% on a busy tenant).

Fix in get_health_score():
  - Restrict covered_tactics to the 14 canonical ATT&CK Enterprise tactics.
  - Map known STAR aliases ("Stealth", "Defense Impairment") -> "Defense Evasion".
  - Derive TOTAL_TACTICS from the canonical set (single source of truth).

Result: tactics_covered = 14, mitre_pct = 100.0 (was 15 / 107.1).
2026-05-22 21:36:42 +02:00
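
The canonicalisation in 8c4298ca2a can be illustrated with the sketch below. The function name mitre_coverage and its return shape are hypothetical and do not mirror get_health_score(); the tactic list is the 14 ATT&CK Enterprise tactics and the alias map is taken from the commit message.

```python
from typing import Iterable, Tuple

CANONICAL_TACTICS = (
    "Reconnaissance", "Resource Development", "Initial Access", "Execution",
    "Persistence", "Privilege Escalation", "Defense Evasion",
    "Credential Access", "Discovery", "Lateral Movement", "Collection",
    "Command and Control", "Exfiltration", "Impact",
)
# Non-canonical names observed on STAR rules, mapped to a canonical tactic.
STAR_ALIASES = {
    "Stealth": "Defense Evasion",
    "Defense Impairment": "Defense Evasion",
}
# Derived from the canonical set so the denominator cannot drift.
TOTAL_TACTICS = len(CANONICAL_TACTICS)


def mitre_coverage(rule_tactics: Iterable[str]) -> Tuple[int, float]:
    """Return (tactics_covered, mitre_pct); unknown names are ignored."""
    canonical = set(CANONICAL_TACTICS)
    covered = set()
    for raw in rule_tactics:
        name = STAR_ALIASES.get(raw.strip(), raw.strip())
        if name in canonical:
            covered.add(name)
    return len(covered), round(100.0 * len(covered) / TOTAL_TACTICS, 1)
```

Because aliases collapse into existing tactics and unknown names are dropped, the numerator can never exceed the denominator.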
marc 70f3f83db3 Parser Test Runner: filter non-parser SDL artefacts from dropdown
SDL /logParsers/ also returns UEBA analytics tables, saved searches and
dashboard configs. They're not valid Test Runner inputs and pollute the
dropdown. Filter list_parser_files in two tiers:
 1) Name denylist (ueba_*, searches, *_baselines_*, *_features_*,
    *_scores_*, bsi-*, *-overview, smoke/test tables).
 2) Content scan: file must contain attributes:/patterns:/formats:/
    patternRefs:/rewrites:/parser: in first 4 KB.

Result: 97 files -> 41 real parsers, 0 false positives or false negatives.
2026-05-22 19:36:58 +02:00
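
The two-tier filter in 70f3f83db3 might be sketched like this. It is an approximation, not the body of list_parser_files: the helper name looks_like_parser is hypothetical, the "smoke/test tables" entries are left out because the commit message gives no pattern for them, and the 4 KB window is applied to characters rather than bytes.

```python
import fnmatch

# Tier 1: name denylist, using the glob patterns quoted in the commit message.
DENY_GLOBS = (
    "ueba_*", "searches", "*_baselines_*", "*_features_*",
    "*_scores_*", "bsi-*", "*-overview",
)
# Tier 2: at least one of these keys must appear near the top of the file.
PARSER_MARKERS = (
    "attributes:", "patterns:", "formats:",
    "patternRefs:", "rewrites:", "parser:",
)


def looks_like_parser(name: str, content: str) -> bool:
    """True when the file passes both the name and the content check."""
    if any(fnmatch.fnmatchcase(name, glob) for glob in DENY_GLOBS):
        return False
    head = content[:4096]  # only the first ~4 KB is scanned
    return any(marker in head for marker in PARSER_MARKERS)
```

The name check runs first because it is cheap; the content scan then catches non-parser artefacts whose names do not match any denylist pattern.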
marc 7c1687efce Sync upstream features; preserve fork KV scanner, parsers, verifier
Brought in 35 upstream commits (MITRE heatmap, health score, dependency map,
PowerQuery playground, onboarding tracker, product grouping, modern UI redesign).

Preserved fork additions:
  backend/routers/quality.py  KV scanner, pattern refs, JS keys, JSON mode,
                              /parsers + /sync-from-sdl endpoints
  parsers/                    96 OCSF + tenant parsers
  tools/stormshield-verify/   end-to-end ingest regression test
  .gitignore                  un-ignored parsers/*
  CHANGES.md, PATCHES.md
2026-05-22 18:19:52 +02:00
Mick a7ebcac9a6 Revert "Add product grouping to rule displays across coverage and threat pages"
This reverts commit 7620d1fcc8.
2026-05-22 12:08:56 -04:00
Mick b494c751aa Revert "Preserve parser_detected across syncs to prevent coverage regression"
This reverts commit 21c8644443.
2026-05-22 12:08:56 -04:00
Mick 21c8644443 Preserve parser_detected across syncs to prevent coverage regression
Before re-creating ActiveSource rows, snapshot existing parser_detected
values. When writing new rows, take max(new, previous) so a source that
was once confirmed as parsed (event.type present in the data lake) never
loses its Covered status due to a sampling gap, partial query result, or
SDL PowerQuery timeout during Sync All.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 12:07:03 -04:00
Mick 7620d1fcc8 Add product grouping to rule displays across coverage and threat pages
- Extract product label from rule data_sources in coverage.py via new
  _product_from_data_sources() helper (prefers non-SentinelOne entries
  so product-specific rules get a meaningful label)
- Coverage Map detections column: rules now grouped by product with
  collapsible chevron headers showing fired/silent counts
- Threat Coverage Rule Firing Status: collapsible product group headers
  with active/silent summary; shows all 2066 rules across 30 products
- Threat Coverage Dependency Map: collapsible product groups, at-risk
  products sorted first with risk count in header
- Ingest Dashboard: fix source name truncation — table cells now wrap
  with break-all and title tooltip; bar chart labels extended to 16
  chars with ellipsis and full-name tooltip on hover

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 11:56:27 -04:00
Mick bb2c00f2fa Collapse MITRE tactic cards by default — click to expand
Each card shows tactic name, technique count, and rule badge in the header.
Clicking the header toggles the technique chips with an animated chevron.
The existing '+N more' expander still works within the expanded card.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 11:31:23 -04:00
Mick 1a2b289f32 Rename pipeline sections to Full Pipeline / Partial Pipeline
2026-05-22 11:27:05 -04:00
Mick 800d3c545a Split onboarding pipeline into detection-mapped vs parser-only groups
Sources without detection rules no longer show stages 5-6 as failures:
- Backend: has_detection_rules flag added per source; progress (pct) calculated
  over 4 core stages for sources with no rules; detection stages marked na:true
- Frontend: pipeline splits into two sections —
    'With Detection Coverage' (6-stage, full pipeline)
    'Parser Only' (4-stage, stages 5-6 shown as — N/A)
  Each section has its own Show/Hide completed toggle
- Collapsed by default; Show Pipeline toggle reveals both sections

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 11:26:26 -04:00
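
The progress rule from 800d3c545a can be sketched as below. The stage identifiers and the function onboarding_progress are hypothetical; only the 4-core-stage / 2-detection-stage split, the has_detection_rules flag and the na marker come from the commit messages.

```python
from typing import Dict, Set

CORE_STAGES = ("data_received", "parser_file", "parser_active", "source_labeled")
DETECTION_STAGES = ("detection_rules", "rules_firing")


def onboarding_progress(done: Set[str], has_detection_rules: bool) -> Dict:
    """Compute pct over 6 stages, or over the 4 core stages when no rules exist."""
    stages = {s: {"done": s in done, "na": False} for s in CORE_STAGES}
    for s in DETECTION_STAGES:
        if has_detection_rules:
            stages[s] = {"done": s in done, "na": False}
        else:
            # Not applicable: shown as N/A instead of counting as a failure.
            stages[s] = {"done": False, "na": True}
    counted = [v for v in stages.values() if not v["na"]]
    pct = round(100 * sum(v["done"] for v in counted) / len(counted))
    return {"has_detection_rules": has_detection_rules, "pct": pct, "stages": stages}
```

A source with all four core stages complete reports 100% when it has no detection rules, and 67% when it has rules that are not yet imported or firing.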
Mick 62e29d131d Collapse onboarding pipeline table by default
Shows summary stats (Fully Onboarded / In Progress / Not Started) immediately
on page load; table is hidden until user clicks 'Show Pipeline'. Keeps the
Onboarding page scannable without scrolling past a large table to reach the
prompt template.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 11:23:52 -04:00
Mick d0299e0f23 Add health score, coverage trends, dependency map, PowerQuery playground, onboarding tracker
Tenant Health Score:
- CoverageSnapshot table stores daily health metrics (parser %, MITRE %, firing %)
- _compute_health() weighted formula: 40% parser coverage + 35% MITRE + 25% firing
  (reweighted 55/45 when firing cache empty)
- GET /api/coverage/health returns score + delta vs previous snapshot
- GET /api/coverage/snapshots returns chronological history for sparklines
- POST /api/coverage/snapshot for manual recording
- Auto-snapshot recorded at end of every sync-sources call
- Overview dashboard: prominent health score card with color coding, component
  breakdown, delta indicator, and inline SVG sparkline (last 30 points)

Rule Dependency Map:
- GET /api/coverage/dependency-map flips the coverage map — rule → required sources
- Each source flagged healthy/inactive/no_parser; at_risk = any source missing
- New section on Threat Coverage tab with at-risk filter toggle

PowerQuery Playground:
- New query.py router: GET /presets (7 curated queries) + POST /run
- New Query nav tab with time-range pills, preset buttons, localStorage history,
  monospace textarea, auto-column results table, client-side CSV export

Onboarding Tracker:
- GET /api/coverage/onboarding-status returns per-source pipeline progress
  across 6 stages: Data Received → Parser File → Parser Active → Source
  Labeled → Detection Rules → Rules Firing
- New section on Onboarding tab with emoji stage dots, progress bars,
  collapsed completed sources with show/hide toggle

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 11:09:43 -04:00
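
The weighting in d0299e0f23 can be written out as a small sketch. The function below is hypothetical and is not _compute_health() itself. The commit message does not say which component receives 55 and which 45 in the fallback, so assigning 55% to parser coverage and 45% to MITRE is an assumption.

```python
from typing import Optional


def compute_health(
    parser_pct: float,
    mitre_pct: float,
    firing_pct: Optional[float] = None,
) -> float:
    """Weighted health score on a 0-100 scale."""
    if firing_pct is None:
        # Firing cache empty. ASSUMPTION: 55% parser / 45% MITRE.
        score = 0.55 * parser_pct + 0.45 * mitre_pct
    else:
        # Normal case: 40% parser + 35% MITRE + 25% firing.
        score = 0.40 * parser_pct + 0.35 * mitre_pct + 0.25 * firing_pct
    return round(score, 1)
```

For example, parser 80%, MITRE 100% and firing 60% give 32 + 35 + 15 = 82.0; without firing data the same tenant scores 44 + 45 = 89.0 under the assumed split.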
Mick b4314c07df Update README to reflect current feature set
- Add Threat Coverage tab (MITRE heatmap + rule firing status)
- Document Sync All button, SDL Config API parser sync, SDL_CONFIG_READ_KEY
- Update Parser Coverage Map: unlabelled events banner, Attributes Missing filter,
  detections column with firing status badges
- Add Parser Quality sections: unlabelled event sampler, attributes missing audit,
  JSON/NDJSON parser test runner
- Add environment variables reference table (SDL_PQ_TIMEOUT, SDL_CONFIG_READ_KEY)
- Update architecture diagram to include SDL Config File API
- Simplify setup: Sync All replaces manual multi-step first run
- Update project layout to reflect RuleFiringCache model and current file structure
- Switch docker-compose commands to `docker compose` (v2 syntax)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:46:56 -04:00
Mick 7b4eceefb8 Fix MITRE extraction to use actual S1 API structure + use generatedAlerts for firing status
MITRE fix:
- S1 platform-rules API returns rule["mitre"] = [{tactic, techniques:[{id,title}]}]
  not the flat field names we were checking — updated _extract_mitre to handle
  this as the primary path, keeping flat field fallback for STAR rules
- generatedAlerts field on each platform rule stored in raw JSON during import

Firing status fix:
- sync-rule-firing now reads generatedAlerts from ParsedRule.raw as fast path
  (instant, no SDL PowerQuery needed) since it's returned directly by the
  platform-rules API on every library sync
- SDL PowerQuery retained as fallback for rules imported from detections.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:42:48 -04:00
Mick 7922de315e Add MITRE ATT&CK heatmap and detection rule firing status
MITRE ATT&CK heatmap:
- _extract_mitre() helper extracts tactics/techniques from S1 API rules
  handling multiple field name conventions (tactic, mitreTechniques, etc.)
- _import_from_api_rules and _import_detections now store tactics/techniques
  in raw JSON alongside data_sources
- GET /api/coverage/mitre returns tactic/technique breakdown ordered by
  ATT&CK kill chain with coverage stats
- New "Threat Coverage" tab in frontend: stat cards (total rules, MITRE
  mapped, tactics covered, techniques covered), tactic cards grid with
  left-border color coding and technique chips with "+N more" expander

Detection rule firing status:
- RuleFiringCache table tracks alert_count per rule_name
- POST /api/coverage/sync-rule-firing queries SDL PowerQuery with 3
  field-name patterns to find rule firing data; upserts into cache
- GET /api/coverage/rule-firing-cache returns cache sorted by alert count
- /map now includes alert_count per rule and firing_cache_populated flag
- Coverage map Detections column: when cache populated, shows alert count
  in green or ⚠ amber for rules that have never fired

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:25:45 -04:00
Mick 2c40bf81ee Cherry-pick improvements from PR #2 (marcredhat)
- s1_client: configurable PowerQuery timeout via SDL_PQ_TIMEOUT env var
  (default 600s, was hardcoded 120s) with separate connect/read timeouts
  via httpx.Timeout; retry on ReadTimeout via SDL_PQ_TIMEOUT_RETRIES;
  error messages now include a query snippet, and non-JSON responses are parsed
- ingest: fix simulate-filter SDL syntax (== → =, drop leading | on base
  expression, surface PowerQuery error field, cleaner empty-filter fallback)
- docker-compose: pass SDL_PQ_TIMEOUT and SDL_PQ_TIMEOUT_RETRIES through
  to backend container with sensible defaults

Not taken from PR #2:
- .gitignore parsers/* change — would untrack the 7 committed parser files
- s1_client/quality/coverage changes already present in main from prior work

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:11:42 -04:00
Mick c5a4f796a0 Add unlabelled event detection, stub parser quality, Sync All, and modern UI redesign
Key changes:
- Unlabelled event banner: shows count only after Sample Events is clicked; uses broad SDL filter expression; time window synced to sync-days dropdown
- Parser Quality: new "Attributes Missing" subsection listing all parsers without dataSource.name regardless of event volume
- Coverage map: filter buttons (All / Complete Parser / Attributes Missing); stat card renamed to "Incomplete Parser"; stub count excluded from sync when no active sources
- Sync All button: runs SDL parser sync → library sync → live sources sync in sequence
- Reset now clears ActiveSource table and resets unlabelled count cache
- run_powerquery: configurable max_count param (default 1000, 50M for count queries)
- _DS_NAME_RE: supports both quoted and unquoted dataSource.name keys in parser files
- Full modern UI redesign: slate palette, gradient cards, ring borders, pill nav, colored stat accents
- Updated 7 tracked parser files synced from SDL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-22 10:00:21 -04:00
Mick 0013adbe7e Merge pull request #1 from marcredhat/fix/json-parser-and-pq-syntax
Fix Parser Test Runner JSON mode, Filter Simulator PQ syntax, and parser dropdown
2026-05-20 15:25:39 -04:00
Mick 6cd9da82da Auto-load detection library from S1 API, improve coverage map accuracy
- Fetch detection library rules from platform-rules API at startup (falls
  back to extracted.json); adds Sync Detection Library button for refresh
- Parser column simplified to ✓ Parsed / ✗ Not Parsed
- Detection counts now use library rules only (exclude custom STAR rules)
- Add close-match suggestions for dataSource.name mismatches (e.g. CloudTrail
  → AWS CloudTrail, Microsoft 365 Collaboration → Microsoft O365)
- Exclude SentinelOne Ranger AD from coverage map (native S1 source)
- Add success feedback banners to Load SDL Parsers and Sync Library buttons
- Remove rule_counts.json manual override; extracted.json is source of truth
- Remove Load Detections button; rules auto-import on backend startup
- Add get_account_id() and get_platform_rules() to s1_client

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 15:14:10 -04:00
marc d8d62478c0 Add helper scripts: SDL parser sync, PQ probes, test-parser smoke tests
2026-05-20 19:41:00 +02:00
marc 8dbd38f3bb Fix Parser Test Runner JSON mode, Filter Simulator PQ syntax, dropdown source
- backend/routers/quality.py
 * Add GET /api/quality/parsers (lists actual files in /app/parsers)
 * Support SDL JSON auto-extract parsers ($=json{parse=json}$)
 * Apply parser rewrite blocks with correct $0/$N backref translation
 * Accept single JSON / JSON array / NDJSON in test-parser body
 * Flatten JSON inside 'message' for Field Population coverage
- backend/routers/ingest.py
 * Rewrite simulate-filter PowerQuery to valid SDL syntax
 * Correct field name: src.name -> dataSource.name
- frontend/index.html
 * Parser dropdown loads from /api/quality/parsers
 * Add 'Last 7d' lookback option
 * Render JSON-mode test results with badges + payload counter
2026-05-20 19:40:24 +02:00
Mick 6e137438b1 Add Detection Fields Missing column + STAR_LIBRARY_ONLY setting
Coverage Map:
- New "Detection Fields Missing" column shows dotted-path SDL fields that
  associated STAR rules reference but the parser does not provide
- Only dotted field paths (src.ip, winEventLog.channel) are considered;
  single-word correlation variables and metadata tokens are excluded
- Schema fields always present in events (dataSource.name, event.type etc)
  are excluded from the missing list

Settings:
- New STAR_LIBRARY_ONLY field (select: true/false) controls whether
  Load Library STAR Rules filters to @sentinelone.com creators or loads all
- Rendered as a dropdown in the Settings form with a hint description
- saveSettings now always persists select field values (not just non-empty)
- load-star-rules reads STAR_LIBRARY_ONLY env var as its default

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:46:05 -04:00
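
The "Detection Fields Missing" rule in 6e137438b1 amounts to a filtered set difference. The sketch below assumes the rule and parser field names are already available as plain strings; the function name and the exact contents of ALWAYS_PRESENT beyond the two fields named in the commit message are hypothetical.

```python
from typing import Iterable, List

# Schema fields the commit message names as always present in events.
ALWAYS_PRESENT = {"dataSource.name", "event.type"}


def detection_fields_missing(
    rule_fields: Iterable[str],
    parser_fields: Iterable[str],
) -> List[str]:
    """Dotted fields referenced by rules that the parser does not provide."""
    provided = set(parser_fields)
    missing = {
        field
        for field in rule_fields
        if "." in field  # single-word variables and tokens are skipped
        and field not in ALWAYS_PRESENT
        and field not in provided
    }
    return sorted(missing)
```

Sorting the result keeps the column stable between page loads.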
Mick a50fd35934 Filter STAR rules to Library only (creator @sentinelone.com)
load-star-rules now defaults to library_only=true, filtering rules where
the creator email ends in @sentinelone.com. Custom tenant rules are excluded
by default. Pass ?library_only=false to load all rules.
Button label updated to "Load Library STAR Rules" to make intent clear.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:42:09 -04:00
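
The library_only behaviour in a50fd35934 can be sketched as a simple filter. The "creator" key and the function name are assumptions about the rule payload, and the case-insensitive comparison is a choice made for this sketch rather than something the commit message states.

```python
from typing import Dict, Iterable, List

LIBRARY_DOMAIN = "@sentinelone.com"


def filter_library_rules(rules: Iterable[Dict], library_only: bool = True) -> List[Dict]:
    """Keep only rules whose creator email ends in the library domain."""
    if not library_only:
        return list(rules)
    return [
        rule
        for rule in rules
        if str(rule.get("creator", "")).lower().endswith(LIBRARY_DOMAIN)
    ]
```

Passing library_only=False corresponds to calling the endpoint with ?library_only=false and returns every rule unchanged.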
Mick 4d6125eb4d Add Default Parser Only and No Parser filters to Coverage Map
Filters are now: All | Custom Parser | Default Parser Only | No Parser

- Custom Parser: covered sources with a loaded SDL parser file
- Default Parser Only: covered via event.type detection in data lake
  but no custom parser file — built-in or cloud-managed parser running
- No Parser: parser_needed sources (no parser found at all)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 15:35:30 -04:00
Mick 1a68fbea2d Rewrite README in the Queen's English, inspired by Pineapple Boy
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:28:15 -04:00
Mick 3f80e4c344 Add README with full feature documentation
Covers setup, architecture, all five pages (Coverage Map, Ingest Dashboard,
Parser Quality, Onboarding, Settings), expected results for each tool,
rebuild commands, and project layout.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:25:28 -04:00
Mick 74c3a8d6a3 Auto-discover fields from log sample when source selected in Field Population Rate
Selecting a source triggers a 20-event sample; actual field names from the
log are merged with SDL schema defaults (log fields first) and pre-filled
into the fields input. Falls back to SDL defaults if no events found.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:23:36 -04:00
Mick 1aca7154c2 Default Live Event Sampler to 10 events
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:21:51 -04:00
Mick 799e413041 Add per-row copy button to Live Event Sampler message column
Message column is pinned last, shows 80 chars with tooltip for the full
value, and has a ⎘ copy button that flashes ✓ on success. Other field
cells are unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:18:44 -04:00
Mick 5421b2de61 Populate source dropdowns in Parser Quality from synced active sources
Live Event Sampler and Field Population Rate now load sources from the
coverage map on page render instead of free-text inputs. Sources are sorted
by event count (busiest first) and show event totals. Falls back to a hint
message if no sources have been synced yet.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:16:50 -04:00
Mick 1b07a59991 Use parsed event detection in data lake as coverage signal
- sync-sources now runs a parallel PowerQuery checking for event.type
  population per source; count stored in new active_sources.parser_detected
- Coverage map marks a source as covered if parser_detected > 0, even
  without a matching local parser file (handles built-in/cloud parsers)
- UI parser cell shows "Parsed (N typed events detected)" for data-lake-
  detected parsers vs named local parser files
- Runtime ALTER TABLE migration adds parser_detected column to existing DBs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 13:06:29 -04:00
Mick 81e3656c46 Fix coverage map matching: three-tier lookup for parser-to-source mapping
1. Exact dataSource.name match
2. Normalized substring on parser's dataSource.name attribute
3. Normalized substring on parser filename (catches files with wrong ds name)

Fixes CloudTrail (filename aws_cloudtrail-latest matches "cloudtrail") and
Palo Alto Networks Firewall (ds name "Palo Alto Networks" matches via substring).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:56:51 -04:00
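
The three-tier lookup in 81e3656c46 might look like the sketch below. The parser record shape ({"filename", "ds_name"}), the normalisation rule and the bidirectional substring test are assumptions; the commit message only fixes the order of the tiers and the two examples.

```python
import re
from typing import Dict, List, Optional, Tuple


def _norm(text: str) -> str:
    # Lower-case and drop everything except letters and digits.
    return re.sub(r"[^a-z0-9]", "", text.lower())


def _overlaps(a: str, b: str) -> bool:
    return bool(a) and bool(b) and (a in b or b in a)


def match_parser(source_name: str, parsers: List[Dict[str, str]]) -> Optional[Tuple[str, str]]:
    """Return (filename, tier) for the first parser that matches, else None."""
    # Tier 1: exact dataSource.name match.
    for parser in parsers:
        if parser.get("ds_name") == source_name:
            return parser["filename"], "exact"
    src = _norm(source_name)
    # Tier 2: normalised substring on the parser's dataSource.name attribute.
    for parser in parsers:
        if _overlaps(_norm(parser.get("ds_name", "")), src):
            return parser["filename"], "ds_substring"
    # Tier 3: normalised substring on the filename (wrong or missing ds name).
    for parser in parsers:
        if _overlaps(_norm(parser["filename"]), src):
            return parser["filename"], "filename_substring"
    return None
```

Running all parsers through tier 1 before trying tier 2 keeps a loose substring hit from shadowing an exact match further down the list.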
Mick 999c0f7b83 Add Parser Quality page: Live Event Sampler, Field Population Rate, Parser Test Runner
- New /api/quality router with three endpoints:
  sample-events: pull raw events from a source via PowerQuery
  field-population: measure % of events with each SDL field populated;
    surfaces dataSource.name correctly (100% when filtered by it) and
    returns fields_seen_in_sample so you can see what IS being extracted
  test-parser: converts SDL $field=pattern$ format strings to Python
    named-group regex and tests against a pasted raw log line
- New "Parser Quality" nav item and page with all three tools
- Home page card added for Parser Quality
- Field population UI shows per-field colour-coded progress bars plus
  a chip list of fields actually present in the sample

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:53:48 -04:00
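
The format-string conversion used by test-parser in 999c0f7b83 can be illustrated as follows. This sketch handles only the simple $field=pattern$ form with identifier-style field names; the function name is hypothetical, and real SDL format strings have more features than this covers.

```python
import re

# Matches $name=pattern$ where name is an identifier and pattern has no "$".
_FIELD = re.compile(r"\$(\w+)=([^$]+)\$")


def sdl_format_to_regex(fmt: str) -> "re.Pattern[str]":
    """Turn each $field=pattern$ into (?P<field>pattern); escape the rest."""
    parts = []
    pos = 0
    for match in _FIELD.finditer(fmt):
        parts.append(re.escape(fmt[pos:match.start()]))  # literal text
        parts.append(f"(?P<{match.group(1)}>{match.group(2)})")
        pos = match.end()
    parts.append(re.escape(fmt[pos:]))
    return re.compile("".join(parts))
```

Matching a pasted log line then yields the extracted fields through groupdict(), for example {"user": "alice", "ip": "10.0.0.1"} for a format of "$user=\w+$ logged in from $ip=[0-9.]+$".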
Mick 058b1e7cf1 Default Ingest Dashboard to 1h view on load
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:46:30 -04:00
Mick a5d0be0a7c Show events-by-source bar chart in 1h mode instead of blank message
When the 1h time filter is active the volume chart now renders the top-sources
data as a by-source bar chart (up to 12 sources) with the gradient fill and a
"Events by Source (Last 1h)" heading. Chart labels are auto-detected as dates
or source names so truncation is applied correctly for both modes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:45:55 -04:00
Mick ac97196435 Improve coverage map matching, bar chart gradients, and add 1h time filter
- Coverage map: replace filename fuzzy-match with exact dataSource.name
  lookup read directly from parser file attributes; grok/dottedJson parsers
  now flagged as "parser_needed" with format type shown in the UI
- Bar chart: SVG linearGradient (light purple → deep violet) replaces flat fill
- Ingest dashboard: add 1h button (first option) backed by new backend
  hours= query param on /api/ingest/top-sources; daily-volume chart shows
  informational message when in 1h mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:43:10 -04:00
Mick f0bd56aee8 Rewrite coverage map as source-centric view
Previously showed field-level coverage (rule fields vs parser fields).
Now shows per-dataSource.name coverage: is a parser loaded for each
active ingest source?

- New ActiveSource DB model stores live sources from SDL
- New POST /api/coverage/sync-sources endpoint runs PowerQuery to fetch
  current dataSource.names and their event counts, stores in DB
- GET /api/coverage/map now returns per-source status:
    covered       = a loaded parser matches this source name
    parser_needed = source is ingesting but no parser is loaded
- Parser matching uses fuzzy substring (handles "palo"→"Palo Alto Networks Firewall")
- Coverage table shows: source name, 7d event count, status, matched parser + field count, STAR rules
- Frontend: new "Sync Live Sources" button, updated stats cards, updated filter tabs
- Removed field-level view (was confusing — parser_needed on a field ≠ missing parser for a source)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 12:31:48 -04:00
Mick 2262892859 Improve daily volume bar chart readability
- Add event count label on top of each bar (e.g. 220 or 1.2k)
- Add Y-axis grid lines and tick labels so scale is readable
- Label shows MM/DD date format for compact display
- Chart heading now reads "events ingested per day" to clarify
  these are individual daily counts, not cumulative totals

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:59:12 -04:00
Mick 08c7a8a5b5 Add Filter Simulator help panel on Ingest Dashboard
Adds a collapsible "How does this work?" panel explaining:
- What the simulator does (live PowerQuery count → GB projection)
- When to use it (after spotting a noisy source in Top Sources)
- How to fill in Source name (copy from dataSource.name column)
- What Event type does (optional narrowing)
- How the GB estimate is calculated
- Warning that it is read-only — no filters are applied automatically

Also updates Source name placeholder to show a concrete example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:56:52 -04:00
Mick 735e364b71 Fix Ingest Dashboard timeout causing failed to fetch
- daily-volume: run per-day PowerQueries in parallel with asyncio.gather
  instead of sequentially with sleeps — 3 days now completes in ~16s vs 140s+
- Default view changed from 7d to 3d; day buttons updated to [3, 5, 7]
- igLoad: fire daily-volume and top-sources simultaneously with Promise.allSettled
  so both panels load in parallel rather than one after the other
- Each panel shows "Querying data lake…" spinner while loading
- Each panel renders independently — one failure doesn't block the other

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:53:37 -04:00
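
The parallel daily-volume fetch in 735e364b71 can be sketched with asyncio.gather. The count_for_day callback stands in for the per-day PowerQuery and is hypothetical, as is the choice to key the result by day offset.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def daily_volume(
    days: int,
    count_for_day: Callable[[int], Awaitable[int]],
) -> Dict[int, int]:
    """Run one query per day concurrently and map day offset -> event count."""
    offsets = list(range(days))
    # gather() starts every per-day query at once and preserves input order,
    # so total latency is close to the slowest single query.
    counts = await asyncio.gather(*(count_for_day(offset) for offset in offsets))
    return dict(zip(offsets, counts))
```

Compared with awaiting each day in turn, the wall-clock time no longer grows linearly with the number of days, which is consistent with the drop from 140s+ to ~16s reported above.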
Mick 2e55e21a77 Add Settings page with .env manager
- Sidebar: ⚙ Settings link pinned to bottom of nav
- Settings page: view all config keys (secrets masked), edit and save directly to .env
- Show/hide toggle for secret fields (tokens, keys)
- First-time setup banner with cp .env.example .env instructions when .env is missing
- Manual setup section with step-by-step terminal commands and where to find each credential
- New .env.example template with comments for all required variables
- Backend: GET/POST /api/settings/config router reads/writes mounted .env file
- docker-compose: mounts .env into backend container at /app/.env for write access

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:43:41 -04:00
Mick c182d837ee Initial commit: SIEM Toolkit for SentinelOne
Dockerized SecOps toolkit with:
- Coverage Map: STAR rule vs SDL parser field coverage analysis
- Ingest Dashboard: PowerQuery-powered event volume and source breakdown
- Onboarding Assistant: AI-guided log source onboarding with Claude
- Parser management via SDL MCP integration

Stack: FastAPI + PostgreSQL backend, nginx-served HTML frontend, Docker Compose.
PowerQuery runs via Scalyr XDR API (SDL_XDR_URL + SDL_LOG_READ_KEY).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-19 11:39:26 -04:00