mirror of
https://github.com/marcredhat/SIEM-toolkit-patched
synced 2026-06-10 21:31:19 +00:00
Sync upstream features; preserve fork KV scanner, parsers, verifier
Brought in 35 upstream commits (MITRE heatmap, health score, dependency map,
PowerQuery playground, onboarding tracker, product grouping, modern UI redesign).
Preserved fork additions:
- backend/routers/quality.py: KV scanner, pattern refs, JS keys, JSON mode, /parsers + /sync-from-sdl endpoints
- parsers/: 96 OCSF + tenant parsers
- tools/stormshield-verify/: end-to-end ingest regression test
- .gitignore: un-ignored parsers/*
- CHANGES.md, PATCHES.md
@@ -0,0 +1,104 @@
# Changes vs upstream `mickbrowns1/SIEM-Toolkit`

All edits are confined to a handful of files; everything else is untouched.

## `backend/services/s1_client.py`

### PowerQuery client

- All raised exceptions now include the request body / status / query so the
  UI never shows a blank `"PowerQuery error: "`.
- Non-JSON responses (HTML 5xx gateway pages) surface as a readable error
  string instead of crashing on `resp.json()`.

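
The two PowerQuery client bullets above can be illustrated with a minimal sketch. It is not the code in `s1_client.py`; the names `PowerQueryError` and `decode_pq_response` are hypothetical, and the function takes the status and body as plain values so it runs without a network call.

```python
import json


class PowerQueryError(RuntimeError):
    """Hypothetical exception type; the real client may use another class."""


def decode_pq_response(status: int, text: str, query: str) -> object:
    """Return the decoded JSON body, or raise an error that is never blank."""
    try:
        payload = json.loads(text)
    except ValueError:
        # HTML gateway pages (e.g. 502/504) end up here instead of crashing
        # the caller with an unhandled JSON decoding error.
        snippet = " ".join(text.split())[:200] or "<empty body>"
        raise PowerQueryError(
            f"PowerQuery error: HTTP {status}, non-JSON response: {snippet} "
            f"(query: {query})"
        ) from None
    if status >= 400:
        # JSON error bodies still carry status, body, and query in the message.
        raise PowerQueryError(
            f"PowerQuery error: HTTP {status}, body: {payload} (query: {query})"
        )
    return payload
```

With this shape, a gateway page such as `<html><body>Bad Gateway</body></html>` produces a message containing the status code, a trimmed snippet of the page, and the query, rather than an empty string after the prefix.
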
### Detection library: site-scope fallback (`get_platform_rules`)

- Upstream hardcoded **account scope**, which returns 403 for site-scoped API
  tokens. Added `get_scope_for_platform_rules()`, which probes `/accounts`
  first, then `/sites`, and returns whichever scope the token can access.
- `get_account_id()` now also reads `accountId` from the `/sites` payload as
  a fallback for site-scoped tokens.

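
The probe order described above (account scope first, then site scope) can be sketched as follows. This is an illustration under assumptions, not the body of `get_scope_for_platform_rules()`: `pick_scope` and the `probe` callable are hypothetical, and the probe stands in for the real HTTP call so the example runs offline.

```python
from typing import Callable, Optional, Tuple

# A probe takes an endpoint path and returns (http_status, first_id_or_None).
Probe = Callable[[str], Tuple[int, Optional[str]]]


def pick_scope(probe: Probe) -> Tuple[str, str]:
    """Try account scope first, then site scope; return (scope_name, scope_id)."""
    for scope, path in (("account", "/accounts"), ("site", "/sites")):
        status, scope_id = probe(path)
        if status == 200 and scope_id:
            return scope, scope_id
    raise PermissionError("token can access neither /accounts nor /sites")


# Simulate a site-scoped token: /accounts is forbidden, /sites works.
def site_token_probe(path: str) -> Tuple[int, Optional[str]]:
    return (403, None) if path == "/accounts" else (200, "site-123")


print(pick_scope(site_token_probe))  # ('site', 'site-123')
```

Keeping the probe injectable makes the fallback easy to test against both token types without a live console.
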
### SDL parser sync helpers

- `list_sdl_parsers()`: rewritten to use the real **SDL Configuration File
  API** (`POST /api/listFiles` with `pathPrefix=/logParsers/`). Previously
  it requested a path that returns 404 on the management console.
- `get_sdl_parser()`: rewritten to `POST /api/getFile` with `{path}`.
- New `_sdl_config_headers()` helper that uses `SDL_CONFIG_READ_KEY` (a
  separate scope from `SDL_LOG_READ_KEY`).

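
As a rough sketch of how the three helpers above fit together, the following builds the request pieces without sending anything. The endpoint paths, `pathPrefix`, and `path` come from the bullets; the `Bearer` authorization header, sending the parameters as a JSON body, and all function names here are assumptions made for illustration.

```python
import os
from typing import Mapping, Tuple


def sdl_config_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Headers for configuration-file calls (Bearer auth is an assumption)."""
    key = env.get("SDL_CONFIG_READ_KEY", "")
    if not key:
        # Fail early: the log-read key is a different scope and will not work.
        raise RuntimeError("SDL_CONFIG_READ_KEY is not set")
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}


def list_parsers_request() -> Tuple[str, dict]:
    """(path, JSON body) for listing every file under /logParsers/."""
    return "/api/listFiles", {"pathPrefix": "/logParsers/"}


def get_parser_request(path: str) -> Tuple[str, dict]:
    """(path, JSON body) for fetching one parser file by its full path."""
    return "/api/getFile", {"path": path}
```

A caller would POST each `(path, body)` pair to the tenant's base URL with these headers; the base URL is deliberately left out because it is tenant-specific.
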
## `backend/routers/ingest.py`

- `/api/ingest/simulate-filter`:
  * Rebuilt the query into valid SDL syntax. It previously generated
    `| group events=count()` (a dangling pipe) for empty bodies; it now uses a
    proper base expression and falls back to a `dataSource.name!=''` baseline.
  * Field name corrected from `src.name` to `dataSource.name`.
  * Surfaces both `result["error"]` and the exception text, so blank
    `"PowerQuery error: "` messages are gone.

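
The dangling-pipe fix above comes down to always having a base expression before the `| group` stage. A minimal sketch, assuming a hypothetical helper named `build_simulate_query` (the actual router code may be structured differently):

```python
BASELINE = "dataSource.name!=''"


def build_simulate_query(filter_expr: str = "") -> str:
    """Build a PowerQuery string that never starts with a bare pipe."""
    # Empty or whitespace-only filters fall back to the baseline expression.
    base = (filter_expr or "").strip() or BASELINE
    return f"{base} | group events=count()"


print(build_simulate_query(""))
# dataSource.name!='' | group events=count()
print(build_simulate_query("dataSource.name='Stormshield'"))
# dataSource.name='Stormshield' | group events=count()
```
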
## `backend/routers/quality.py`

- `GET /api/quality/parsers`: lists actual parser filenames in
  `/app/parsers/` (drives the Test Runner dropdown).
- **New `POST /api/quality/sync-from-sdl`**: downloads every parser file
  under `/logParsers/` on the SDL tenant into `/app/parsers/`. After this
  call returns, the Parser Test Runner dropdown automatically reflects all
  tenant parsers (including custom OCSF parsers like
  `Avelios-Medical-OCSF`). Requires `SDL_CONFIG_READ_KEY` in `.env`.
- `_flatten_event`: when a PowerQuery row only carries a JSON-stringified
  payload in `message` (i.e. the parser isn't applied at query time), parse
  and flatten that JSON inline so the Field Population tool can measure real
  coverage.
- `POST /api/quality/test-parser`:
  * Detects SDL JSON-mode parsers (`$=json{parse=json}$`) and parses log
    lines as JSON.
  * Applies parser `rewrites: [{input,output,match,replace}]` blocks with
    correct `$0/$N` backreference translation (`$0` was being mangled to a
    null byte).
  * Accepts single JSON object, JSON array, or NDJSON multi-line input.
  * Returns mode badge data + per-payload counters for the UI.

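
The `_flatten_event` behaviour described above can be sketched roughly as follows. This is an assumption-laden illustration, not the patched function: the name `flatten_event`, the dotted-key convention, and the rule that existing columns win over payload keys are all choices made for the example.

```python
import json


def _flatten(obj: dict, prefix: str = "") -> dict:
    """Turn nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(_flatten(value, f"{name}."))
        else:
            flat[name] = value
    return flat


def flatten_event(row: dict) -> dict:
    """Expand a JSON-stringified `message` into dotted top-level fields."""
    flat = _flatten({k: v for k, v in row.items() if k != "message"})
    message = row.get("message")
    if isinstance(message, str) and message.lstrip().startswith("{"):
        try:
            parsed = json.loads(message)
        except ValueError:
            parsed = None
        if isinstance(parsed, dict):
            # Columns already present on the row win over payload keys.
            for key, value in _flatten(parsed).items():
                flat.setdefault(key, value)
            return flat
    if message is not None:
        # Not JSON: keep the raw message untouched.
        flat["message"] = message
    return flat
```

For a row whose only content is `message='{"src": {"ip": "10.0.0.1"}}'`, this yields a `src.ip` field that a field-coverage count can then see.
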
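
The `$0/$N` backreference fix in `test-parser` is worth a short illustration. In Python's `re.sub` replacement templates, `\0` is read as an octal escape for the NUL character, so naively rewriting `$0` to `\0` is one plausible way to end up with a null byte. The sketch below translates to the unambiguous `\g<N>` form instead; `translate_replacement` and `apply_rewrite` are hypothetical names, not the functions in `quality.py`.

```python
import re

_BACKREF = re.compile(r"\$(\d+)")


def translate_replacement(sdl_replace: str) -> str:
    """Convert `$0`/`$N` references into Python's `\\g<N>` template form."""
    # Escape literal backslashes first so they survive template parsing.
    escaped = sdl_replace.replace("\\", "\\\\")
    return _BACKREF.sub(lambda m: rf"\g<{m.group(1)}>", escaped)


def apply_rewrite(value: str, match: str, replace: str) -> str:
    """Apply one rewrite rule (`match` regex, `$N`-style `replace`) to a value."""
    return re.sub(match, translate_replacement(replace), value)


print(apply_rewrite("user=alice", r"user=(\w+)", "[$0] name=$1"))
# [user=alice] name=alice
```

`\g<0>` refers to the whole match, which mirrors what `$0` means in the rewrite rule.
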
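
The three input shapes that `test-parser` accepts for JSON payloads (single object, array, NDJSON) can be normalised into one list. A minimal sketch, assuming a hypothetical helper `split_payloads`; it covers JSON input only and says nothing about how raw log lines for regex-format parsers are handled.

```python
import json


def split_payloads(text: str) -> list:
    """Return a list of payloads from a JSON object, JSON array, or NDJSON."""
    text = text.strip()
    if not text:
        return []
    try:
        doc = json.loads(text)
    except ValueError:
        # Not one JSON document: treat each non-empty line as its own payload.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # A single object becomes a one-element list; an array is used as-is.
    return doc if isinstance(doc, list) else [doc]
```

Trying the whole text as one document first means a pretty-printed, multi-line single object is not mistaken for NDJSON. The length of the returned list is what a per-payload counter would report.
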
## `frontend/index.html`

- Parser Test Runner dropdown now loads from `/api/quality/parsers` instead
  of filtering the coverage map (which only has `detected in data`
  placeholders).
- Field Population and Sample Events: added **Last 7d** lookback option.
- Parser Test Runner UI: mode badge (`JSON auto-extract` vs `regex format`),
  payload counter for multi-line input, separate tables for extracted vs
  derived/rewritten fields.

## `docker-compose.yml`

- Pass `SDL_CONFIG_READ_KEY` through to the backend container.

## `.env.example` / `.gitignore`

- Document the new `SDL_CONFIG_READ_KEY` variable.
- Broaden `.gitignore` so `parsers/*` (tenant-specific synced content) is
  not committed.

## New helper scripts (`tools/`)

- `sync_sdl_parsers.py`: pull all `/logParsers/*` from the tenant.
- `probe_pq_syntax.py`: probe which PowerQuery syntaxes the tenant accepts.
- `probe_avelios{,_wide,_fields}.py`: inspect a source's event presence,
  columns, and embedded JSON fields.
- `test_avelios_parser.py`, `test_avelios_multi.py`: smoke-test the patched
  `/api/quality/test-parser` endpoint with single-line and multi-line input.
- `probe_simulate_filter.py`: smoke-test the patched
  `/api/ingest/simulate-filter` endpoint with progressively larger windows.
- `probe_sync_from_sdl.py`: call `/api/quality/sync-from-sdl` and verify
  that `/api/quality/parsers` then reflects the downloaded parsers.
- `sdl_config.example.json`: template config (the toolkit's `.env` is
  separate from the SDL config used by these helper scripts).

## New `.env` knobs

```bash
# PowerQuery transport tuning (both optional; defaults work for most tenants)
SDL_PQ_TIMEOUT=600          # PowerQuery read timeout in seconds (default 600)
SDL_PQ_TIMEOUT_RETRIES=1    # extra retries on ReadTimeout (default 1)

# Required for /api/quality/sync-from-sdl
SDL_CONFIG_READ_KEY=...     # Data Lake API key with Configuration Read scope
```
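
For readers wondering how the two transport knobs might be consumed, here is a small sketch. It is an assumption about usage, not the client's actual code: the function names are hypothetical, and the built-in `TimeoutError` stands in for whatever read-timeout exception the HTTP library raises.

```python
import os
from typing import Callable, Mapping, Tuple


def pq_transport_settings(env: Mapping[str, str] = os.environ) -> Tuple[float, int]:
    """Return (timeout_seconds, extra_retries) using the documented defaults."""
    # `or` also covers variables that are set but empty.
    timeout = float(env.get("SDL_PQ_TIMEOUT") or "600")
    retries = int(env.get("SDL_PQ_TIMEOUT_RETRIES") or "1")
    return timeout, max(retries, 0)


def call_with_retries(send: Callable[[float], object],
                      env: Mapping[str, str] = os.environ) -> object:
    """Call `send(timeout)`, retrying only when it raises TimeoutError."""
    timeout, retries = pq_transport_settings(env)
    for attempt in range(retries + 1):
        try:
            return send(timeout)
        except TimeoutError:
            if attempt == retries:
                raise  # Out of retries: let the caller see the timeout.
```

With the defaults, a query gets one initial attempt plus one retry, each with a 600-second read timeout.
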