mirror of
https://github.com/marcredhat/SIEM-toolkit-patched
synced 2026-06-08 20:37:12 +00:00
7c1687efce
Brought in 35 upstream commits (MITRE heatmap, health score, dependency map,
PowerQuery playground, onboarding tracker, product grouping, modern UI redesign).
Preserved fork additions:
backend/routers/quality.py KV scanner, pattern refs, JS keys, JSON mode,
/parsers + /sync-from-sdl endpoints
parsers/ 96 OCSF + tenant parsers
tools/stormshield-verify/ end-to-end ingest regression test
.gitignore un-ignored parsers/*
CHANGES.md, PATCHES.md
# Changes vs upstream `mickbrowns1/SIEM-Toolkit`

All edits are confined to a handful of files; everything else is untouched.

## `backend/services/s1_client.py`

### PowerQuery client

- All raised exceptions now include the request body / status / query so the
  UI never shows a blank `"PowerQuery error: "`.
- Non-JSON responses (HTML 5xx gateway pages) surface as a readable error
  string instead of crashing on `resp.json()`.
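
The two points above can be illustrated with a minimal sketch. It is not the code in `s1_client.py`: the `PowerQueryError` class and the `parse_pq_response` helper are hypothetical names, and the 200-character body snippet is an arbitrary choice made for this example.

```python
import json


class PowerQueryError(RuntimeError):
    """Hypothetical exception type; the toolkit's real class may differ."""


def parse_pq_response(status: int, text: str, query: str) -> dict:
    """Decode a PowerQuery HTTP response or raise an error that carries context."""
    snippet = text.strip()[:200] or "<empty body>"
    try:
        payload = json.loads(text)
    except ValueError:
        # Gateway pages (for example an HTML 502) are not JSON; report them readably.
        raise PowerQueryError(
            f"PowerQuery error: non-JSON response (HTTP {status}) "
            f"for query {query!r}: {snippet}"
        ) from None
    if status >= 400 or not isinstance(payload, dict):
        # Always include status, query and body so the message is never blank.
        raise PowerQueryError(
            f"PowerQuery error: HTTP {status} for query {query!r}: {snippet}"
        )
    return payload
```

Decoding from the raw text rather than calling `resp.json()` directly keeps the body available for the error message when decoding fails.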

### Detection library: site-scope fallback (`get_platform_rules`)

- Upstream hardcoded **account scope**, which returns 403 for site-scoped API
  tokens. Added `get_scope_for_platform_rules()`, which probes `/accounts`
  first, then `/sites`, and returns whichever scope the token can access.
- `get_account_id()` now also reads `accountId` from the `/sites` payload as
  a fallback for site-scoped tokens.
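
The probe order described above could look roughly like the sketch below. It is an illustration only: `pick_scope`, the injected `fetch` callable, the `accountIds` / `siteIds` parameter names, and the assumed payload shapes (`data` as a list, or `data.sites` as a list) are assumptions made for this example, not the actual `get_scope_for_platform_rules()` implementation.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport: fetch(path) returns (status_code, parsed_json_dict).
Fetch = Callable[[str], Tuple[int, dict]]

# Probe order from the change note: account scope first, then site scope.
PROBES = (("/accounts", "accountIds"), ("/sites", "siteIds"))


def pick_scope(fetch: Fetch) -> Optional[Tuple[str, str]]:
    """Return (query_param, id) for the first scope the token can read."""
    for path, param in PROBES:
        status, payload = fetch(path)
        if status != 200:
            continue  # e.g. 403 when a site-scoped token asks for /accounts
        data = payload.get("data") or []
        if isinstance(data, dict):
            # Assumed shape for /sites: {"data": {"sites": [...]}}
            data = data.get("sites") or []
        if data:
            return param, str(data[0]["id"])
    return None  # neither endpoint was readable with this token
```

Injecting `fetch` keeps the sketch testable without a live console; a real client would pass a function that performs the authenticated GET.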

### SDL parser sync helpers

- `list_sdl_parsers()`: rewritten to use the real **SDL Configuration File
  API** (`POST /api/listFiles` with `pathPrefix=/logParsers/`). Previously
  it hit a path that returns 404 on the management console.
- `get_sdl_parser()`: rewritten to `POST /api/getFile` with `{path}`.
- New `_sdl_config_headers()` helper that uses `SDL_CONFIG_READ_KEY` (a
  separate scope from `SDL_LOG_READ_KEY`).
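
As a rough picture of the request shapes involved, the sketch below only builds request descriptions and sends nothing. The helper names, the `Bearer` authorization scheme, and the placement of `pathPrefix` / `path` in a JSON body are assumptions made for illustration; only the two endpoint paths and the `SDL_CONFIG_READ_KEY` variable come from the notes above.

```python
import os
from typing import Mapping


def sdl_config_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build headers from the configuration-read key (not the log-read key)."""
    key = env.get("SDL_CONFIG_READ_KEY", "")
    if not key:
        raise RuntimeError("SDL_CONFIG_READ_KEY is not set")
    # Assumption: the key is sent as a Bearer token.
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}


def list_parsers_request(base_url: str, env: Mapping[str, str] = os.environ) -> dict:
    """Describe the POST that lists every file under /logParsers/."""
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/api/listFiles",
        "headers": sdl_config_headers(env),
        "json": {"pathPrefix": "/logParsers/"},  # assumed body field
    }


def get_parser_request(base_url: str, path: str,
                       env: Mapping[str, str] = os.environ) -> dict:
    """Describe the POST that downloads one parser file by its full path."""
    return {
        "method": "POST",
        "url": base_url.rstrip("/") + "/api/getFile",
        "headers": sdl_config_headers(env),
        "json": {"path": path},
    }
```

Each returned dict can be handed to an HTTP client of choice, which keeps the key handling in one place.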

## `backend/routers/ingest.py`

- `/api/ingest/simulate-filter`:
  * Rebuilt the query into valid SDL syntax. It was generating
    `| group events=count()` (dangling pipe) for empty bodies; it now uses a
    proper base expression and falls back to a `dataSource.name!=''` baseline.
  * Field name corrected from `src.name` → `dataSource.name`.
  * Surfaces both `result["error"]` and exception text so blank
    `"PowerQuery error: "` messages are gone.
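
The dangling-pipe fix can be sketched in a few lines. `build_simulate_query` is a hypothetical name, and the real endpoint may add more clauses; the sketch only shows the fallback to the `dataSource.name!=''` baseline when the filter body is empty.

```python
def build_simulate_query(filter_expr: str = "") -> str:
    """Build a PowerQuery string that always starts with a base expression."""
    # An empty filter previously produced "| group events=count()", which
    # starts with a pipe and is rejected; fall back to a match-all baseline.
    base = filter_expr.strip() or "dataSource.name!=''"
    return f"{base} | group events=count()"
```

For example, `build_simulate_query("")` yields `dataSource.name!='' | group events=count()`, while a non-empty filter is used as the base expression unchanged.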

## `backend/routers/quality.py`

- `GET /api/quality/parsers`: lists actual parser filenames in
  `/app/parsers/` (drives the Test Runner dropdown).
- **New `POST /api/quality/sync-from-sdl`**: downloads every parser file
  under `/logParsers/` on the SDL tenant into `/app/parsers/`. After this
  call returns, the Parser Test Runner dropdown automatically reflects all
  tenant parsers (including custom OCSF parsers like
  `Avelios-Medical-OCSF`). Requires `SDL_CONFIG_READ_KEY` in `.env`.
- `_flatten_event`: when a PowerQuery row only carries a JSON-stringified
  payload in `message` (i.e. the parser isn't applied at query time), parse
  and flatten that JSON inline so the Field Population tool can measure real
  coverage.
- `POST /api/quality/test-parser`:
  * Detects SDL JSON-mode parsers (`$=json{parse=json}$`) and parses log
    lines as JSON.
  * Applies parser `rewrites: [{input,output,match,replace}]` blocks with
    correct `$0/$N` backreference translation (`$0` was being mangled to a
    null byte).
  * Accepts single JSON object, JSON array, or NDJSON multi-line input.
  * Returns mode badge data + per-payload counters for the UI.
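
The `$0` problem mentioned above is easy to reproduce with Python's `re` module: in a replacement template, `\0` is read as an octal escape (a NUL character), so the whole match has to be written as `\g<0>`. The sketch below shows one way to translate `$N` references safely. `translate_backrefs` and `apply_rewrite` are hypothetical names, and treating a rewrite rule as a plain regex substitution is an assumption made for this example.

```python
import re


def translate_backrefs(replace: str) -> str:
    """Convert SDL-style $0/$N references into Python re template syntax."""
    # Double any literal backslashes so re does not treat them as escapes.
    escaped = replace.replace("\\", "\\\\")
    # \g<N> is unambiguous for every N, including 0 (the whole match).
    return re.sub(r"\$(\d+)", lambda m: f"\\g<{m.group(1)}>", escaped)


def apply_rewrite(fields: dict, rule: dict) -> dict:
    """Apply one {input, output, match, replace} rule to a dict of fields."""
    value = fields.get(rule["input"])
    if value is None:
        return fields  # nothing to rewrite
    out = dict(fields)
    template = translate_backrefs(rule["replace"])
    out[rule["output"]] = re.sub(rule["match"], template, str(value))
    return out
```

A rule such as `{"input": "user", "output": "user_swapped", "match": r"(\w+)@(\w+)", "replace": "$2:$1"}` applied to `{"user": "bob@corp"}` adds `user_swapped` with the value `corp:bob`.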

## `frontend/index.html`

- Parser Test Runner dropdown now loads from `/api/quality/parsers` instead
  of filtering the coverage map (which only has `detected in data`
  placeholders).
- Field Population and Sample Events: added **Last 7d** lookback option.
- Parser Test Runner UI: mode badge (`JSON auto-extract` vs `regex format`),
  payload counter for multi-line input, separate tables for extracted vs
  derived/rewritten fields.

## `docker-compose.yml`

- Pass `SDL_CONFIG_READ_KEY` through to the backend container.

## `.env.example` / `.gitignore`

- Document the new `SDL_CONFIG_READ_KEY` variable.
- Broaden `.gitignore` so `parsers/*` (tenant-specific synced content) is
  not committed.

## New helper scripts (`tools/`)

- `sync_sdl_parsers.py`: pull all `/logParsers/*` from the tenant.
- `probe_pq_syntax.py`: probe which PowerQuery syntaxes the tenant accepts.
- `probe_avelios{,_wide,_fields}.py`: inspect a source's event presence,
  columns, and embedded JSON fields.
- `test_avelios_parser.py`, `test_avelios_multi.py`: smoke-test the patched
  `/api/quality/test-parser` endpoint with single-line and multi-line input.
- `probe_simulate_filter.py`: smoke-test the patched
  `/api/ingest/simulate-filter` endpoint with progressively larger windows.
- `probe_sync_from_sdl.py`: call `/api/quality/sync-from-sdl` and verify
  that `/api/quality/parsers` then reflects the downloaded parsers.
- `sdl_config.example.json`: template config (the toolkit's `.env` is
  separate from the SDL config used by these helper scripts).

## New `.env` knobs

```bash
# PowerQuery transport tuning (both optional; defaults work for most tenants)
SDL_PQ_TIMEOUT=600          # PowerQuery read timeout in seconds (default 600)
SDL_PQ_TIMEOUT_RETRIES=1    # extra retries on ReadTimeout (default 1)

# Required for /api/quality/sync-from-sdl
SDL_CONFIG_READ_KEY=...     # Data Lake API key with Configuration Read scope
```
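
How the two transport knobs might interact is sketched below. This is not the toolkit's code: `pq_transport_settings` and `run_with_timeout_retries` are hypothetical names, and the built-in `TimeoutError` stands in for the HTTP client's read-timeout exception so the example has no third-party dependency.

```python
import os
from typing import Callable, Mapping, Tuple


def pq_transport_settings(env: Mapping[str, str] = os.environ) -> Tuple[float, int]:
    """Read the timeout (seconds) and extra-retry count, using the documented defaults."""
    timeout = float(env.get("SDL_PQ_TIMEOUT", "600"))
    retries = int(env.get("SDL_PQ_TIMEOUT_RETRIES", "1"))
    return timeout, max(retries, 0)


def run_with_timeout_retries(send: Callable[[float], object],
                             env: Mapping[str, str] = os.environ,
                             timeout_exc: type = TimeoutError):
    """Call send(timeout); on a read timeout, retry up to the configured count."""
    timeout, retries = pq_transport_settings(env)
    for attempt in range(retries + 1):  # one initial try plus the extra retries
        try:
            return send(timeout)
        except timeout_exc:
            if attempt == retries:
                raise  # out of retries: surface the timeout to the caller
```

With the defaults, a query is attempted at most twice, each attempt allowing up to 600 seconds for the read.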