Auto-load detection library from S1 API, improve coverage map accuracy

- Fetch detection library rules from platform-rules API at startup (fall
  back to extracted.json); add Sync Detection Library button for refresh
- Simplify Parser column to ✓ Parsed / ✗ Not Parsed
- Count detections from library rules only (exclude custom STAR rules)
- Add close-match suggestions for dataSource.name mismatches (e.g. CloudTrail
  → AWS CloudTrail, Microsoft 365 Collaboration → Microsoft O365)
- Exclude SentinelOne Ranger AD from coverage map (native S1 source)
- Add success feedback banners to Load SDL Parsers and Sync Library buttons
- Remove rule_counts.json manual override; extracted.json is source of truth
- Remove Load Detections button; rules auto-import on backend startup
- Add get_account_id() and get_platform_rules() to s1_client

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mick
2026-05-20 15:14:10 -04:00
parent 6e137438b1
commit 6cd9da82da
8 changed files with 580 additions and 90 deletions
+1
@@ -7,3 +7,4 @@ node_modules/
frontend/out/ frontend/out/
pgdata/ pgdata/
parsers/*.json parsers/*.json
data/
+45 -11
@@ -10,6 +10,7 @@ A self-hosted troubleshooting and visibility tool for SentinelOne AI-SIEM SecOps
| Page | Purpose | | Page | Purpose |
|---|---| |---|---|
| **Overview** | Live health stats — coverage percentage, active sources, top uncovered sources by volume |
| **Parser Coverage Map** | Which active data sources have a parser? Which don't? | | **Parser Coverage Map** | Which active data sources have a parser? Which don't? |
| **Ingest Dashboard** | Event volume, top sources, cost projection, filter simulator | | **Ingest Dashboard** | Event volume, top sources, cost projection, filter simulator |
| **Parser Quality** | Live event sampler, field population rate, parser test runner | | **Parser Quality** | Live event sampler, field population rate, parser test runner |
@@ -26,12 +27,12 @@ browser → nginx (port 3001) → single-page HTML/JS application
FastAPI backend (port 8001) FastAPI backend (port 8001)
┌───────────────────────────┐ ┌───────────────────────────┐
│ PostgreSQL (SQLAlchemy) │ parsed rules, parser fields, active sources │ PostgreSQL (SQLAlchemy) │ parser fields, active sources
└───────────────────────────┘ └───────────────────────────┘
┌───────────────────────────┐ ┌───────────────────────────┐
│ SentinelOne APIs │ │ SentinelOne APIs │
│ • Management API (STAR) │ demo.sentinelone.net │ • Management API │ demo.sentinelone.net
│ • Scalyr XDR PowerQuery │ xdr.us1.sentinelone.net │ • Scalyr XDR PowerQuery │ xdr.us1.sentinelone.net
└───────────────────────────┘ └───────────────────────────┘
``` ```
@@ -54,16 +55,34 @@ Edit `.env` with your credentials:
```env ```env
S1_BASE_URL=https://demo.sentinelone.net # Your console URL S1_BASE_URL=https://demo.sentinelone.net # Your console URL
S1_API_TOKEN=eyJ... # Service user API token S1_API_TOKEN=eyJ... # Service user API token (account scope or higher)
SDL_XDR_URL=https://xdr.us1.sentinelone.net # Scalyr XDR endpoint SDL_XDR_URL=https://xdr.us1.sentinelone.net # Scalyr XDR endpoint
SDL_LOG_READ_KEY=1j2IU0S... # Data Lake read key SDL_LOG_READ_KEY=1j2IU0S... # Data Lake read key
ANTHROPIC_API_KEY= # Optional — Onboarding page only ANTHROPIC_API_KEY= # Optional — not currently used
``` ```
**S1_API_TOKEN** — generate at *Settings → Users → Service Users* in the console. **S1_API_TOKEN** — generate at *Settings → Users → Service Users* in the console. The service user should be provisioned at **account scope** or higher.
**SDL_LOG_READ_KEY** — found at *Settings → Integrations → Data Lake API Keys*. **SDL_LOG_READ_KEY** — found at *Settings → Integrations → Data Lake API Keys*.
### 2. Add Parser Files (optional but strongly recommended) ### 2. Add the Detection Library (strongly recommended)
The Detection Fields Missing column and per-source detection counts on the Coverage Map require a local detections export. This is generated from the [detection-validator](https://github.com/mickbrowns1/detection-validator) repository.
```bash
# Clone the detection-validator repo alongside this one
git clone https://github.com/mickbrowns1/detection-validator.git
cd detection-validator
# Follow its README to generate the export, then copy the output here:
mkdir -p ../SIEM-Toolkit/data
cp data/data/detections/extracted.json ../SIEM-Toolkit/data/detections.json
cd ../SIEM-Toolkit
```
The `data/` directory is gitignored and never committed. The backend imports the rules automatically on startup, trying the live S1 API first and falling back to this file; click **Sync Detection Library** on the Coverage Map to force a refresh.
### 3. Add Parser Files (optional but strongly recommended)
Place your SDL parser JSON files into the `parsers/` directory. The backend reads them directly at query time — no rebuild is necessary. Place your SDL parser JSON files into the `parsers/` directory. The backend reads them directly at query time — no rebuild is necessary.
@@ -71,7 +90,7 @@ Place your SDL parser JSON files into the `parsers/` directory. The backend read
cp ~/my-parsers/*.json parsers/ cp ~/my-parsers/*.json parsers/
``` ```
### 3. Start the Stack ### 4. Start the Stack
```bash ```bash
docker-compose up -d --build docker-compose up -d --build
@@ -83,6 +102,18 @@ Open **http://localhost:3001** in your browser and you're off.
## Features ## Features
### Overview Dashboard
The landing page gives you an at-a-glance health summary drawn live from the database:
- **Parser Coverage %** — proportion of active sources with a confirmed parser
- **Active Sources** — total number of `dataSource.name` values seen in the last 7 days
- **Covered / Need Parser** — counts for each status
If any sources are uncovered, the **Top Sources Needing a Parser** table lists the highest-volume offenders. Click any source name to jump directly to the Parser Quality page with that source pre-selected.
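
As a rough illustration of how these figures relate, the sketch below derives them from a list of source records. The `covered` flag, the dictionary keys, and the one-decimal rounding are assumptions made for this example; they are not taken from the backend's actual schema.

```python
def coverage_summary(sources: list[dict]) -> dict:
    """Summarise parser coverage for a list of active sources.

    Each record is assumed to carry a boolean 'covered' flag (hypothetical name).
    """
    total = len(sources)
    covered = sum(1 for s in sources if s.get("covered"))
    # Guard against division by zero when no sources have been synced yet.
    pct = round(100 * covered / total, 1) if total else 0.0
    return {
        "active_sources": total,
        "covered": covered,
        "need_parser": total - covered,
        "coverage_pct": pct,
    }


example = [
    {"name": "AWS CloudTrail", "covered": True},
    {"name": "Okta", "covered": True},
    {"name": "Zscaler", "covered": True},
    {"name": "Custom Syslog", "covered": False},
]
print(coverage_summary(example))
```

With three of four sources covered, the summary reports 75.0 percent coverage and one source needing a parser.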
---
### Parser Coverage Map ### Parser Coverage Map
Answers the question: *does each active data source have a parser running?* Answers the question: *does each active data source have a parser running?*
@@ -91,7 +122,6 @@ Answers the question: *does each active data source have a parser running?*
1. **Sync Live Sources** — executes a PowerQuery against your data lake to retrieve every `dataSource.name` seen in the last 7 days, along with event counts. 1. **Sync Live Sources** — executes a PowerQuery against your data lake to retrieve every `dataSource.name` seen in the last 7 days, along with event counts.
2. **Load SDL Parsers** — reads parser files from `parsers/`, extracts the `dataSource.name` attribute from each, and stores the field list in the database. 2. **Load SDL Parsers** — reads parser files from `parsers/`, extracts the `dataSource.name` attribute from each, and stores the field list in the database.
3. **Load STAR Rules** — retrieves your STAR detection rules from the management API and indexes which data sources each rule references.
**Matching logic (three-tier):** **Matching logic (three-tier):**
1. Exact `dataSource.name` match between the active source and the parser attribute 1. Exact `dataSource.name` match between the active source and the parser attribute
@@ -104,6 +134,10 @@ Answers the question: *does each active data source have a parser running?*
- 🟢 **Covered** — custom parser confirmed (local file or detected via parsed events in the data lake) - 🟢 **Covered** — custom parser confirmed (local file or detected via parsed events in the data lake)
- 🔴 **Parser Needed** — no parser found, or only a grok/dottedJson format (which typically indicates an incomplete parser) - 🔴 **Parser Needed** — no parser found, or only a grok/dottedJson format (which typically indicates an incomplete parser)
**Filters:** Use the filter pills to focus on Custom Parser only, Default Parser Only (data lake detected), or No Parser.
**Deep link:** Click any source name in the table to open it directly in Parser Quality with all dropdowns pre-populated.
**Expected results:** After syncing sources and loading parsers, sources with active SDL parsers will appear as Covered. Sources sending raw, unparsed data — where only `message` and `timestamp` appear in the data lake — will appear as Parser Needed. **Expected results:** After syncing sources and loading parsers, sources with active SDL parsers will appear as Covered. Sources sending raw, unparsed data — where only `message` and `timestamp` appear in the data lake — will appear as Parser Needed.
--- ---
@@ -173,8 +207,7 @@ A prompt template for using Claude Code to onboard a new log source. Copy the te
- An SDL parser skeleton in augmented-JSON format - An SDL parser skeleton in augmented-JSON format
- Field mappings to the SDL common schema - Field mappings to the SDL common schema
- 23 starter STAR detection rules - Parser test assertions
- 5 parser test assertions
No Anthropic API key is required — this uses Claude Code directly from your terminal. No Anthropic API key is required — this uses Claude Code directly from your terminal.
@@ -222,7 +255,7 @@ curl -X DELETE http://localhost:8001/api/coverage/reset
│ │ └── settings.py # .env read/write │ │ └── settings.py # .env read/write
│ └── services/ │ └── services/
│ ├── s1_client.py # SentinelOne + Scalyr API client │ ├── s1_client.py # SentinelOne + Scalyr API client
│ └── rule_parser.py # SDL/Sigma/STAR field extraction │ └── rule_parser.py # SDL format string field extraction
├── frontend/ ├── frontend/
│ └── index.html # Single-page application (Tailwind, vanilla JS) │ └── index.html # Single-page application (Tailwind, vanilla JS)
├── parsers/ # SDL parser files (volume-mounted) ├── parsers/ # SDL parser files (volume-mounted)
@@ -240,3 +273,4 @@ curl -X DELETE http://localhost:8001/api/coverage/reset
- The backend queries your **demo tenant** (`demo.sentinelone.net`) — not usea1-purple or any other tenant. Ensure your `S1_BASE_URL` and `SDL_LOG_READ_KEY` are pointed at the same tenant. - The backend queries your **demo tenant** (`demo.sentinelone.net`) — not usea1-purple or any other tenant. Ensure your `S1_BASE_URL` and `SDL_LOG_READ_KEY` are pointed at the same tenant.
- Parser files in `parsers/` are read at query time, not on startup — add or update files at any point without rebuilding the image. - Parser files in `parsers/` are read at query time, not on startup — add or update files at any point without rebuilding the image.
- The filter simulator is entirely read-only and makes no changes whatsoever to your tenant configuration. - The filter simulator is entirely read-only and makes no changes whatsoever to your tenant configuration.
- The service user API token must be at **account scope** or higher. Site-scoped tokens will have limited visibility into rules and may see reduced source counts.
+35 -1
@@ -1,6 +1,6 @@
from fastapi import FastAPI from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware from fastapi.middleware.cors import CORSMiddleware
from db import engine, Base from db import engine, Base, get_db, ParsedRule
from routers import coverage, ingest, settings, quality from routers import coverage, ingest, settings, quality
Base.metadata.create_all(bind=engine) Base.metadata.create_all(bind=engine)
@@ -15,6 +15,40 @@ with engine.connect() as _conn:
app = FastAPI(title="SIEM Toolkit", version="1.0.0") app = FastAPI(title="SIEM Toolkit", version="1.0.0")
@app.on_event("startup")
async def auto_load_detections():
"""
Auto-load detection library rules on startup.
Tries the live S1 API first (accurate 'sources' field); falls back to extracted.json.
Skips if rules are already loaded — use the 'Sync Library' button to force a refresh.
"""
import os
from sqlalchemy.orm import Session
from services import s1_client
db: Session = next(get_db())
try:
existing = db.query(ParsedRule).filter_by(rule_type="library").count()
if existing > 0:
return # Already loaded — skip until user manually refreshes
# Try live API first
try:
rules = await s1_client.get_platform_rules()
if rules:
coverage._import_from_api_rules(db, rules)
return
except Exception:
pass
# Fall back to local file
detections_file = os.environ.get("DETECTIONS_FILE", "/app/data/detections.json")
if os.path.exists(detections_file):
coverage._import_detections(db, detections_file)
finally:
db.close()
app.add_middleware( app.add_middleware(
CORSMiddleware, CORSMiddleware,
allow_origins=["http://localhost:3001"], allow_origins=["http://localhost:3001"],
+208 -30
@@ -1,4 +1,5 @@
import json import json
import os
from fastapi import APIRouter, UploadFile, File, Depends, HTTPException from fastapi import APIRouter, UploadFile, File, Depends, HTTPException
from pydantic import BaseModel from pydantic import BaseModel
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
@@ -6,6 +7,8 @@ from datetime import datetime
from db import get_db, ParsedRule, ParserField, ActiveSource from db import get_db, ParsedRule, ParserField, ActiveSource
from services import s1_client, rule_parser from services import s1_client, rule_parser
DETECTIONS_FILE = os.environ.get("DETECTIONS_FILE", "/app/data/detections.json")
router = APIRouter() router = APIRouter()
@@ -40,22 +43,12 @@ def _star_query_texts(rule: dict) -> list[str]:
@router.post("/load-star-rules") @router.post("/load-star-rules")
async def load_star_rules(library_only: bool = None, db: Session = Depends(get_db)): async def load_star_rules(db: Session = Depends(get_db)):
"""Fetch STAR rules from SentinelOne and index their fields. """Fetch all STAR rules from the Management Console API and index their fields."""
library_only defaults to the STAR_LIBRARY_ONLY env var (default true).
Pass ?library_only=false to include custom tenant rules as well.
"""
import os
if library_only is None:
library_only = os.environ.get("STAR_LIBRARY_ONLY", "true").lower() != "false"
try: try:
rules = await s1_client.get_star_rules() rules = await s1_client.get_star_rules()
except Exception as e: except Exception as e:
raise HTTPException(502, f"S1 API error: {e}") raise HTTPException(502, f"S1 API error: {type(e).__name__}: {e}")
if library_only:
rules = [r for r in rules if str(r.get("creator", "")).lower().endswith("@sentinelone.com")]
# Replace all existing STAR rules cleanly to avoid duplicate key errors # Replace all existing STAR rules cleanly to avoid duplicate key errors
db.query(ParsedRule).filter_by(rule_type="star").delete() db.query(ParsedRule).filter_by(rule_type="star").delete()
@@ -81,6 +74,118 @@ async def load_star_rules(library_only: bool = None, db: Session = Depends(get_d
return {"loaded": len(loaded), "rules": loaded} return {"loaded": len(loaded), "rules": loaded}
_EXCLUDED_PATHS = ("/rules/silent/", "/rules/dev/")
def _import_from_api_rules(db, rules: list) -> int:
"""
Import platform rules fetched directly from the S1 API into the database.
Each rule has a 'sources' list — the authoritative dataSource.name values.
"""
db.query(ParsedRule).filter_by(rule_type="library").delete()
db.commit()
loaded = 0
seen_ids: set = set()
for rule in rules:
rule_id = str(rule.get("id", f"lib_{loaded}"))
if rule_id in seen_ids:
continue
seen_ids.add(rule_id)
sources = rule.get("sources") or []
db.add(ParsedRule(
rule_id=rule_id,
name=rule.get("name", "unnamed"),
rule_type="library",
fields_used=[], # API rules don't expose field-level info
raw=json.dumps({"data_sources": sources}),
))
loaded += 1
if loaded % 500 == 0:
db.flush()
db.commit()
return loaded
def _import_detections(db, detections_file: str) -> int:
"""
Import library detection rules from extracted.json into the database.
Replaces any existing library rules. Returns the count of rules loaded.
"""
with open(detections_file, "r", encoding="utf-8") as fh:
data = json.load(fh)
results = data.get("results", [])
results = [r for r in results if not any(r.get("file", "").startswith(p) for p in _EXCLUDED_PATHS)]
db.query(ParsedRule).filter_by(rule_type="library").delete()
db.commit()
loaded = 0
seen_ids: set = set()
for rule in results:
all_fields: set = set()
data_sources: list[str] = []
for q in rule.get("queries", []):
all_fields.update(q.get("keys", []))
ds_vals = q.get("pairs", {}).get("dataSource.name", [])
for v in ds_vals:
if isinstance(v, str):
data_sources.append(v)
elif isinstance(v, list):
data_sources.extend(str(x) for x in v)
rule_id = str(rule.get("id", f"lib_{loaded}"))
if rule_id in seen_ids:
continue
seen_ids.add(rule_id)
db.add(ParsedRule(
rule_id=rule_id,
name=rule.get("name", "unnamed"),
rule_type="library",
fields_used=list(all_fields),
raw=json.dumps({"data_sources": list(set(data_sources))}),
))
loaded += 1
if loaded % 500 == 0:
db.flush()
db.commit()
return loaded
@router.post("/load-detections")
async def load_detections(db: Session = Depends(get_db)):
"""
Reload detection library rules.
Tries the live S1 API first (platform-rules endpoint); falls back to extracted.json.
"""
# Prefer the live API — gives accurate 'sources' and is always up to date
try:
rules = await s1_client.get_platform_rules()
if rules:
loaded = _import_from_api_rules(db, rules)
return {"loaded": loaded, "source": "api"}
except Exception:
pass
# Fall back to local extracted.json
if not os.path.exists(DETECTIONS_FILE):
raise HTTPException(
404,
"S1 API unavailable and no detections file found — "
"ensure the data/ volume is mounted with detections.json"
)
try:
loaded = _import_detections(db, DETECTIONS_FILE)
except Exception as e:
raise HTTPException(500, f"Failed to import detections: {e}")
return {"loaded": loaded, "source": "file"}
@router.post("/upload-sigma") @router.post("/upload-sigma")
async def upload_sigma(files: list[UploadFile] = File(...), db: Session = Depends(get_db)): async def upload_sigma(files: list[UploadFile] = File(...), db: Session = Depends(get_db)):
"""Upload one or more Sigma YAML files and index their fields.""" """Upload one or more Sigma YAML files and index their fields."""
@@ -216,11 +321,21 @@ async def load_parser_content(payload: ParserContentPayload, db: Session = Depen
return {"parser": payload.parser_name, "fields": list(fields), "field_count": len(fields)} return {"parser": payload.parser_name, "fields": list(fields), "field_count": len(fields)}
# Native SentinelOne platform sources — parsed by the system, not by SDL parsers.
# Excluded from the coverage map as they do not require custom parser coverage.
_S1_NATIVE_SOURCES = {
"SentinelOne", "asset", "alert", "vulnerability",
"ActivityFeed", "indicator", "misconfiguration",
"SentinelOne Ranger AD",
}
@router.post("/sync-sources") @router.post("/sync-sources")
async def sync_sources(days: int = 7, db: Session = Depends(get_db)): async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
"""Pull active dataSource.names from the SDL and store them. """Pull active dataSource.names from the SDL and store them.
Also detects whether a parser is already producing structured fields Also detects whether a parser is already producing structured fields
for each source by checking if event.type is populated in the data lake. for each source by checking if event.type is populated in the data lake.
Native S1 platform sources are excluded as they do not require SDL parsers.
""" """
import asyncio import asyncio
from datetime import datetime, timedelta from datetime import datetime, timedelta
@@ -255,7 +370,7 @@ async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
seen = 0 seen = 0
for row in rows: for row in rows:
name = row.get("dataSource.name") name = row.get("dataSource.name")
if name: if name and name not in _S1_NATIVE_SOURCES:
db.add(ActiveSource( db.add(ActiveSource(
source_name=name, source_name=name,
event_count=row.get("events", 0), event_count=row.get("events", 0),
@@ -264,7 +379,7 @@ async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
)) ))
seen += 1 seen += 1
db.commit() db.commit()
return {"synced": seen, "sources": [r["dataSource.name"] for r in rows if r.get("dataSource.name")]} return {"synced": seen, "sources": [r["dataSource.name"] for r in rows if r.get("dataSource.name") and r["dataSource.name"] not in _S1_NATIVE_SOURCES]}
def _build_parser_ds_index() -> dict[str, dict]: def _build_parser_ds_index() -> dict[str, dict]:
@@ -367,19 +482,28 @@ def get_coverage_map(db: Session = Depends(get_db)):
# Build rule index: source_name → rules that reference it # Build rule index: source_name → rules that reference it
rule_by_source: dict[str, list] = {} rule_by_source: dict[str, list] = {}
for rule in rules: for rule in rules:
query_texts = _star_query_texts(json.loads(rule.raw)) if rule.rule_type == "star" else [] try:
data_sources = rule_parser.extract_data_sources(query_texts) raw_data = json.loads(rule.raw) if rule.raw else {}
except Exception:
raw_data = {}
if rule.rule_type == "library":
# Library rules store pre-extracted data_sources list in raw
data_sources = raw_data.get("data_sources", [])
else:
query_texts = _star_query_texts(raw_data)
data_sources = rule_parser.extract_data_sources(query_texts)
for ds in data_sources: for ds in data_sources:
rule_by_source.setdefault(ds, []).append({"rule": rule.name, "type": rule.rule_type}) rule_by_source.setdefault(ds, []).append({"rule": rule.name, "type": rule.rule_type})
if not data_sources:
# Rule with no explicit source filter — applies to all
rule_by_source.setdefault("__any__", []).append({"rule": rule.name, "type": rule.rule_type})
# Fields to ignore when computing "missing" — these are metadata/schema fields # Fields to ignore when computing "missing" — these are metadata/schema fields
# always present in events regardless of the parser # always present in events regardless of the parser
_SCHEMA_FIELDS = { _SCHEMA_FIELDS = {
"dataSource.name", "dataSource.vendor", "dataSource.category", "dataSource.name", "dataSource.vendor", "dataSource.category",
"event.type", "timestamp", "src.endpoint.ip", "src.endpoint.name", "event.type", "timestamp", "src.endpoint.ip", "src.endpoint.name",
# Endpoint agent fields — populated by the SentinelOne agent, not by SDL parsers
"cmdScript.content", "endpoint.os", "endpoint.name", "endpoint.uid",
} }
sources_out = [] sources_out = []
@@ -414,22 +538,75 @@ def get_coverage_map(db: Session = Depends(get_db)):
else: else:
needed_count += 1 needed_count += 1
rules_for_src = rule_by_source.get(src.source_name, []) + rule_by_source.get("__any__", []) rules_for_src: list = [r for r in rule_by_source.get(src.source_name, []) if r["type"] == "library"]
# Fields all associated rules need, minus schema fields always present # Close-match suggestions — shown when there are no library rules for this source.
rule_fields_needed: set = set() close_matches: list = []
if not rules_for_src:
import re as _re
def _word_tokens(s: str) -> set:
"""Split on non-alphanumeric boundaries, lowercase, drop single chars."""
return {t for t in _re.split(r"[^a-z0-9]+", s.lower()) if len(t) >= 2}
def _is_close(a: str, b: str) -> bool:
na, nb = _normalize(a), _normalize(b)
# 1. Simple substring match
if na in nb or nb in na:
return True
# 2. Token-level: handles "Microsoft 365 Collaboration" vs "Microsoft O365"
# — "365" is inside "o365", and they share "microsoft"
ta, tb = _word_tokens(a), _word_tokens(b)
shared_exact = ta & tb
if not shared_exact:
return False # Must share at least one word exactly
# Check that a DISTINCTIVE (non-shared) token from one name
# appears as a substring inside a token from the other.
# This avoids matching "Azure AD" to "Azure Platform" on "azure" alone.
unique_a = ta - shared_exact
unique_b = tb - shared_exact
return any(
ua in ub or ub in ua
for ua in unique_a for ub in unique_b
if len(ua) >= 2 and len(ub) >= 2
)
sn = _normalize(src.source_name)
for lib_ds, lib_rules in rule_by_source.items():
lib_only = [r for r in lib_rules if r["type"] == "library"]
if not lib_only:
continue
if _is_close(src.source_name, lib_ds):
close_matches.append({
"library_name": lib_ds,
"rule_count": len(lib_only),
})
close_matches.sort(key=lambda x: x["rule_count"], reverse=True)
close_matches = close_matches[:3]
# Count how many rules reference each field (frequency)
field_freq: dict[str, int] = {}
for r in rules_for_src: for r in rules_for_src:
rule_fields_needed |= rule_fields_index.get(r["rule"], set()) for f in rule_fields_index.get(r["rule"], set()):
rule_fields_needed -= _SCHEMA_FIELDS field_freq[f] = field_freq.get(f, 0) + 1
# Fields the parser provides # Fields the parser provides
parser_provides = parser_index.get(matched_parser, set()) if matched_parser and matched_parser != "detected in data" else set() parser_provides = parser_index.get(matched_parser, set()) if matched_parser and matched_parser != "detected in data" else set()
# Missing = fields rules need that the parser doesn't provide. # Minimum number of rules that must reference a field before we flag it.
# Only consider dotted-path fields (e.g. src.ip, winEventLog.channel) — # Scales with rule count so single-rule oddities don't dominate.
# single-word tokens are typically correlation variables or rule metadata. rule_count = len(rules_for_src)
rule_fields_dotted = {f for f in rule_fields_needed if "." in f} min_rules = max(2, round(rule_count * 0.05)) if rule_count >= 10 else 2
missing_fields = sorted(rule_fields_dotted - parser_provides)
# Missing = dotted-path fields needed by >= min_rules rules,
# not in schema constants, not provided by the parser.
missing_fields = sorted(
f for f, count in field_freq.items()
if count >= min_rules
and "." in f
and f not in _SCHEMA_FIELDS
and f not in parser_provides
)
sources_out.append({ sources_out.append({
"source_name": src.source_name, "source_name": src.source_name,
@@ -441,6 +618,7 @@ def get_coverage_map(db: Session = Depends(get_db)):
"parser_detected": src.parser_detected or 0, "parser_detected": src.parser_detected or 0,
"rules": rules_for_src, "rules": rules_for_src,
"rule_count": len(rules_for_src), "rule_count": len(rules_for_src),
"close_matches": close_matches,
"missing_fields": missing_fields, "missing_fields": missing_fields,
"missing_fields_count": len(missing_fields), "missing_fields_count": len(missing_fields),
"synced_at": src.synced_at.isoformat() if src.synced_at else None, "synced_at": src.synced_at.isoformat() if src.synced_at else None,
-3
@@ -15,9 +15,6 @@ FIELDS = [
{"key": "SDL_XDR_URL", "label": "SDL XDR URL", "secret": False, "placeholder": "https://xdr.us1.sentinelone.net"}, {"key": "SDL_XDR_URL", "label": "SDL XDR URL", "secret": False, "placeholder": "https://xdr.us1.sentinelone.net"},
{"key": "SDL_LOG_READ_KEY", "label": "SDL Log Read Key", "secret": True, "placeholder": "1DnK0Y4e..."}, {"key": "SDL_LOG_READ_KEY", "label": "SDL Log Read Key", "secret": True, "placeholder": "1DnK0Y4e..."},
{"key": "ANTHROPIC_API_KEY", "label": "Anthropic API Key", "secret": True, "placeholder": "sk-ant-..."}, {"key": "ANTHROPIC_API_KEY", "label": "Anthropic API Key", "secret": True, "placeholder": "sk-ant-..."},
{"key": "STAR_LIBRARY_ONLY", "label": "STAR Rules — Library Only", "secret": False, "placeholder": "true",
"type": "select", "options": ["true", "false"],
"hint": "true = load only SentinelOne Library rules (@sentinelone.com creators). false = include custom tenant rules as well."},
] ]
FIELD_KEYS = {f["key"] for f in FIELDS} FIELD_KEYS = {f["key"] for f in FIELDS}
+114 -9
@@ -24,16 +24,72 @@ def _iso_to_epoch_ms(iso_str: str) -> int:
return int(dt.timestamp() * 1000) return int(dt.timestamp() * 1000)
async def get_star_rules(limit: int = 200) -> list: async def get_star_rules(page_size: int = 100) -> list:
"""Fetch active STAR rules from the Management Console API.""" """Fetch custom STAR rules from /cloud-detection/rules, paginating via cursor."""
all_rules = []
cursor = None
async with httpx.AsyncClient(timeout=30) as client: async with httpx.AsyncClient(timeout=30) as client:
resp = await client.get( while True:
f"{BASE_URL}/web/api/v2.1/cloud-detection/rules", params = {"limit": page_size}
headers=HEADERS, if cursor:
params={"limit": limit}, params["cursor"] = cursor
) resp = await client.get(
resp.raise_for_status() f"{BASE_URL}/web/api/v2.1/cloud-detection/rules",
return resp.json().get("data", []) headers=HEADERS,
params=params,
)
resp.raise_for_status()
body = resp.json()
all_rules.extend(body.get("data", []))
cursor = body.get("pagination", {}).get("nextCursor")
if not cursor:
break
return all_rules
async def get_library_rules(page_size: int = 100) -> list:
"""
Fetch Detection Library (OOTB/Platform) rules from /web/api/v2.1/detection-library/rules.
Requires an account-level or higher API token — site-scoped tokens will receive a 400.
Returns an empty list gracefully if the token lacks sufficient scope.
"""
all_rules = []
cursor = None
async with httpx.AsyncClient(timeout=60) as client:
while True:
params: dict = {"limit": page_size}
if cursor:
params["cursor"] = cursor
resp = await client.get(
f"{BASE_URL}/web/api/v2.1/detection-library/rules",
headers=HEADERS,
params=params,
)
# 400 typically means site-scoped token — return empty rather than crash
if resp.status_code == 400:
return []
resp.raise_for_status()
body = resp.json()
batch = body.get("data", [])
all_rules.extend(batch)
cursor = body.get("pagination", {}).get("nextCursor")
if not cursor:
break
results = []
for rule in all_rules:
results.append({
"id": str(rule.get("id", "")),
"name": rule.get("name", "unnamed"),
"s1ql": rule.get("s1ql") or rule.get("query", ""),
"queryType": rule.get("queryType", "events"),
"severity": rule.get("severity", ""),
"description": rule.get("description", ""),
"gdlRuleId": rule.get("id", ""),
"creator": "SentinelOne",
"expirationMode": rule.get("expirationMode", "Permanent"),
})
return results
async def run_powerquery(query: str, from_date: str, to_date: str) -> dict: async def run_powerquery(query: str, from_date: str, to_date: str) -> dict:
@@ -124,6 +180,55 @@ async def get_sdl_parser(filename: str) -> dict:
return resp.json() return resp.json()
async def get_account_id() -> str | None:
"""Return the first account ID visible to the current token."""
async with httpx.AsyncClient(timeout=15) as client:
resp = await client.get(
f"{BASE_URL}/web/api/v2.1/accounts",
headers=HEADERS,
params={"limit": 1},
)
resp.raise_for_status()
accounts = resp.json().get("data", [])
return str(accounts[0]["id"]) if accounts else None
async def get_platform_rules(page_size: int = 1000) -> list:
"""
Fetch all Detection Library platform rules from /detection-library/platform-rules.
Requires scopeLevel + scopeId — uses account scope with the first visible account.
Returns list of rules, each with a 'sources' list (authoritative data source names).
"""
account_id = await get_account_id()
if not account_id:
return []
all_rules: list = []
cursor: str = ""
async with httpx.AsyncClient(timeout=60) as client:
while True:
params: dict = {
"scopeLevel": "account",
"scopeId": account_id,
"limit": page_size,
"cursor": cursor,
}
resp = await client.get(
f"{BASE_URL}/web/api/v2.1/detection-library/platform-rules",
headers=HEADERS,
params=params,
)
if resp.status_code == 400:
return []
resp.raise_for_status()
body = resp.json()
all_rules.extend(body.get("data", []))
cursor = body.get("pagination", {}).get("nextCursor") or ""
if not cursor:
break
return all_rules
async def get_sites() -> list: async def get_sites() -> list:
async with httpx.AsyncClient(timeout=30) as client: async with httpx.AsyncClient(timeout=30) as client:
resp = await client.get( resp = await client.get(
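
The three fetchers in this file share one cursor loop: request a page, collect `data`, follow `pagination.nextCursor` until it is empty. A minimal sketch of that loop follows, with the HTTP call replaced by an injected coroutine so it runs offline. `fetch_all` and `make_fake_api` are hypothetical names, and the fake endpoint's offset-style cursor is an assumption for the demo only; the real API's cursor format is opaque. The `or {}` guard is an optional hardening for a response where `pagination` is present but null.

```python
import asyncio


async def fetch_all(fetch_page, page_size: int = 100) -> list:
    """Collect every row from a cursor-paginated endpoint."""
    all_rows: list = []
    cursor = None
    while True:
        params = {"limit": page_size}
        if cursor:
            params["cursor"] = cursor
        body = await fetch_page(params)
        all_rows.extend(body.get("data", []))
        # Tolerate "pagination": null as well as a missing key.
        cursor = (body.get("pagination") or {}).get("nextCursor")
        if not cursor:
            break
    return all_rows


def make_fake_api(rows: list):
    """In-memory stand-in for the endpoint; the cursor is a stringified offset."""
    async def fetch_page(params: dict) -> dict:
        start = int(params.get("cursor", 0))
        end = start + params["limit"]
        next_cursor = str(end) if end < len(rows) else None
        return {"data": rows[start:end], "pagination": {"nextCursor": next_cursor}}
    return fetch_page


rows = asyncio.run(fetch_all(make_fake_api(list(range(250))), page_size=100))
print(len(rows))  # prints: 250
```

Factoring the loop out this way would let `get_star_rules`, `get_library_rules`, and `get_platform_rules` differ only in URL and parameters, though whether that refactor suits the project is a judgment call.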
+2
@@ -17,12 +17,14 @@ services:
       - SDL_LOG_READ_KEY=${SDL_LOG_READ_KEY}
       - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
       - DATABASE_URL=postgresql://siem:siem@db:5432/siem
+      - DETECTIONS_FILE=/app/data/detections.json
     depends_on:
       db:
         condition: service_healthy
     volumes:
       - ./parsers:/app/parsers
       - ./.env:/app/.env
+      - ./data:/app/data:ro
   db:
     image: postgres:16-alpine
+175 -36
@@ -116,17 +116,98 @@ function barChart(rows, labelKey, valueKey) {
function renderHome() { function renderHome() {
set(`<div class="p-8 max-w-5xl"> set(`<div class="p-8 max-w-5xl">
<div class="mb-8"> <div class="mb-6">
<h1 class="text-2xl font-bold text-white">SIEM Engineering Toolkit</h1> <h1 class="text-2xl font-bold text-white">SIEM Engineering Toolkit</h1>
<p class="text-gray-400 mt-1">SentinelOne AI-SIEM · demo.sentinelone.net</p> <p class="text-gray-400 mt-1">SentinelOne AI-SIEM · demo.sentinelone.net</p>
</div> </div>
<div class="grid grid-cols-1 md:grid-cols-3 gap-5"> <div id="home-stats" class="grid grid-cols-2 md:grid-cols-4 gap-4 mb-8">
${homeCard('#/coverage','Parser Coverage Map','Cross-reference SDL parser fields against STAR and Sigma rule fields. Surface parsed-but-unused fields as reduction candidates.','Open Coverage Map','from-purple-700 to-purple-900')} <div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
${homeCard('#/ingest','Ingest Dashboard','Visualize event volume by source and type. Project monthly GB costs and simulate exclusion filters before applying them.','Open Dashboard','from-blue-700 to-blue-900')} <div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
${homeCard('#/quality','Parser Quality','Sample live events to see which fields landed. Measure field population rates and test parser patterns against raw log lines.','Open Quality Tools','from-amber-700 to-amber-900')} <div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
${homeCard('#/onboarding','Onboarding Accelerator','Step-by-step guide for onboarding a new log source using Claude Code directly — no API key required.','View Guide','from-emerald-700 to-emerald-900')} </div>
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
</div>
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
</div>
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
</div>
</div>
<div id="home-uncovered" class="hidden mb-8"></div>
<div class="grid grid-cols-1 md:grid-cols-2 gap-5">
${homeCard('#/coverage','Parser Coverage Map','See which active data sources have a parser running and which need one.','Open Coverage Map','from-purple-700 to-purple-900')}
${homeCard('#/ingest','Ingest Dashboard','Visualize event volume by source and type. Simulate exclusion filters before applying them.','Open Dashboard','from-blue-700 to-blue-900')}
${homeCard('#/quality','Parser Quality','Sample live events, measure field population rates, and test parser patterns against raw log lines.','Open Quality Tools','from-amber-700 to-amber-900')}
${homeCard('#/onboarding','Onboarding Accelerator','Step-by-step guide for onboarding a new log source using Claude Code directly.','View Guide','from-emerald-700 to-emerald-900')}
</div> </div>
</div>`) </div>`)
homeLoadStats()
}
async function homeLoadStats() {
try {
const r = await apiGet('/api/coverage/map')
const sources = r.sources || []
const total = sources.length
const covered = sources.filter(s => s.status === 'covered').length
const needed = sources.filter(s => s.status === 'parser_needed').length
const pct = total ? Math.round(covered / total * 100) : 0
const pctColor = pct >= 80 ? 'text-emerald-400' : pct >= 50 ? 'text-amber-400' : 'text-red-400'
document.getElementById('home-stats').innerHTML = `
${homeStat(pct + '%', 'Parser Coverage', pctColor)}
${homeStat(total.toLocaleString(), 'Active Sources', 'text-blue-400')}
${homeStat(covered.toLocaleString(), 'Covered', 'text-emerald-400')}
${homeStat(needed.toLocaleString(), 'Need Parser', needed > 0 ? 'text-red-400' : 'text-gray-500')}`
// Top uncovered sources by volume
const uncovered = sources
.filter(s => s.status === 'parser_needed')
.sort((a, b) => (b.event_count || 0) - (a.event_count || 0))
.slice(0, 5)
if (uncovered.length) {
const rows = uncovered.map(s => `
<tr class="border-b border-gray-800/50">
<td class="py-2 pr-4 font-mono text-xs text-gray-200">
<a href="#/quality" onclick="queueQualitySource('${esc(s.source_name)}')" class="hover:text-purple-400 cursor-pointer">${esc(s.source_name)}</a>
</td>
<td class="py-2 text-xs text-gray-400">${(s.event_count || 0).toLocaleString()} events</td>
</tr>`).join('')
document.getElementById('home-uncovered').classList.remove('hidden')
document.getElementById('home-uncovered').innerHTML = `
<div class="bg-gray-900 border border-red-900/40 rounded-xl p-5">
<h2 class="text-sm font-semibold text-white mb-1">Top Sources Needing a Parser</h2>
<p class="text-xs text-gray-500 mb-3">Highest-volume sources with no parser running — click to inspect in Parser Quality.</p>
<table class="w-full">
<thead><tr class="text-left text-gray-500 border-b border-gray-800">
<th class="pb-2 pr-4 text-xs font-medium">Source</th>
<th class="pb-2 text-xs font-medium">Volume</th>
</tr></thead>
<tbody>${rows}</tbody>
</table>
</div>`
}
} catch(e) {
document.getElementById('home-stats').innerHTML = `
${homeStat('—', 'Parser Coverage', 'text-gray-600')}
${homeStat('—', 'Active Sources', 'text-gray-600')}
${homeStat('—', 'Covered', 'text-gray-600')}
${homeStat('—', 'Need Parser', 'text-gray-600')}`
}
}
function homeStat(value, label, valueClass) {
return `<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center">
<div class="text-2xl font-bold ${valueClass} mb-1">${value}</div>
<div class="text-xs text-gray-500">${label}</div>
</div>`
} }
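For reference, the arithmetic in `homeLoadStats` can be mirrored in Python, for instance to check the Overview numbers from a backend test. This is only a sketch: `coverage_summary` is a hypothetical name, and the input is assumed to be the `sources` list returned by `/api/coverage/map`, with the `status`, `source_name`, and `event_count` keys used above.

```python
def coverage_summary(sources: list, top_n: int = 5) -> dict:
    """Recompute the Overview stats from a list of source dicts (hypothetical helper)."""
    total = len(sources)
    covered = sum(1 for s in sources if s.get("status") == "covered")
    needed = [s for s in sources if s.get("status") == "parser_needed"]
    # int(x + 0.5) rounds halves up, like JavaScript's Math.round for x >= 0.
    # Python's built-in round() rounds halves to the nearest even integer instead.
    pct = int(covered / total * 100 + 0.5) if total else 0
    # Highest-volume uncovered sources first; a missing event_count counts as 0.
    top = sorted(needed, key=lambda s: s.get("event_count") or 0, reverse=True)[:top_n]
    return {
        "pct": pct,
        "total": total,
        "covered": covered,
        "needed": len(needed),
        "top_uncovered": [s["source_name"] for s in top],
    }


example = [
    {"source_name": "a", "status": "covered", "event_count": 10},
    {"source_name": "b", "status": "parser_needed", "event_count": 500},
    {"source_name": "c", "status": "parser_needed"},
]
summary = coverage_summary(example)
print(summary)
```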
function homeCard(href, title, desc, cta, grad) { function homeCard(href, title, desc, cta, grad) {
@@ -138,6 +219,12 @@ function homeCard(href, title, desc, cta, grad) {
</div>` </div>`
} }
// Queue a source to be pre-selected when Quality page loads
let _pendingQualitySource = null
function queueQualitySource(source) {
_pendingQualitySource = source
}
// ── Coverage ────────────────────────────────────────────────────────────── // ── Coverage ──────────────────────────────────────────────────────────────
let cvFilter = 'all', cvData = null let cvFilter = 'all', cvData = null
@@ -151,7 +238,7 @@ function renderCoverage() {
</div> </div>
<div class="flex gap-2 flex-wrap justify-end"> <div class="flex gap-2 flex-wrap justify-end">
<button id="btn-sync" onclick="cvSyncSources()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Live Sources</button> <button id="btn-sync" onclick="cvSyncSources()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Live Sources</button>
<button id="btn-star" onclick="loadStar()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load Library STAR Rules</button> <button id="btn-sync-library" onclick="syncLibrary()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Detection Library</button>
<button id="btn-sdl-parsers" onclick="loadSDLParsers()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load SDL Parsers</button> <button id="btn-sdl-parsers" onclick="loadSDLParsers()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load SDL Parsers</button>
<button onclick="document.getElementById('f-parser').click()" class="px-3 py-1.5 text-sm bg-gray-700 hover:bg-gray-600 rounded-lg text-white">Upload Parser</button> <button onclick="document.getElementById('f-parser').click()" class="px-3 py-1.5 text-sm bg-gray-700 hover:bg-gray-600 rounded-lg text-white">Upload Parser</button>
<button onclick="cvReset()" class="px-3 py-1.5 text-sm bg-red-900/60 hover:bg-red-800 rounded-lg text-red-300">Reset</button> <button onclick="cvReset()" class="px-3 py-1.5 text-sm bg-red-900/60 hover:bg-red-800 rounded-lg text-red-300">Reset</button>
@@ -166,28 +253,51 @@ function renderCoverage() {
cvLoad() cvLoad()
} }
async function loadSDLParsers() { async function syncLibrary() {
setBtn('btn-sdl-parsers', true) setBtn('btn-sync-library', true)
document.getElementById('cv-err').innerHTML = '' const errEl = document.getElementById('cv-err')
if (errEl) errEl.innerHTML = ''
try { try {
const res = await apiPost('/api/coverage/load-parsers-from-sdl', {}) const r = await apiPost('/api/coverage/load-detections', {})
if (res.errors?.length) { if (errEl) {
document.getElementById('cv-err').innerHTML = errBox(`${res.errors.length} parser(s) failed to load: ${res.errors.map(e=>e.parser).join(', ')}`) errEl.innerHTML = `<div class="p-3 bg-emerald-900/40 border border-emerald-700 rounded-lg text-sm text-emerald-300 mb-4">✓ ${r.loaded} detection rules synced from ${r.source === 'api' ? 'S1 API' : 'local file'}</div>`
setTimeout(() => { errEl.innerHTML = '' }, 4000)
} }
cvLoad() cvLoad()
} catch(e) { } catch(e) {
document.getElementById('cv-err').innerHTML = errBox(e.message) if (errEl) errEl.innerHTML = errBox(e.message)
} finally { setBtn('btn-sync-library', false, 'Sync Detection Library') }
}
async function loadSDLParsers() {
setBtn('btn-sdl-parsers', true)
const errEl = document.getElementById('cv-err')
if (errEl) errEl.innerHTML = ''
try {
const res = await apiPost('/api/coverage/load-parsers-from-sdl', {})
let msg = `${res.loaded} parser${res.loaded !== 1 ? 's' : ''} loaded`
if (res.errors?.length) {
msg += `, ${res.errors.length} failed: ${res.errors.map(e=>e.parser).join(', ')}`
if (errEl) errEl.innerHTML = errBox(msg)
} else {
if (errEl) errEl.innerHTML = `<div class="p-3 bg-emerald-900/40 border border-emerald-700 rounded-lg text-sm text-emerald-300 mb-4">${msg}</div>`
setTimeout(() => { if (errEl) errEl.innerHTML = '' }, 4000)
}
cvLoad()
} catch(e) {
if (errEl) errEl.innerHTML = errBox(e.message)
} finally { } finally {
setBtn('btn-sdl-parsers', false, 'Load SDL Parsers') setBtn('btn-sdl-parsers', false, 'Load SDL Parsers')
} }
} }
async function loadStar() {
setBtn('btn-star', true) function cvToggleMissing(id) {
document.getElementById('cv-err').innerHTML = '' const el = document.getElementById(id)
try { await apiPost('/api/coverage/load-star-rules', {}); cvLoad() } const chevron = document.getElementById(id + '-chevron')
catch(e) { document.getElementById('cv-err').innerHTML = errBox(e.message) } if (!el) return
finally { setBtn('btn-star', false, 'Load Library STAR Rules') } const hidden = el.classList.toggle('hidden')
if (chevron) chevron.textContent = hidden ? '▶' : '▼'
} }
async function cvUploadSigma(files) { async function cvUploadSigma(files) {
@@ -236,7 +346,7 @@ async function cvLoad() {
document.getElementById('cv-table').innerHTML = ` document.getElementById('cv-table').innerHTML = `
<div class="bg-gray-900/50 border border-gray-800 rounded-lg p-6 text-center text-sm text-gray-500"> <div class="bg-gray-900/50 border border-gray-800 rounded-lg p-6 text-center text-sm text-gray-500">
<p class="mb-2">No active sources synced yet.</p> <p class="mb-2">No active sources synced yet.</p>
<p>Click <strong class="text-gray-300">Sync Live Sources</strong> to pull current dataSource.names from the data lake, then <strong class="text-gray-300">Load STAR Rules</strong> and <strong class="text-gray-300">Load SDL Parsers</strong> to see coverage.</p> <p>Click <strong class="text-gray-300">Sync Live Sources</strong> to pull current dataSource.names from the data lake, then <strong class="text-gray-300">Load SDL Parsers</strong> to see coverage.</p>
</div>` </div>`
return return
} }
@@ -286,24 +396,39 @@ function cvSetFilter(f) {
? `<span class="text-emerald-600 text-xs">✓ All fields covered</span>` ? `<span class="text-emerald-600 text-xs">✓ All fields covered</span>`
: `<span class="text-gray-700 text-xs">—</span>` : `<span class="text-gray-700 text-xs">—</span>`
} }
const id = 'mf-' + s.source_name.replace(/[^a-z0-9]/gi, '_')
const chips = s.missing_fields.map(f => const chips = s.missing_fields.map(f =>
`<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded text-xs font-mono text-red-300">${esc(f)}</span>` `<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded text-xs font-mono text-red-300">${esc(f)}</span>`
).join(' ') ).join(' ')
return `<div class="flex flex-wrap gap-1">${chips}</div>` return `<div>
<button onclick="cvToggleMissing('${id}')"
class="flex items-center gap-1.5 text-xs text-red-400 hover:text-red-300 transition-colors">
<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded font-semibold">${s.missing_fields.length}</span>
<span>field${s.missing_fields.length !== 1 ? 's' : ''} missing</span>
<span id="${id}-chevron" class="text-gray-600">▶</span>
</button>
<div id="${id}" class="hidden mt-1.5 flex flex-wrap gap-1">${chips}</div>
</div>`
}
function detectionsCell(s) {
if (s.rule_count) {
return `<span class="text-purple-400 font-medium">${s.rule_count}</span> rule${s.rule_count !== 1 ? 's' : ''}`
}
if (s.close_matches && s.close_matches.length) {
const hints = s.close_matches.map(m =>
`<span class="text-amber-400">${esc(m.library_name)}</span> <span class="text-gray-600">(${m.rule_count} rule${m.rule_count !== 1 ? 's' : ''})</span>`
).join(', ')
return `<span class="text-gray-700">—</span> <span class="text-amber-600 text-xs" title="dataSource.name mismatch?">⚠ similar: ${hints}</span>`
}
return `<span class="text-gray-700">—</span>`
} }
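The `close_matches` entries rendered by `detectionsCell` are computed by the backend, and that logic is not part of this diff. One way such suggestions could be produced is sketched below, assuming fuzzy matching with the standard library's `difflib` plus a substring check. The function name, the cutoff, and the rule counts in the example are all made up for illustration and may differ from what the backend does.

```python
import difflib


def close_matches(source_name: str, library_counts: dict, cutoff: float = 0.6) -> list:
    """Suggest library source names that resemble source_name (hypothetical helper).

    library_counts maps a Detection Library source name to its rule count.
    """
    # Compare case-insensitively, but report the library's original spelling.
    lowered = {name.lower(): name for name in library_counts}
    target = source_name.lower()
    hits = difflib.get_close_matches(target, list(lowered), n=3, cutoff=cutoff)
    # Also accept library names that contain the source name outright.
    for low in lowered:
        if target in low and low not in hits:
            hits.append(low)
    return [
        {"library_name": lowered[h], "rule_count": library_counts[lowered[h]]}
        for h in hits
        if h != target  # an exact match is not a mismatch, so it needs no hint
    ]


# Invented counts, for illustration only.
library = {"AWS CloudTrail": 42, "Microsoft O365": 17, "Okta": 9}
suggestions = close_matches("CloudTrail", library)
print(suggestions)
```

The result uses the same `library_name` and `rule_count` keys that `detectionsCell` reads, so a list like this could be attached to a source row as `close_matches`.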
function parserCell(s) { function parserCell(s) {
if (s.status === 'covered') { if (s.status === 'covered') {
if (s.parser === 'detected in data') { return `<span class="text-emerald-400 font-medium">✓ Parsed</span>`
return `<span class="text-emerald-400">✓ Parsed <span class="text-emerald-700">(${(s.parser_detected||0).toLocaleString()} typed events detected)</span></span>`
}
const detail = s.parser_fields ? ` (${s.parser_fields} fields)` : ''
return `<span class="text-gray-400">${esc(s.parser)}${detail}</span>`
} }
if (s.parser && s.format_type && s.format_type !== 'custom') { return `<span class="text-red-400 font-medium">✗ Not Parsed</span>`
return `<span class="text-amber-400 italic">⚠ ${esc(s.parser)} <span class="text-amber-600">(${esc(s.format_type)} — needs custom parser)</span></span>`
}
return `<span class="text-red-400 italic">⚠ No parser loaded</span>`
} }
document.getElementById('cv-table').innerHTML = sources.length === 0 document.getElementById('cv-table').innerHTML = sources.length === 0
@@ -314,16 +439,20 @@ function cvSetFilter(f) {
<th class="pb-2 pr-4 font-medium">Events (7d)</th> <th class="pb-2 pr-4 font-medium">Events (7d)</th>
<th class="pb-2 pr-4 font-medium">Status</th> <th class="pb-2 pr-4 font-medium">Status</th>
<th class="pb-2 pr-4 font-medium">Parser</th> <th class="pb-2 pr-4 font-medium">Parser</th>
<th class="pb-2 pr-4 font-medium">STAR Rules</th> <th class="pb-2 pr-4 font-medium">Detections</th>
<th class="pb-2 font-medium">Detection Fields Missing</th> <th class="pb-2 font-medium">Fields Missing</th>
</tr></thead> </tr></thead>
<tbody>${sources.map(s => ` <tbody>${sources.map(s => `
<tr class="border-b border-gray-800/50 hover:bg-gray-900/30"> <tr class="border-b border-gray-800/50 hover:bg-gray-900/30">
<td class="py-2 pr-4 font-mono text-xs text-gray-200">${esc(s.source_name)}</td> <td class="py-2 pr-4 font-mono text-xs">
<a href="#/quality" onclick="queueQualitySource('${esc(s.source_name)}')"
class="text-gray-200 hover:text-purple-400 cursor-pointer transition-colors"
title="Open in Parser Quality">${esc(s.source_name)}</a>
</td>
<td class="py-2 pr-4 text-xs text-gray-400">${(s.event_count||0).toLocaleString()}</td> <td class="py-2 pr-4 text-xs text-gray-400">${(s.event_count||0).toLocaleString()}</td>
<td class="py-2 pr-4"><span class="px-2 py-0.5 rounded text-xs border ${STYLES[s.status]||''}">${LABELS[s.status]||s.status}</span></td> <td class="py-2 pr-4"><span class="px-2 py-0.5 rounded text-xs border ${STYLES[s.status]||''}">${LABELS[s.status]||s.status}</span></td>
<td class="py-2 pr-4 text-xs">${parserCell(s)}</td> <td class="py-2 pr-4 text-xs">${parserCell(s)}</td>
<td class="py-2 pr-4 text-xs text-gray-400">${s.rules?.length ? s.rules.map(r=>esc(r.rule)).join(', ') : '—'}</td> <td class="py-2 pr-4 text-xs text-gray-400">${detectionsCell(s)}</td>
<td class="py-2 text-xs">${missingFieldsCell(s)}</td> <td class="py-2 text-xs">${missingFieldsCell(s)}</td>
</tr>`).join('')} </tr>`).join('')}
</tbody></table></div>` </tbody></table></div>`
@@ -749,7 +878,17 @@ function renderQuality() {
<div id="qt-result"></div> <div id="qt-result"></div>
</div> </div>
</div>`) </div>`)
qtLoadParsers() qtLoadParsers().then(() => {
// Pre-select source if navigated from Coverage Map or Overview
if (_pendingQualitySource) {
const src = _pendingQualitySource
_pendingQualitySource = null
const qsSel = document.getElementById('qs-source')
const qpSel = document.getElementById('qp-source')
if (qsSel) qsSel.value = src
if (qpSel) { qpSel.value = src; qpDiscoverFields() }
}
})
} }
// ── Live Event Sampler ───────────────────────────────────────────────────── // ── Live Event Sampler ─────────────────────────────────────────────────────