mirror of
https://github.com/marcredhat/SIEM-toolkit-patched
synced 2026-06-08 20:37:12 +00:00
Auto-load detection library from S1 API, improve coverage map accuracy
- Fetch detection library rules from platform-rules API at startup (falls back to extracted.json); adds Sync Detection Library button for refresh
- Parser column simplified to ✓ Parsed / ✗ Not Parsed
- Detection counts now use library rules only (exclude custom STAR rules)
- Add close-match suggestions for dataSource.name mismatches (e.g. CloudTrail → AWS CloudTrail, Microsoft 365 Collaboration → Microsoft O365)
- Exclude SentinelOne Ranger AD from coverage map (native S1 source)
- Add success feedback banners to Load SDL Parsers and Sync Library buttons
- Remove rule_counts.json manual override; extracted.json is source of truth
- Remove Load Detections button; rules auto-import on backend startup
- Add get_account_id() and get_platform_rules() to s1_client

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -7,3 +7,4 @@ node_modules/
 frontend/out/
 pgdata/
 parsers/*.json
+data/
@@ -10,6 +10,7 @@ A self-hosted troubleshooting and visibility tool for SentinelOne AI-SIEM SecOps
 
 | Page | Purpose |
 |---|---|
+| **Overview** | Live health stats — coverage percentage, active sources, top uncovered sources by volume |
 | **Parser Coverage Map** | Which active data sources have a parser? Which don't? |
 | **Ingest Dashboard** | Event volume, top sources, cost projection, filter simulator |
 | **Parser Quality** | Live event sampler, field population rate, parser test runner |
@@ -26,12 +27,12 @@ browser → nginx (port 3001) → single-page HTML/JS application
 FastAPI backend (port 8001)
 ↓
 ┌───────────────────────────┐
-│ PostgreSQL (SQLAlchemy) │ parsed rules, parser fields, active sources
+│ PostgreSQL (SQLAlchemy) │ parser fields, active sources
 └───────────────────────────┘
 ↓
 ┌───────────────────────────┐
 │ SentinelOne APIs │
-│ • Management API (STAR) │ demo.sentinelone.net
+│ • Management API │ demo.sentinelone.net
 │ • Scalyr XDR PowerQuery │ xdr.us1.sentinelone.net
 └───────────────────────────┘
 ```
@@ -54,16 +55,34 @@ Edit `.env` with your credentials:
 
 ```env
 S1_BASE_URL=https://demo.sentinelone.net # Your console URL
-S1_API_TOKEN=eyJ... # Service user API token
+S1_API_TOKEN=eyJ... # Service user API token (account scope or higher)
 SDL_XDR_URL=https://xdr.us1.sentinelone.net # Scalyr XDR endpoint
 SDL_LOG_READ_KEY=1j2IU0S... # Data Lake read key
-ANTHROPIC_API_KEY= # Optional — Onboarding page only
+ANTHROPIC_API_KEY= # Optional — not currently used
 ```
 
-**S1_API_TOKEN** — generate at *Settings → Users → Service Users* in the console.
+**S1_API_TOKEN** — generate at *Settings → Users → Service Users* in the console. The service user should be provisioned at **account scope** or higher.
 **SDL_LOG_READ_KEY** — found at *Settings → Integrations → Data Lake API Keys*.
 
-### 2. Add Parser Files (optional but strongly recommended)
+### 2. Add the Detection Library (strongly recommended)
+
+The Detection Fields Missing column and per-source detection counts on the Coverage Map require a local detections export. This is generated from the [detection-validator](https://github.com/mickbrowns1/detection-validator) repository.
+
+```bash
+# Clone the detection-validator repo alongside this one
+git clone https://github.com/mickbrowns1/detection-validator.git
+cd detection-validator
+
+# Follow its README to generate the export, then copy the output here:
+mkdir -p ../SIEM-Toolkit/data
+cp data/data/detections/extracted.json ../SIEM-Toolkit/data/detections.json
+
+cd ../SIEM-Toolkit
+```
+
+The `data/` directory is gitignored and never committed. Once the stack is running, click **Load Detections** on the Coverage Map to import the rules into the database.
+
+### 3. Add Parser Files (optional but strongly recommended)
 
 Place your SDL parser JSON files into the `parsers/` directory. The backend reads them directly at query time — no rebuild is necessary.
 
@@ -71,7 +90,7 @@ Place your SDL parser JSON files into the `parsers/` directory. The backend read
 cp ~/my-parsers/*.json parsers/
 ```
 
-### 3. Start the Stack
+### 4. Start the Stack
 
 ```bash
 docker-compose up -d --build
@@ -83,6 +102,18 @@ Open **http://localhost:3001** in your browser and you're off.
 
 ## Features
 
+### Overview Dashboard
+
+The landing page gives you an at-a-glance health summary drawn live from the database:
+
+- **Parser Coverage %** — proportion of active sources with a confirmed parser
+- **Active Sources** — total number of `dataSource.name` values seen in the last 7 days
+- **Covered / Need Parser** — counts for each status
+
+If any sources are uncovered, the **Top Sources Needing a Parser** table lists the highest-volume offenders. Click any source name to jump directly to the Parser Quality page with that source pre-selected.
+
+---
+
 ### Parser Coverage Map
 
 Answers the question: *does each active data source have a parser running?*
@@ -91,7 +122,6 @@ Answers the question: *does each active data source have a parser running?*
 
 1. **Sync Live Sources** — executes a PowerQuery against your data lake to retrieve every `dataSource.name` seen in the last 7 days, along with event counts.
 2. **Load SDL Parsers** — reads parser files from `parsers/`, extracts the `dataSource.name` attribute from each, and stores the field list in the database.
-3. **Load STAR Rules** — retrieves your STAR detection rules from the management API and indexes which data sources each rule references.
 
 **Matching logic (three-tier):**
 1. Exact `dataSource.name` match between the active source and the parser attribute
@@ -104,6 +134,10 @@ Answers the question: *does each active data source have a parser running?*
 - 🟢 **Covered** — custom parser confirmed (local file or detected via parsed events in the data lake)
 - 🔴 **Parser Needed** — no parser found, or only a grok/dottedJson format (which typically indicates an incomplete parser)
 
+**Filters:** Use the filter pills to focus on Custom Parser only, Default Parser Only (data lake detected), or No Parser.
+
+**Deep link:** Click any source name in the table to open it directly in Parser Quality with all dropdowns pre-populated.
+
 **Expected results:** After syncing sources and loading parsers, sources with active SDL parsers will appear as Covered. Sources sending raw, unparsed data — where only `message` and `timestamp` appear in the data lake — will appear as Parser Needed.
 
 ---
@@ -173,8 +207,7 @@ A prompt template for using Claude Code to onboard a new log source. Copy the te
 
 - An SDL parser skeleton in augmented-JSON format
 - Field mappings to the SDL common schema
-- 2–3 starter STAR detection rules
-- 5 parser test assertions
+- Parser test assertions
 
 No Anthropic API key is required — this uses Claude Code directly from your terminal.
 
@@ -222,7 +255,7 @@ curl -X DELETE http://localhost:8001/api/coverage/reset
 │ │ └── settings.py # .env read/write
 │ └── services/
 │ ├── s1_client.py # SentinelOne + Scalyr API client
-│ └── rule_parser.py # SDL/Sigma/STAR field extraction
+│ └── rule_parser.py # SDL format string field extraction
 ├── frontend/
 │ └── index.html # Single-page application (Tailwind, vanilla JS)
 ├── parsers/ # SDL parser files (volume-mounted)
@@ -240,3 +273,4 @@ curl -X DELETE http://localhost:8001/api/coverage/reset
 - The backend queries your **demo tenant** (`demo.sentinelone.net`) — not usea1-purple or any other tenant. Ensure your `S1_BASE_URL` and `SDL_LOG_READ_KEY` are pointed at the same tenant.
 - Parser files in `parsers/` are read at query time, not on startup — add or update files at any point without rebuilding the image.
 - The filter simulator is entirely read-only and makes no changes whatsoever to your tenant configuration.
+- The service user API token must be at **account scope** or higher. Site-scoped tokens will have limited visibility into rules and may see reduced source counts.
+35
-1
@@ -1,6 +1,6 @@
 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
-from db import engine, Base
+from db import engine, Base, get_db, ParsedRule
 from routers import coverage, ingest, settings, quality
 
 Base.metadata.create_all(bind=engine)
@@ -15,6 +15,40 @@ with engine.connect() as _conn:
 
 app = FastAPI(title="SIEM Toolkit", version="1.0.0")
 
+
+@app.on_event("startup")
+async def auto_load_detections():
+    """
+    Auto-load detection library rules on startup.
+    Tries the live S1 API first (accurate 'sources' field); falls back to extracted.json.
+    Skips if rules are already loaded — use the 'Sync Library' button to force a refresh.
+    """
+    import os
+    from sqlalchemy.orm import Session
+    from services import s1_client
+
+    db: Session = next(get_db())
+    try:
+        existing = db.query(ParsedRule).filter_by(rule_type="library").count()
+        if existing > 0:
+            return  # Already loaded — skip until user manually refreshes
+
+        # Try live API first
+        try:
+            rules = await s1_client.get_platform_rules()
+            if rules:
+                coverage._import_from_api_rules(db, rules)
+                return
+        except Exception:
+            pass
+
+        # Fall back to local file
+        detections_file = os.environ.get("DETECTIONS_FILE", "/app/data/detections.json")
+        if os.path.exists(detections_file):
+            coverage._import_detections(db, detections_file)
+    finally:
+        db.close()
+
 app.add_middleware(
     CORSMiddleware,
     allow_origins=["http://localhost:3001"],
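The startup hook above follows an "API first, local file second" pattern. A minimal synchronous sketch of that pattern is shown here for illustration only. The function name `load_rules`, the `fetch_live` callable and the `(rules, source)` return shape are assumptions made for this example and do not appear in the commit; the `results` key mirrors the one read from extracted.json elsewhere in this diff.

```python
import json
import os
from typing import Callable, Optional


def load_rules(fetch_live: Callable[[], list], fallback_path: str) -> tuple[list, Optional[str]]:
    """Return (rules, source), where source is "api", "file", or None.

    Hypothetical helper: tries the live fetcher first, then a local JSON file.
    """
    try:
        rules = fetch_live()
        if rules:
            return rules, "api"
    except Exception:
        # As in the diff, any API failure falls through to the file.
        pass

    if os.path.exists(fallback_path):
        with open(fallback_path, "r", encoding="utf-8") as fh:
            return json.load(fh).get("results", []), "file"

    # Neither source was available.
    return [], None
```

Returning the source alongside the rules lets a caller report which path was taken, much as the `/load-detections` endpoint in this commit returns `"source": "api"` or `"source": "file"`.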
+208
-30
@@ -1,4 +1,5 @@
 import json
+import os
 from fastapi import APIRouter, UploadFile, File, Depends, HTTPException
 from pydantic import BaseModel
 from sqlalchemy.orm import Session
@@ -6,6 +7,8 @@ from datetime import datetime
 from db import get_db, ParsedRule, ParserField, ActiveSource
 from services import s1_client, rule_parser
 
+DETECTIONS_FILE = os.environ.get("DETECTIONS_FILE", "/app/data/detections.json")
+
 router = APIRouter()
 
 
@@ -40,22 +43,12 @@ def _star_query_texts(rule: dict) -> list[str]:
 
 
 @router.post("/load-star-rules")
-async def load_star_rules(library_only: bool = None, db: Session = Depends(get_db)):
-    """Fetch STAR rules from SentinelOne and index their fields.
-    library_only defaults to the STAR_LIBRARY_ONLY env var (default true).
-    Pass ?library_only=false to include custom tenant rules as well.
-    """
-    import os
-    if library_only is None:
-        library_only = os.environ.get("STAR_LIBRARY_ONLY", "true").lower() != "false"
-
+async def load_star_rules(db: Session = Depends(get_db)):
+    """Fetch all STAR rules from the Management Console API and index their fields."""
     try:
         rules = await s1_client.get_star_rules()
     except Exception as e:
-        raise HTTPException(502, f"S1 API error: {e}")
+        raise HTTPException(502, f"S1 API error: {type(e).__name__}: {e}")
 
-    if library_only:
-        rules = [r for r in rules if str(r.get("creator", "")).lower().endswith("@sentinelone.com")]
-
     # Replace all existing STAR rules cleanly to avoid duplicate key errors
     db.query(ParsedRule).filter_by(rule_type="star").delete()
@@ -81,6 +74,118 @@ async def load_star_rules(library_only: bool = None, db: Session = Depends(get_d
     return {"loaded": len(loaded), "rules": loaded}
 
 
+_EXCLUDED_PATHS = ("/rules/silent/", "/rules/dev/")
+
+
+def _import_from_api_rules(db, rules: list) -> int:
+    """
+    Import platform rules fetched directly from the S1 API into the database.
+    Each rule has a 'sources' list — the authoritative dataSource.name values.
+    """
+    db.query(ParsedRule).filter_by(rule_type="library").delete()
+    db.commit()
+
+    loaded = 0
+    seen_ids: set = set()
+    for rule in rules:
+        rule_id = str(rule.get("id", f"lib_{loaded}"))
+        if rule_id in seen_ids:
+            continue
+        seen_ids.add(rule_id)
+
+        sources = rule.get("sources") or []
+        db.add(ParsedRule(
+            rule_id=rule_id,
+            name=rule.get("name", "unnamed"),
+            rule_type="library",
+            fields_used=[],  # API rules don't expose field-level info
+            raw=json.dumps({"data_sources": sources}),
+        ))
+        loaded += 1
+        if loaded % 500 == 0:
+            db.flush()
+
+    db.commit()
+    return loaded
+
+
+def _import_detections(db, detections_file: str) -> int:
+    """
+    Import library detection rules from extracted.json into the database.
+    Replaces any existing library rules. Returns the count of rules loaded.
+    """
+    with open(detections_file, "r", encoding="utf-8") as fh:
+        data = json.load(fh)
+
+    results = data.get("results", [])
+    results = [r for r in results if not any(r.get("file", "").startswith(p) for p in _EXCLUDED_PATHS)]
+
+    db.query(ParsedRule).filter_by(rule_type="library").delete()
+    db.commit()
+
+    loaded = 0
+    seen_ids: set = set()
+    for rule in results:
+        all_fields: set = set()
+        data_sources: list[str] = []
+        for q in rule.get("queries", []):
+            all_fields.update(q.get("keys", []))
+            ds_vals = q.get("pairs", {}).get("dataSource.name", [])
+            for v in ds_vals:
+                if isinstance(v, str):
+                    data_sources.append(v)
+                elif isinstance(v, list):
+                    data_sources.extend(str(x) for x in v)
+
+        rule_id = str(rule.get("id", f"lib_{loaded}"))
+        if rule_id in seen_ids:
+            continue
+        seen_ids.add(rule_id)
+
+        db.add(ParsedRule(
+            rule_id=rule_id,
+            name=rule.get("name", "unnamed"),
+            rule_type="library",
+            fields_used=list(all_fields),
+            raw=json.dumps({"data_sources": list(set(data_sources))}),
+        ))
+        loaded += 1
+        if loaded % 500 == 0:
+            db.flush()
+
+    db.commit()
+    return loaded
+
+
+@router.post("/load-detections")
+async def load_detections(db: Session = Depends(get_db)):
+    """
+    Reload detection library rules.
+    Tries the live S1 API first (platform-rules endpoint); falls back to extracted.json.
+    """
+    # Prefer the live API — gives accurate 'sources' and is always up to date
+    try:
+        rules = await s1_client.get_platform_rules()
+        if rules:
+            loaded = _import_from_api_rules(db, rules)
+            return {"loaded": loaded, "source": "api"}
+    except Exception:
+        pass
+
+    # Fall back to local extracted.json
+    if not os.path.exists(DETECTIONS_FILE):
+        raise HTTPException(
+            404,
+            "S1 API unavailable and no detections file found — "
+            "ensure the data/ volume is mounted with detections.json"
+        )
+    try:
+        loaded = _import_detections(db, DETECTIONS_FILE)
+    except Exception as e:
+        raise HTTPException(500, f"Failed to import detections: {e}")
+    return {"loaded": loaded, "source": "file"}
+
+
 @router.post("/upload-sigma")
 async def upload_sigma(files: list[UploadFile] = File(...), db: Session = Depends(get_db)):
     """Upload one or more Sigma YAML files and index their fields."""
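To make the per-rule extraction in `_import_detections` easier to follow, here is a standalone sketch of that step without the database. The helper name `extract_rule_fields` and the sample rule are invented for illustration; only the `queries`, `keys` and `pairs` keys are taken from the code in this hunk, and the real extracted.json may carry more structure than shown.

```python
def extract_rule_fields(rule: dict) -> tuple[list[str], list[str]]:
    """Collect referenced field names and dataSource.name values from one rule.

    Hypothetical helper mirroring the loop inside _import_detections.
    """
    fields: set[str] = set()
    sources: set[str] = set()
    for q in rule.get("queries", []):
        fields.update(q.get("keys", []))
        for v in q.get("pairs", {}).get("dataSource.name", []):
            if isinstance(v, str):
                sources.add(v)
            elif isinstance(v, list):
                # A nested list of alternatives is flattened into the same set.
                sources.update(str(x) for x in v)
    return sorted(fields), sorted(sources)


# Invented sample in the shape the importer reads.
sample_rule = {
    "id": "example-1",
    "queries": [
        {"keys": ["src.ip", "event.type"],
         "pairs": {"dataSource.name": ["AWS CloudTrail"]}},
        {"keys": ["user.name"],
         "pairs": {"dataSource.name": [["Okta", "AWS CloudTrail"]]}},
    ],
}

fields, sources = extract_rule_fields(sample_rule)
```

Using sets here deduplicates sources as they are collected, whereas the committed code appends to a list and deduplicates with `set()` when serializing; the stored result should be equivalent apart from ordering.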
@@ -216,11 +321,21 @@ async def load_parser_content(payload: ParserContentPayload, db: Session = Depen
     return {"parser": payload.parser_name, "fields": list(fields), "field_count": len(fields)}
 
 
+# Native SentinelOne platform sources — parsed by the system, not by SDL parsers.
+# Excluded from the coverage map as they do not require custom parser coverage.
+_S1_NATIVE_SOURCES = {
+    "SentinelOne", "asset", "alert", "vulnerability",
+    "ActivityFeed", "indicator", "misconfiguration",
+    "SentinelOne Ranger AD",
+}
+
+
 @router.post("/sync-sources")
 async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
     """Pull active dataSource.names from the SDL and store them.
     Also detects whether a parser is already producing structured fields
     for each source by checking if event.type is populated in the data lake.
+    Native S1 platform sources are excluded as they do not require SDL parsers.
     """
     import asyncio
     from datetime import datetime, timedelta
@@ -255,7 +370,7 @@ async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
     seen = 0
     for row in rows:
         name = row.get("dataSource.name")
-        if name:
+        if name and name not in _S1_NATIVE_SOURCES:
             db.add(ActiveSource(
                 source_name=name,
                 event_count=row.get("events", 0),
@@ -264,7 +379,7 @@ async def sync_sources(days: int = 7, db: Session = Depends(get_db)):
             ))
             seen += 1
     db.commit()
-    return {"synced": seen, "sources": [r["dataSource.name"] for r in rows if r.get("dataSource.name")]}
+    return {"synced": seen, "sources": [r["dataSource.name"] for r in rows if r.get("dataSource.name") and r["dataSource.name"] not in _S1_NATIVE_SOURCES]}
 
 
 def _build_parser_ds_index() -> dict[str, dict]:
@@ -367,19 +482,28 @@ def get_coverage_map(db: Session = Depends(get_db)):
     # Build rule index: source_name → rules that reference it
     rule_by_source: dict[str, list] = {}
     for rule in rules:
-        query_texts = _star_query_texts(json.loads(rule.raw)) if rule.rule_type == "star" else []
-        data_sources = rule_parser.extract_data_sources(query_texts)
+        try:
+            raw_data = json.loads(rule.raw) if rule.raw else {}
+        except Exception:
+            raw_data = {}
+
+        if rule.rule_type == "library":
+            # Library rules store pre-extracted data_sources list in raw
+            data_sources = raw_data.get("data_sources", [])
+        else:
+            query_texts = _star_query_texts(raw_data)
+            data_sources = rule_parser.extract_data_sources(query_texts)
+
         for ds in data_sources:
             rule_by_source.setdefault(ds, []).append({"rule": rule.name, "type": rule.rule_type})
-        if not data_sources:
-            # Rule with no explicit source filter — applies to all
-            rule_by_source.setdefault("__any__", []).append({"rule": rule.name, "type": rule.rule_type})
 
     # Fields to ignore when computing "missing" — these are metadata/schema fields
     # always present in events regardless of the parser
     _SCHEMA_FIELDS = {
         "dataSource.name", "dataSource.vendor", "dataSource.category",
         "event.type", "timestamp", "src.endpoint.ip", "src.endpoint.name",
+        # Endpoint agent fields — populated by the SentinelOne agent, not by SDL parsers
+        "cmdScript.content", "endpoint.os", "endpoint.name", "endpoint.uid",
     }
 
     sources_out = []
@@ -414,22 +538,75 @@ def get_coverage_map(db: Session = Depends(get_db)):
         else:
             needed_count += 1
 
-        rules_for_src = rule_by_source.get(src.source_name, []) + rule_by_source.get("__any__", [])
+        rules_for_src: list = [r for r in rule_by_source.get(src.source_name, []) if r["type"] == "library"]
 
-        # Fields all associated rules need, minus schema fields always present
-        rule_fields_needed: set = set()
+        # Close-match suggestions — shown when there are no library rules for this source.
+        close_matches: list = []
+        if not rules_for_src:
+            import re as _re
+
+            def _word_tokens(s: str) -> set:
+                """Split on non-alphanumeric boundaries, lowercase, drop single chars."""
+                return {t for t in _re.split(r"[^a-z0-9]+", s.lower()) if len(t) >= 2}
+
+            def _is_close(a: str, b: str) -> bool:
+                na, nb = _normalize(a), _normalize(b)
+                # 1. Simple substring match
+                if na in nb or nb in na:
+                    return True
+                # 2. Token-level: handles "Microsoft 365 Collaboration" vs "Microsoft O365"
+                # — "365" is inside "o365", and they share "microsoft"
+                ta, tb = _word_tokens(a), _word_tokens(b)
+                shared_exact = ta & tb
+                if not shared_exact:
+                    return False  # Must share at least one word exactly
+                # Check that a DISTINCTIVE (non-shared) token from one name
+                # appears as a substring inside a token from the other.
+                # This avoids matching "Azure AD" to "Azure Platform" on "azure" alone.
+                unique_a = ta - shared_exact
+                unique_b = tb - shared_exact
+                return any(
+                    ua in ub or ub in ua
+                    for ua in unique_a for ub in unique_b
+                    if len(ua) >= 2 and len(ub) >= 2
+                )
+
+            sn = _normalize(src.source_name)
+            for lib_ds, lib_rules in rule_by_source.items():
+                lib_only = [r for r in lib_rules if r["type"] == "library"]
+                if not lib_only:
+                    continue
+                if _is_close(src.source_name, lib_ds):
+                    close_matches.append({
+                        "library_name": lib_ds,
+                        "rule_count": len(lib_only),
+                    })
+            close_matches.sort(key=lambda x: x["rule_count"], reverse=True)
+            close_matches = close_matches[:3]
+
+        # Count how many rules reference each field (frequency)
+        field_freq: dict[str, int] = {}
         for r in rules_for_src:
-            rule_fields_needed |= rule_fields_index.get(r["rule"], set())
-        rule_fields_needed -= _SCHEMA_FIELDS
+            for f in rule_fields_index.get(r["rule"], set()):
+                field_freq[f] = field_freq.get(f, 0) + 1
 
         # Fields the parser provides
         parser_provides = parser_index.get(matched_parser, set()) if matched_parser and matched_parser != "detected in data" else set()
 
-        # Missing = fields rules need that the parser doesn't provide.
-        # Only consider dotted-path fields (e.g. src.ip, winEventLog.channel) —
-        # single-word tokens are typically correlation variables or rule metadata.
-        rule_fields_dotted = {f for f in rule_fields_needed if "." in f}
-        missing_fields = sorted(rule_fields_dotted - parser_provides)
+        # Minimum number of rules that must reference a field before we flag it.
+        # Scales with rule count so single-rule oddities don't dominate.
+        rule_count = len(rules_for_src)
+        min_rules = max(2, round(rule_count * 0.05)) if rule_count >= 10 else 2
+
+        # Missing = dotted-path fields needed by >= min_rules rules,
+        # not in schema constants, not provided by the parser.
+        missing_fields = sorted(
+            f for f, count in field_freq.items()
+            if count >= min_rules
+            and "." in f
+            and f not in _SCHEMA_FIELDS
+            and f not in parser_provides
+        )
 
         sources_out.append({
             "source_name": src.source_name,
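The close-match heuristic in this hunk can be tried in isolation. The sketch below lifts `_word_tokens` and `_is_close` to module level so they run on their own. `_normalize` is not shown anywhere in this diff, so the version here (lowercase, strip everything that is not a letter or digit) is an assumption made for the example; the real helper may behave differently.

```python
import re


def _normalize(s: str) -> str:
    """ASSUMED behavior: lowercase and drop non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def _word_tokens(s: str) -> set:
    """Split on non-alphanumeric boundaries, lowercase, drop single chars."""
    return {t for t in re.split(r"[^a-z0-9]+", s.lower()) if len(t) >= 2}


def is_close(a: str, b: str) -> bool:
    na, nb = _normalize(a), _normalize(b)
    # Tier 1: one normalized name contains the other.
    if na in nb or nb in na:
        return True
    # Tier 2: the names must share a whole word, and a non-shared token
    # from one name must sit inside a non-shared token from the other.
    ta, tb = _word_tokens(a), _word_tokens(b)
    shared = ta & tb
    if not shared:
        return False
    return any(ua in ub or ub in ua for ua in ta - shared for ub in tb - shared)
```

Under that assumed `_normalize`, the two pairs named in the commit message (`CloudTrail` / `AWS CloudTrail` and `Microsoft 365 Collaboration` / `Microsoft O365`) match, while `Azure AD` / `Azure Platform` does not, since "ad" and "platform" do not contain one another.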
@@ -441,6 +618,7 @@ def get_coverage_map(db: Session = Depends(get_db)):
             "parser_detected": src.parser_detected or 0,
             "rules": rules_for_src,
             "rule_count": len(rules_for_src),
+            "close_matches": close_matches,
             "missing_fields": missing_fields,
             "missing_fields_count": len(missing_fields),
             "synced_at": src.synced_at.isoformat() if src.synced_at else None,
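The `missing_fields` value returned above depends on the frequency threshold introduced in this commit. The sketch below reproduces that calculation outside the endpoint. The function names and the sample data are hypothetical; the threshold expression and the four filter conditions are copied from the diff.

```python
from collections import Counter


def min_rules_threshold(rule_count: int) -> int:
    """Same expression as the diff: 5% of the rule count once there are 10+ rules, never below 2."""
    return max(2, round(rule_count * 0.05)) if rule_count >= 10 else 2


def missing_fields(rule_fields: dict, parser_provides: set, schema_fields: set) -> list:
    """rule_fields maps a rule name to the set of fields that rule references."""
    freq = Counter()
    for fields in rule_fields.values():
        freq.update(fields)
    threshold = min_rules_threshold(len(rule_fields))
    return sorted(
        f for f, count in freq.items()
        if count >= threshold
        and "." in f                  # dotted-path fields only
        and f not in schema_fields    # always present, whatever the parser
        and f not in parser_provides  # already produced by the parser
    )


# Invented sample: three rules, a parser that provides user.name only.
example = missing_fields(
    {
        "r1": {"src.ip", "user.name", "event.type"},
        "r2": {"src.ip", "user.name"},
        "r3": {"src.ip", "dst.port", "count"},
    },
    parser_provides={"user.name"},
    schema_fields={"event.type"},
)
```

Note that Python's `round` uses round-half-to-even, so a product that lands exactly on a .5 boundary may round down; the `max(2, ...)` floor keeps that from mattering for small rule counts.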
@@ -15,9 +15,6 @@ FIELDS = [
     {"key": "SDL_XDR_URL", "label": "SDL XDR URL", "secret": False, "placeholder": "https://xdr.us1.sentinelone.net"},
     {"key": "SDL_LOG_READ_KEY", "label": "SDL Log Read Key", "secret": True, "placeholder": "1DnK0Y4e..."},
     {"key": "ANTHROPIC_API_KEY", "label": "Anthropic API Key", "secret": True, "placeholder": "sk-ant-..."},
-    {"key": "STAR_LIBRARY_ONLY", "label": "STAR Rules — Library Only", "secret": False, "placeholder": "true",
-     "type": "select", "options": ["true", "false"],
-     "hint": "true = load only SentinelOne Library rules (@sentinelone.com creators). false = include custom tenant rules as well."},
 ]
 
 FIELD_KEYS = {f["key"] for f in FIELDS}
@@ -24,16 +24,72 @@ def _iso_to_epoch_ms(iso_str: str) -> int:
|
|||||||
return int(dt.timestamp() * 1000)
|
return int(dt.timestamp() * 1000)
|
||||||
|
|
||||||
|
|
||||||
async def get_star_rules(limit: int = 200) -> list:
|
async def get_star_rules(page_size: int = 100) -> list:
|
||||||
"""Fetch active STAR rules from the Management Console API."""
|
"""Fetch custom STAR rules from /cloud-detection/rules, paginating via cursor."""
|
||||||
|
all_rules = []
|
||||||
|
cursor = None
|
||||||
async with httpx.AsyncClient(timeout=30) as client:
|
async with httpx.AsyncClient(timeout=30) as client:
|
||||||
resp = await client.get(
|
while True:
|
||||||
f"{BASE_URL}/web/api/v2.1/cloud-detection/rules",
|
params = {"limit": page_size}
|
||||||
headers=HEADERS,
|
if cursor:
|
||||||
params={"limit": limit},
|
params["cursor"] = cursor
|
||||||
)
|
resp = await client.get(
|
||||||
resp.raise_for_status()
|
f"{BASE_URL}/web/api/v2.1/cloud-detection/rules",
|
||||||
return resp.json().get("data", [])
|
headers=HEADERS,
|
||||||
|
params=params,
|
||||||
|
)
|
||||||
|
resp.raise_for_status()
|
||||||
|
body = resp.json()
|
||||||
|
all_rules.extend(body.get("data", []))
|
||||||
|
cursor = body.get("pagination", {}).get("nextCursor")
|
||||||
|
if not cursor:
|
||||||
|
break
|
||||||
|
return all_rules
|
||||||
|
|
||||||
|
|
||||||
|
async def get_library_rules(page_size: int = 100) -> list:
|
||||||
|
"""
|
||||||
|
Fetch Detection Library (OOTB/Platform) rules from /web/api/v2.1/detection-library/rules.
|
||||||
|
Requires an account-level or higher API token — site-scoped tokens will receive a 400.
|
||||||
|
Returns an empty list gracefully if the token lacks sufficient scope.
|
||||||
|
"""
|
||||||
|
all_rules = []
|
||||||
|
cursor = None
|
||||||
|
async with httpx.AsyncClient(timeout=60) as client:
|
||||||
|
while True:
|
||||||
|
params: dict = {"limit": page_size}
|
||||||
|
if cursor:
|
||||||
|
params["cursor"] = cursor
|
||||||
|
resp = await client.get(
|
||||||
|
f"{BASE_URL}/web/api/v2.1/detection-library/rules",
|
||||||
|
headers=HEADERS,
|
||||||
|
params=params,
|
||||||
|
)
|
||||||
|
# 400 typically means site-scoped token — return empty rather than crash
|
||||||
|
if resp.status_code == 400:
|
||||||
|
return []
|
||||||
|
resp.raise_for_status()
|
||||||
|
body = resp.json()
|
||||||
|
batch = body.get("data", [])
|
||||||
|
all_rules.extend(batch)
|
||||||
|
cursor = body.get("pagination", {}).get("nextCursor")
|
||||||
|
if not cursor:
|
||||||
|
break
|
||||||
|
|
||||||
|
results = []
|
||||||
|
for rule in all_rules:
|
||||||
|
results.append({
|
||||||
|
"id": str(rule.get("id", "")),
|
||||||
|
"name": rule.get("name", "unnamed"),
|
||||||
|
"s1ql": rule.get("s1ql") or rule.get("query", ""),
|
||||||
|
"queryType": rule.get("queryType", "events"),
|
||||||
|
"severity": rule.get("severity", ""),
|
||||||
|
"description": rule.get("description", ""),
|
||||||
|
"gdlRuleId": rule.get("id", ""),
|
||||||
|
"creator": "SentinelOne",
|
||||||
|
"expirationMode": rule.get("expirationMode", "Permanent"),
|
||||||
|
})
|
||||||
|
return results
|
||||||
|
|
||||||
|
|
||||||
async def run_powerquery(query: str, from_date: str, to_date: str) -> dict:
|
async def run_powerquery(query: str, from_date: str, to_date: str) -> dict:
|
||||||
@@ -124,6 +180,55 @@ async def get_sdl_parser(filename: str) -> dict:
|
|||||||
return resp.json()
|
return resp.json()
|
||||||
|
|
||||||
|
|
||||||
|
async def get_account_id() -> str | None:
|
||||||
|
"""Return the first account ID visible to the current token."""
|
||||||
|
async with httpx.AsyncClient(timeout=15) as client:
|
||||||
|
resp = await client.get(
|
||||||
|
f"{BASE_URL}/web/api/v2.1/accounts",
|
||||||
|
headers=HEADERS,
|
||||||
|
params={"limit": 1},
|
||||||
|
)
|
||||||
|
resp.raise_for_status()
|
||||||
|
accounts = resp.json().get("data", [])
|
||||||
|
return str(accounts[0]["id"]) if accounts else None
|
||||||
|
|
||||||
|
|
||||||
|
async def get_platform_rules(page_size: int = 1000) -> list:
|
||||||
|
"""
|
||||||
|
Fetch all Detection Library platform rules from /detection-library/platform-rules.
|
||||||
|
Requires scopeLevel + scopeId — uses account scope with the first visible account.
|
||||||
|
Returns list of rules, each with a 'sources' list (authoritative data source names).
|
||||||
|
"""
|
||||||
|
account_id = await get_account_id()
|
||||||
|
if not account_id:
|
||||||
|
return []
|
||||||
|
|
||||||
|
all_rules: list = []
|
||||||
|
cursor: str = ""
|
||||||
|
async with httpx.AsyncClient(timeout=60) as client:
|
||||||
|
while True:
|
||||||
|
params: dict = {
|
||||||
|
"scopeLevel": "account",
|
||||||
|
"scopeId": account_id,
|
||||||
|
"limit": page_size,
|
||||||
|
"cursor": cursor,
|
||||||
|
}
|
||||||
|
resp = await client.get(
|
||||||
|
f"{BASE_URL}/web/api/v2.1/detection-library/platform-rules",
|
||||||
|
headers=HEADERS,
|
||||||
|
params=params,
|
||||||
|
)
|
||||||
|
if resp.status_code == 400:
|
||||||
|
return []
|
||||||
|
resp.raise_for_status()
|
||||||
|
body = resp.json()
|
||||||
|
all_rules.extend(body.get("data", []))
|
||||||
|
cursor = body.get("pagination", {}).get("nextCursor") or ""
|
||||||
|
if not cursor:
|
||||||
|
break
|
||||||
|
return all_rules
|
||||||
|
|
||||||
|
|
||||||
async def get_sites() -> list:
|
async def get_sites() -> list:
|
||||||
async with httpx.AsyncClient(timeout=30) as client:
|
async with httpx.AsyncClient(timeout=30) as client:
|
||||||
resp = await client.get(
|
resp = await client.get(
|
||||||
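The three fetchers added in this file share one cursor loop: request a page, collect `data`, read `pagination.nextCursor`, and stop when it is empty. The following is a minimal synchronous sketch of that loop with the HTTP call replaced by an injected function, so it runs without a network. `paginate`, `fake_fetch`, and the in-memory `PAGES` are hypothetical names used only for illustration; the commit itself uses `httpx.AsyncClient`.

```python
from typing import Callable, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> list:
    """Collect every item from a cursor-paginated endpoint.

    `fetch_page` takes the current cursor (None for the first page) and
    returns a decoded JSON body shaped like the responses in the diff:
    {"data": [...], "pagination": {"nextCursor": ...}}.
    """
    items: list = []
    cursor: Optional[str] = None
    while True:
        body = fetch_page(cursor)
        items.extend(body.get("data", []))
        # `or {}` also covers a body where "pagination" is present but null.
        cursor = (body.get("pagination") or {}).get("nextCursor")
        if not cursor:
            return items


# Hypothetical two-page response set standing in for the API.
PAGES = {
    None: {"data": [{"id": 1}, {"id": 2}], "pagination": {"nextCursor": "abc"}},
    "abc": {"data": [{"id": 3}], "pagination": {"nextCursor": None}},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return PAGES[cursor]


all_items = paginate(fake_fetch)
```

Note that `body.get("pagination", {})`, as written in the diff, only falls back to `{}` when the key is absent. If an endpoint ever returned `"pagination": null`, the chained `.get` would raise, which is why the sketch uses `or {}`.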
|
|||||||
@@ -17,12 +17,14 @@ services:
 - SDL_LOG_READ_KEY=${SDL_LOG_READ_KEY}
 - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
 - DATABASE_URL=postgresql://siem:siem@db:5432/siem
+- DETECTIONS_FILE=/app/data/detections.json
 depends_on:
 db:
 condition: service_healthy
 volumes:
 - ./parsers:/app/parsers
 - ./.env:/app/.env
+- ./data:/app/data:ro
 
 db:
 image: postgres:16-alpine
|
|||||||
+175
-36
@@ -116,17 +116,98 @@ function barChart(rows, labelKey, valueKey) {
|
|||||||
|
|
||||||
function renderHome() {
|
function renderHome() {
|
||||||
set(`<div class="p-8 max-w-5xl">
|
set(`<div class="p-8 max-w-5xl">
|
||||||
<div class="mb-8">
|
<div class="mb-6">
|
||||||
<h1 class="text-2xl font-bold text-white">SIEM Engineering Toolkit</h1>
|
<h1 class="text-2xl font-bold text-white">SIEM Engineering Toolkit</h1>
|
||||||
<p class="text-gray-400 mt-1">SentinelOne AI-SIEM · demo.sentinelone.net</p>
|
<p class="text-gray-400 mt-1">SentinelOne AI-SIEM · demo.sentinelone.net</p>
|
||||||
</div>
|
</div>
|
||||||
<div class="grid grid-cols-1 md:grid-cols-3 gap-5">
|
<div id="home-stats" class="grid grid-cols-2 md:grid-cols-4 gap-4 mb-8">
|
||||||
${homeCard('#/coverage','Parser Coverage Map','Cross-reference SDL parser fields against STAR and Sigma rule fields. Surface parsed-but-unused fields as reduction candidates.','Open Coverage Map','from-purple-700 to-purple-900')}
|
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
|
||||||
${homeCard('#/ingest','Ingest Dashboard','Visualize event volume by source and type. Project monthly GB costs and simulate exclusion filters before applying them.','Open Dashboard','from-blue-700 to-blue-900')}
|
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
|
||||||
${homeCard('#/quality','Parser Quality','Sample live events to see which fields landed. Measure field population rates and test parser patterns against raw log lines.','Open Quality Tools','from-amber-700 to-amber-900')}
|
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
|
||||||
${homeCard('#/onboarding','Onboarding Accelerator','Step-by-step guide for onboarding a new log source using Claude Code directly — no API key required.','View Guide','from-emerald-700 to-emerald-900')}
|
</div>
|
||||||
|
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
|
||||||
|
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
|
||||||
|
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
|
||||||
|
</div>
|
||||||
|
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
|
||||||
|
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
|
||||||
|
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
|
||||||
|
</div>
|
||||||
|
<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center animate-pulse">
|
||||||
|
<div class="h-7 w-16 bg-gray-800 rounded mx-auto mb-1"></div>
|
||||||
|
<div class="h-3 w-20 bg-gray-800 rounded mx-auto"></div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div id="home-uncovered" class="hidden mb-8"></div>
|
||||||
|
<div class="grid grid-cols-1 md:grid-cols-2 gap-5">
|
||||||
|
${homeCard('#/coverage','Parser Coverage Map','See which active data sources have a parser running and which need one.','Open Coverage Map','from-purple-700 to-purple-900')}
|
||||||
|
${homeCard('#/ingest','Ingest Dashboard','Visualize event volume by source and type. Simulate exclusion filters before applying them.','Open Dashboard','from-blue-700 to-blue-900')}
|
||||||
|
${homeCard('#/quality','Parser Quality','Sample live events, measure field population rates, and test parser patterns against raw log lines.','Open Quality Tools','from-amber-700 to-amber-900')}
|
||||||
|
${homeCard('#/onboarding','Onboarding Accelerator','Step-by-step guide for onboarding a new log source using Claude Code directly.','View Guide','from-emerald-700 to-emerald-900')}
|
||||||
</div>
|
</div>
|
||||||
</div>`)
|
</div>`)
|
||||||
|
homeLoadStats()
|
||||||
|
}
|
||||||
|
|
||||||
|
async function homeLoadStats() {
|
||||||
|
try {
|
||||||
|
const r = await apiGet('/api/coverage/map')
|
||||||
|
const sources = r.sources || []
|
||||||
|
const total = sources.length
|
||||||
|
const covered = sources.filter(s => s.status === 'covered').length
|
||||||
|
const needed = sources.filter(s => s.status === 'parser_needed').length
|
||||||
|
const pct = total ? Math.round(covered / total * 100) : 0
|
||||||
|
const pctColor = pct >= 80 ? 'text-emerald-400' : pct >= 50 ? 'text-amber-400' : 'text-red-400'
|
||||||
|
|
||||||
|
document.getElementById('home-stats').innerHTML = `
|
||||||
|
${homeStat(pct + '%', 'Parser Coverage', pctColor)}
|
||||||
|
${homeStat(total.toLocaleString(), 'Active Sources', 'text-blue-400')}
|
||||||
|
${homeStat(covered.toLocaleString(), 'Covered', 'text-emerald-400')}
|
||||||
|
${homeStat(needed.toLocaleString(), 'Need Parser', needed > 0 ? 'text-red-400' : 'text-gray-500')}`
|
||||||
|
|
||||||
|
// Top uncovered sources by volume
|
||||||
|
const uncovered = sources
|
||||||
|
.filter(s => s.status === 'parser_needed')
|
||||||
|
.sort((a, b) => (b.event_count || 0) - (a.event_count || 0))
|
||||||
|
.slice(0, 5)
|
||||||
|
|
||||||
|
if (uncovered.length) {
|
||||||
|
const rows = uncovered.map(s => `
|
||||||
|
<tr class="border-b border-gray-800/50">
|
||||||
|
<td class="py-2 pr-4 font-mono text-xs text-gray-200">
|
||||||
|
<a href="#/quality" onclick="queueQualitySource('${esc(s.source_name)}')" class="hover:text-purple-400 cursor-pointer">${esc(s.source_name)}</a>
|
||||||
|
</td>
|
||||||
|
<td class="py-2 text-xs text-gray-400">${(s.event_count || 0).toLocaleString()} events</td>
|
||||||
|
</tr>`).join('')
|
||||||
|
|
||||||
|
document.getElementById('home-uncovered').classList.remove('hidden')
|
||||||
|
document.getElementById('home-uncovered').innerHTML = `
|
||||||
|
<div class="bg-gray-900 border border-red-900/40 rounded-xl p-5">
|
||||||
|
<h2 class="text-sm font-semibold text-white mb-1">Top Sources Needing a Parser</h2>
|
||||||
|
<p class="text-xs text-gray-500 mb-3">Highest-volume sources with no parser running — click to inspect in Parser Quality.</p>
|
||||||
|
<table class="w-full">
|
||||||
|
<thead><tr class="text-left text-gray-500 border-b border-gray-800">
|
||||||
|
<th class="pb-2 pr-4 text-xs font-medium">Source</th>
|
||||||
|
<th class="pb-2 text-xs font-medium">Volume</th>
|
||||||
|
</tr></thead>
|
||||||
|
<tbody>${rows}</tbody>
|
||||||
|
</table>
|
||||||
|
</div>`
|
||||||
|
}
|
||||||
|
} catch(e) {
|
||||||
|
document.getElementById('home-stats').innerHTML = `
|
||||||
|
${homeStat('—', 'Parser Coverage', 'text-gray-600')}
|
||||||
|
${homeStat('—', 'Active Sources', 'text-gray-600')}
|
||||||
|
${homeStat('—', 'Covered', 'text-gray-600')}
|
||||||
|
${homeStat('—', 'Need Parser', 'text-gray-600')}`
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function homeStat(value, label, valueClass) {
|
||||||
|
return `<div class="bg-gray-900 border border-gray-800 rounded-xl p-4 text-center">
|
||||||
|
<div class="text-2xl font-bold ${valueClass} mb-1">${value}</div>
|
||||||
|
<div class="text-xs text-gray-500">${label}</div>
|
||||||
|
</div>`
|
||||||
}
|
}
|
||||||
|
|
||||||
function homeCard(href, title, desc, cta, grad) {
|
function homeCard(href, title, desc, cta, grad) {
|
||||||
@@ -138,6 +219,12 @@ function homeCard(href, title, desc, cta, grad) {
|
|||||||
</div>`
|
</div>`
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Queue a source to be pre-selected when Quality page loads
|
||||||
|
let _pendingQualitySource = null
|
||||||
|
function queueQualitySource(source) {
|
||||||
|
_pendingQualitySource = source
|
||||||
|
}
|
||||||
|
|
||||||
// ── Coverage ──────────────────────────────────────────────────────────────
|
// ── Coverage ──────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
let cvFilter = 'all', cvData = null
|
let cvFilter = 'all', cvData = null
|
||||||
@@ -151,7 +238,7 @@ function renderCoverage() {
|
|||||||
</div>
|
</div>
|
||||||
<div class="flex gap-2 flex-wrap justify-end">
|
<div class="flex gap-2 flex-wrap justify-end">
|
||||||
<button id="btn-sync" onclick="cvSyncSources()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Live Sources</button>
|
<button id="btn-sync" onclick="cvSyncSources()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Live Sources</button>
|
||||||
<button id="btn-star" onclick="loadStar()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load Library STAR Rules</button>
|
<button id="btn-sync-library" onclick="syncLibrary()" class="px-3 py-1.5 text-sm bg-blue-700 hover:bg-blue-600 rounded-lg text-white">Sync Detection Library</button>
|
||||||
<button id="btn-sdl-parsers" onclick="loadSDLParsers()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load SDL Parsers</button>
|
<button id="btn-sdl-parsers" onclick="loadSDLParsers()" class="px-3 py-1.5 text-sm bg-purple-700 hover:bg-purple-600 rounded-lg text-white">Load SDL Parsers</button>
|
||||||
<button onclick="document.getElementById('f-parser').click()" class="px-3 py-1.5 text-sm bg-gray-700 hover:bg-gray-600 rounded-lg text-white">Upload Parser</button>
|
<button onclick="document.getElementById('f-parser').click()" class="px-3 py-1.5 text-sm bg-gray-700 hover:bg-gray-600 rounded-lg text-white">Upload Parser</button>
|
||||||
<button onclick="cvReset()" class="px-3 py-1.5 text-sm bg-red-900/60 hover:bg-red-800 rounded-lg text-red-300">Reset</button>
|
<button onclick="cvReset()" class="px-3 py-1.5 text-sm bg-red-900/60 hover:bg-red-800 rounded-lg text-red-300">Reset</button>
|
||||||
@@ -166,28 +253,51 @@ function renderCoverage() {
|
|||||||
cvLoad()
|
cvLoad()
|
||||||
}
|
}
|
||||||
|
|
||||||
async function loadSDLParsers() {
|
async function syncLibrary() {
|
||||||
setBtn('btn-sdl-parsers', true)
|
setBtn('btn-sync-library', true)
|
||||||
document.getElementById('cv-err').innerHTML = ''
|
const errEl = document.getElementById('cv-err')
|
||||||
|
if (errEl) errEl.innerHTML = ''
|
||||||
try {
|
try {
|
||||||
const res = await apiPost('/api/coverage/load-parsers-from-sdl', {})
|
const r = await apiPost('/api/coverage/load-detections', {})
|
||||||
if (res.errors?.length) {
|
if (errEl) {
|
||||||
document.getElementById('cv-err').innerHTML = errBox(`${res.errors.length} parser(s) failed to load: ${res.errors.map(e=>e.parser).join(', ')}`)
|
errEl.innerHTML = `<div class="p-3 bg-emerald-900/40 border border-emerald-700 rounded-lg text-sm text-emerald-300 mb-4">✓ ${r.loaded} detection rules synced from ${r.source === 'api' ? 'S1 API' : 'local file'}</div>`
|
||||||
|
setTimeout(() => { errEl.innerHTML = '' }, 4000)
|
||||||
}
|
}
|
||||||
cvLoad()
|
cvLoad()
|
||||||
} catch(e) {
|
} catch(e) {
|
||||||
document.getElementById('cv-err').innerHTML = errBox(e.message)
|
if (errEl) errEl.innerHTML = errBox(e.message)
|
||||||
|
} finally { setBtn('btn-sync-library', false, 'Sync Detection Library') }
|
||||||
|
}
|
||||||
|
|
||||||
|
async function loadSDLParsers() {
|
||||||
|
setBtn('btn-sdl-parsers', true)
|
||||||
|
const errEl = document.getElementById('cv-err')
|
||||||
|
if (errEl) errEl.innerHTML = ''
|
||||||
|
try {
|
||||||
|
const res = await apiPost('/api/coverage/load-parsers-from-sdl', {})
|
||||||
|
let msg = `✓ ${res.loaded} parser${res.loaded !== 1 ? 's' : ''} loaded`
|
||||||
|
if (res.errors?.length) {
|
||||||
|
msg += ` — ${res.errors.length} failed: ${res.errors.map(e=>e.parser).join(', ')}`
|
||||||
|
if (errEl) errEl.innerHTML = errBox(msg)
|
||||||
|
} else {
|
||||||
|
if (errEl) errEl.innerHTML = `<div class="p-3 bg-emerald-900/40 border border-emerald-700 rounded-lg text-sm text-emerald-300 mb-4">${msg}</div>`
|
||||||
|
setTimeout(() => { if (errEl) errEl.innerHTML = '' }, 4000)
|
||||||
|
}
|
||||||
|
cvLoad()
|
||||||
|
} catch(e) {
|
||||||
|
if (errEl) errEl.innerHTML = errBox(e.message)
|
||||||
} finally {
|
} finally {
|
||||||
setBtn('btn-sdl-parsers', false, 'Load SDL Parsers')
|
setBtn('btn-sdl-parsers', false, 'Load SDL Parsers')
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
async function loadStar() {
|
|
||||||
setBtn('btn-star', true)
|
function cvToggleMissing(id) {
|
||||||
document.getElementById('cv-err').innerHTML = ''
|
const el = document.getElementById(id)
|
||||||
try { await apiPost('/api/coverage/load-star-rules', {}); cvLoad() }
|
const chevron = document.getElementById(id + '-chevron')
|
||||||
catch(e) { document.getElementById('cv-err').innerHTML = errBox(e.message) }
|
if (!el) return
|
||||||
finally { setBtn('btn-star', false, 'Load Library STAR Rules') }
|
const open = el.classList.toggle('hidden')
|
||||||
|
if (chevron) chevron.textContent = open ? '▶' : '▼'
|
||||||
}
|
}
|
||||||
|
|
||||||
async function cvUploadSigma(files) {
|
async function cvUploadSigma(files) {
|
||||||
@@ -236,7 +346,7 @@ async function cvLoad() {
|
|||||||
document.getElementById('cv-table').innerHTML = `
|
document.getElementById('cv-table').innerHTML = `
|
||||||
<div class="bg-gray-900/50 border border-gray-800 rounded-lg p-6 text-center text-sm text-gray-500">
|
<div class="bg-gray-900/50 border border-gray-800 rounded-lg p-6 text-center text-sm text-gray-500">
|
||||||
<p class="mb-2">No active sources synced yet.</p>
|
<p class="mb-2">No active sources synced yet.</p>
|
||||||
<p>Click <strong class="text-gray-300">Sync Live Sources</strong> to pull current dataSource.names from the data lake, then <strong class="text-gray-300">Load STAR Rules</strong> and <strong class="text-gray-300">Load SDL Parsers</strong> to see coverage.</p>
|
<p>Click <strong class="text-gray-300">Sync Live Sources</strong> to pull current dataSource.names from the data lake, then <strong class="text-gray-300">Load SDL Parsers</strong> to see coverage.</p>
|
||||||
</div>`
|
</div>`
|
||||||
return
|
return
|
||||||
}
|
}
|
||||||
@@ -286,24 +396,39 @@ function cvSetFilter(f) {
|
|||||||
? `<span class="text-emerald-600 text-xs">✓ All fields covered</span>`
|
? `<span class="text-emerald-600 text-xs">✓ All fields covered</span>`
|
||||||
: `<span class="text-gray-700 text-xs">—</span>`
|
: `<span class="text-gray-700 text-xs">—</span>`
|
||||||
}
|
}
|
||||||
|
const id = 'mf-' + s.source_name.replace(/[^a-z0-9]/gi, '_')
|
||||||
const chips = s.missing_fields.map(f =>
|
const chips = s.missing_fields.map(f =>
|
||||||
`<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded text-xs font-mono text-red-300">${esc(f)}</span>`
|
`<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded text-xs font-mono text-red-300">${esc(f)}</span>`
|
||||||
).join(' ')
|
).join(' ')
|
||||||
return `<div class="flex flex-wrap gap-1">${chips}</div>`
|
return `<div>
|
||||||
|
<button onclick="cvToggleMissing('${id}')"
|
||||||
|
class="flex items-center gap-1.5 text-xs text-red-400 hover:text-red-300 transition-colors">
|
||||||
|
<span class="px-1.5 py-0.5 bg-red-900/40 border border-red-800/60 rounded font-semibold">${s.missing_fields.length}</span>
|
||||||
|
<span>field${s.missing_fields.length !== 1 ? 's' : ''} missing</span>
|
||||||
|
<span id="${id}-chevron" class="text-gray-600">▶</span>
|
||||||
|
</button>
|
||||||
|
<div id="${id}" class="hidden mt-1.5 flex flex-wrap gap-1">${chips}</div>
|
||||||
|
</div>`
|
||||||
|
}
|
||||||
|
|
||||||
|
function detectionsCell(s) {
|
||||||
|
if (s.rule_count) {
|
||||||
|
return `<span class="text-purple-400 font-medium">${s.rule_count}</span> rule${s.rule_count !== 1 ? 's' : ''}`
|
||||||
|
}
|
||||||
|
if (s.close_matches && s.close_matches.length) {
|
||||||
|
const hints = s.close_matches.map(m =>
|
||||||
|
`<span class="text-amber-400">${esc(m.library_name)}</span> <span class="text-gray-600">(${m.rule_count} rules)</span>`
|
||||||
|
).join(', ')
|
||||||
|
return `<span class="text-gray-700">—</span> <span class="text-amber-600 text-xs" title="dataSource.name mismatch?">⚠ similar: ${hints}</span>`
|
||||||
|
}
|
||||||
|
return `<span class="text-gray-700">—</span>`
|
||||||
}
|
}
|
||||||
|
|
||||||
function parserCell(s) {
|
function parserCell(s) {
|
||||||
if (s.status === 'covered') {
|
if (s.status === 'covered') {
|
||||||
if (s.parser === 'detected in data') {
|
return `<span class="text-emerald-400 font-medium">✓ Parsed</span>`
|
||||||
return `<span class="text-emerald-400">✓ Parsed <span class="text-emerald-700">(${(s.parser_detected||0).toLocaleString()} typed events detected)</span></span>`
|
|
||||||
}
|
|
||||||
const detail = s.parser_fields ? ` (${s.parser_fields} fields)` : ''
|
|
||||||
return `<span class="text-gray-400">${esc(s.parser)}${detail}</span>`
|
|
||||||
}
|
}
|
||||||
if (s.parser && s.format_type && s.format_type !== 'custom') {
|
return `<span class="text-red-400 font-medium">✗ Not Parsed</span>`
|
||||||
return `<span class="text-amber-400 italic">⚠ ${esc(s.parser)} <span class="text-amber-600">(${esc(s.format_type)} — needs custom parser)</span></span>`
|
|
||||||
}
|
|
||||||
return `<span class="text-red-400 italic">⚠ No parser loaded</span>`
|
|
||||||
}
|
}
|
||||||
|
|
||||||
document.getElementById('cv-table').innerHTML = sources.length === 0
|
document.getElementById('cv-table').innerHTML = sources.length === 0
|
||||||
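`detectionsCell` renders `close_matches` entries that carry a `library_name` and a `rule_count`, but the backend code that produces them is not part of the hunks shown here. The following is one way such suggestions could be computed, sketched with the standard-library `difflib`. It is an assumption for illustration, not the commit's implementation; the function name, the cutoff, and the sample rule counts are all hypothetical.

```python
import difflib


def suggest_close_matches(source_name, library_counts, cutoff=0.6, n=3):
    """Suggest library data source names that resemble `source_name`.

    `library_counts` maps a library data source name to its rule count.
    Comparison is case-insensitive; results use the original casing and
    the same keys the frontend reads (library_name, rule_count).
    """
    by_lower = {name.lower(): name for name in library_counts}
    hits = difflib.get_close_matches(
        source_name.lower(), list(by_lower), n=n, cutoff=cutoff
    )
    return [
        {"library_name": by_lower[h], "rule_count": library_counts[by_lower[h]]}
        for h in hits
    ]


# Hypothetical library contents; the counts are made up for the example.
library = {"AWS CloudTrail": 42, "Okta": 10}
suggestions = suggest_close_matches("CloudTrail", library)
```

A similarity ratio is only a heuristic. Short or generic names can produce false suggestions, which fits the frontend's choice to show them as a warning hint rather than merging the rule counts.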
@@ -314,16 +439,20 @@ function cvSetFilter(f) {
|
|||||||
<th class="pb-2 pr-4 font-medium">Events (7d)</th>
|
<th class="pb-2 pr-4 font-medium">Events (7d)</th>
|
||||||
<th class="pb-2 pr-4 font-medium">Status</th>
|
<th class="pb-2 pr-4 font-medium">Status</th>
|
||||||
<th class="pb-2 pr-4 font-medium">Parser</th>
|
<th class="pb-2 pr-4 font-medium">Parser</th>
|
||||||
<th class="pb-2 pr-4 font-medium">STAR Rules</th>
|
<th class="pb-2 pr-4 font-medium">Detections</th>
|
||||||
<th class="pb-2 font-medium">Detection Fields Missing</th>
|
<th class="pb-2 font-medium">Fields Missing</th>
|
||||||
</tr></thead>
|
</tr></thead>
|
||||||
<tbody>${sources.map(s => `
|
<tbody>${sources.map(s => `
|
||||||
<tr class="border-b border-gray-800/50 hover:bg-gray-900/30">
|
<tr class="border-b border-gray-800/50 hover:bg-gray-900/30">
|
||||||
<td class="py-2 pr-4 font-mono text-xs text-gray-200">${esc(s.source_name)}</td>
|
<td class="py-2 pr-4 font-mono text-xs">
|
||||||
|
<a href="#/quality" onclick="queueQualitySource('${esc(s.source_name)}')"
|
||||||
|
class="text-gray-200 hover:text-purple-400 cursor-pointer transition-colors"
|
||||||
|
title="Open in Parser Quality">${esc(s.source_name)}</a>
|
||||||
|
</td>
|
||||||
<td class="py-2 pr-4 text-xs text-gray-400">${(s.event_count||0).toLocaleString()}</td>
|
<td class="py-2 pr-4 text-xs text-gray-400">${(s.event_count||0).toLocaleString()}</td>
|
||||||
<td class="py-2 pr-4"><span class="px-2 py-0.5 rounded text-xs border ${STYLES[s.status]||''}">${LABELS[s.status]||s.status}</span></td>
|
<td class="py-2 pr-4"><span class="px-2 py-0.5 rounded text-xs border ${STYLES[s.status]||''}">${LABELS[s.status]||s.status}</span></td>
|
||||||
<td class="py-2 pr-4 text-xs">${parserCell(s)}</td>
|
<td class="py-2 pr-4 text-xs">${parserCell(s)}</td>
|
||||||
<td class="py-2 pr-4 text-xs text-gray-400">${s.rules?.length ? s.rules.map(r=>esc(r.rule)).join(', ') : '—'}</td>
|
<td class="py-2 pr-4 text-xs text-gray-400">${detectionsCell(s)}</td>
|
||||||
<td class="py-2 text-xs">${missingFieldsCell(s)}</td>
|
<td class="py-2 text-xs">${missingFieldsCell(s)}</td>
|
||||||
</tr>`).join('')}
|
</tr>`).join('')}
|
||||||
</tbody></table></div>`
|
</tbody></table></div>`
|
||||||
@@ -749,7 +878,17 @@ function renderQuality() {
|
|||||||
<div id="qt-result"></div>
|
<div id="qt-result"></div>
|
||||||
</div>
|
</div>
|
||||||
</div>`)
|
</div>`)
|
||||||
qtLoadParsers()
|
qtLoadParsers().then(() => {
|
||||||
|
// Pre-select source if navigated from Coverage Map or Overview
|
||||||
|
if (_pendingQualitySource) {
|
||||||
|
const src = _pendingQualitySource
|
||||||
|
_pendingQualitySource = null
|
||||||
|
const qsSel = document.getElementById('qs-source')
|
||||||
|
const qpSel = document.getElementById('qp-source')
|
||||||
|
if (qsSel) qsSel.value = src
|
||||||
|
if (qpSel) { qpSel.value = src; qpDiscoverFields() }
|
||||||
|
}
|
||||||
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
// ── Live Event Sampler ─────────────────────────────────────────────────────
|
// ── Live Event Sampler ─────────────────────────────────────────────────────
|
||||||
|
|||||||