mirror of https://github.com/nox-project/nox-framework.git, synced 2026-06-08 16:07:17 +00:00
maint: patch sources, engine hardening, proxy update for v1.0.1
@@ -17,7 +17,7 @@
|
|||||||
[](https://www.kali.org/)
|
[](https://www.kali.org/)
|
||||||
[](https://blackarch.org/)
|
[](https://blackarch.org/)
|
||||||
[](https://github.com/nox-project/nox-framework)
|
[](https://github.com/nox-project/nox-framework)
|
||||||
[](https://github.com/nox-project/nox-framework)
|
[](https://github.com/nox-project/nox-framework)
|
||||||
|
|
||||||
*OSINT framework for red teaming, digital forensics, and corporate exposure analysis.*
|
*OSINT framework for red teaming, digital forensics, and corporate exposure analysis.*
|
||||||
|
|
||||||
@@ -31,7 +31,7 @@ NOX is a purpose-built cyber threat intelligence engine designed for operators w
|
|||||||
|
|
||||||
| Capability | Detail |
|
| Capability | Detail |
|
||||||
|-|-|
|
|-|-|
|
||||||
| ⚡ **Async Execution Engine** | Massively parallel scanning across 126 intelligence feeds with no sequential bottlenecks and no blocking I/O. |
|
| ⚡ **Async Execution Engine** | Massively parallel scanning across 123 intelligence feeds with no sequential bottlenecks and no blocking I/O. |
|
||||||
| 🛡️ **Guardian Engine** | Integrated OPSEC layer with automatic proxy rotation and SOCKS5 support. Fail-safe kill-switch halts all traffic if the transport circuit is unavailable. |
|
| 🛡️ **Guardian Engine** | Integrated OPSEC layer with automatic proxy rotation and SOCKS5 support. Fail-safe kill-switch halts all traffic if the transport circuit is unavailable. |
|
||||||
| 🧠 **Risk Scoring** | Dynamic 0–100 scoring with time-decay, source confidence weighting, password complexity analysis, persistence multipliers, and HVT detection. |
|
| 🧠 **Risk Scoring** | Dynamic 0–100 scoring with time-decay, source confidence weighting, password complexity analysis, persistence multipliers, and HVT detection. |
|
||||||
| 🔗 **Recursive Avalanche Engine** | Every discovered asset — username, email, cracked password, phone — is automatically re-injected as a new scan seed. Per-asset pipeline runs sequentially (breach → crack → dork → scrape); child assets run concurrently. Identifiers from all four phases feed the pivot queue. Global deduplication and configurable depth cap prevent runaway recursion. |
|
| 🔗 **Recursive Avalanche Engine** | Every discovered asset — username, email, cracked password, phone — is automatically re-injected as a new scan seed. Per-asset pipeline runs sequentially (breach → crack → dork → scrape); child assets run concurrently. Identifiers from all four phases feed the pivot queue. Global deduplication and configurable depth cap prevent runaway recursion. |
|
||||||
@@ -43,7 +43,7 @@ NOX is a purpose-built cyber threat intelligence engine designed for operators w
|
|||||||
|
|
||||||
| Feature | Description |
|
| Feature | Description |
|
||||||
|-|-|
|
|-|-|
|
||||||
| **126 JSON Plugin Sources** | Every intelligence source is a JSON plugin. The execution engine contains zero hardcoded source logic. |
|
| **123 JSON Plugin Sources** | Every intelligence source is a JSON plugin. The execution engine contains zero hardcoded source logic. |
|
||||||
| **Async Core** | Full `asyncio` event loop with JA3 fingerprinting, SSL session management, per-request jitter, and configurable concurrency. |
|
| **Async Core** | Full `asyncio` event loop with JA3 fingerprinting, SSL session management, per-request jitter, and configurable concurrency. |
|
||||||
| **Autoscan Pipeline** | `--autoscan` triggers: breach scan → recursive pivot → Google/Bing/SearXNG dorking → paste/Telegram scraping — all in one command. |
|
| **Autoscan Pipeline** | `--autoscan` triggers: breach scan → recursive pivot → Google/Bing/SearXNG dorking → paste/Telegram scraping — all in one command. |
|
||||||
| **Recursive Avalanche Engine** | Every identifier discovered — from breach records, dork hits, or scraped paste/Telegram content — is re-injected as a new seed. Per-asset pipeline is sequential (breach → crack → dork → scrape); child assets run concurrently via `asyncio.gather`. A global `seen_assets` set prevents infinite loops. Concurrency and depth are fully configurable at runtime via `--threads` and `--depth`. |
|
| **Recursive Avalanche Engine** | Every identifier discovered — from breach records, dork hits, or scraped paste/Telegram content — is re-injected as a new seed. Per-asset pipeline is sequential (breach → crack → dork → scrape); child assets run concurrently via `asyncio.gather`. A global `seen_assets` set prevents infinite loops. Concurrency and depth are fully configurable at runtime via `--threads` and `--depth`. |
|
||||||
@@ -108,7 +108,7 @@ Supported fields: `name`, `endpoint`, `method`, `headers`, `regex_pattern` (or `
|
|||||||
```
|
```
|
||||||
For each asset (seed + every discovered identifier):
|
For each asset (seed + every discovered identifier):
|
||||||
├─ Phase 1 — Breach Scan
|
├─ Phase 1 — Breach Scan
|
||||||
│ 126 sources queried in parallel (async)
|
│ 123 sources queried in parallel (async)
|
||||||
│
|
│
|
||||||
├─ Phase 2 — Hash Crack (non-blocking, concurrent)
|
├─ Phase 2 — Hash Crack (non-blocking, concurrent)
|
||||||
│ Hashes found in breach data → rainbow-table APIs → cracked plaintext
|
│ Hashes found in breach data → rainbow-table APIs → cracked plaintext
|
||||||
@@ -258,7 +258,7 @@ nox-cli --help
|
|||||||
The post-install script automatically:
|
The post-install script automatically:
|
||||||
1. Creates an isolated virtual environment at `/opt/nox-cli/.venv`
|
1. Creates an isolated virtual environment at `/opt/nox-cli/.venv`
|
||||||
2. Installs all Python dependencies inside the venv (PEP 668 compliant — zero system pollution)
|
2. Installs all Python dependencies inside the venv (PEP 668 compliant — zero system pollution)
|
||||||
3. Builds the 126 source plugins
|
3. Builds the 123 source plugins
|
||||||
4. Links `/usr/bin/nox-cli` → `/opt/nox-cli/nox-wrapper.sh`
|
4. Links `/usr/bin/nox-cli` → `/opt/nox-cli/nox-wrapper.sh`
|
||||||
|
|
||||||
### Option 2: From Source
|
### Option 2: From Source
|
||||||
|
|||||||
+27
-32
@@ -71,6 +71,11 @@ class SourceConfig(BaseModel):
|
|||||||
backup_endpoints: List[str] = Field(default_factory=list)
|
backup_endpoints: List[str] = Field(default_factory=list)
|
||||||
# H2: optional confidence override — when set, takes precedence over formula
|
# H2: optional confidence override — when set, takes precedence over formula
|
||||||
confidence: Optional[float] = None
|
confidence: Optional[float] = None
|
||||||
|
# Two-phase poll support (e.g. IntelX: POST → job_id → GET results)
|
||||||
|
poll_endpoint: Optional[str] = None
|
||||||
|
poll_id_field: Optional[str] = None
|
||||||
|
poll_id_param: Optional[str] = None
|
||||||
|
poll_json_root: Optional[str] = None
|
||||||
|
|
||||||
@field_validator("reliability_score")
|
@field_validator("reliability_score")
|
||||||
@classmethod
|
@classmethod
|
||||||
@@ -131,6 +136,10 @@ def _mk(
|
|||||||
bypass_required: Optional[List[str]] = None,
|
bypass_required: Optional[List[str]] = None,
|
||||||
user_agent_type: Optional[str] = None,
|
user_agent_type: Optional[str] = None,
|
||||||
backup_endpoints: Optional[List[str]] = None,
|
backup_endpoints: Optional[List[str]] = None,
|
||||||
|
poll_endpoint: Optional[str] = None,
|
||||||
|
poll_id_field: Optional[str] = None,
|
||||||
|
poll_id_param: Optional[str] = None,
|
||||||
|
poll_json_root: Optional[str] = None,
|
||||||
) -> SourceConfig:
|
) -> SourceConfig:
|
||||||
return SourceConfig(
|
return SourceConfig(
|
||||||
name=name, category=category, endpoint=endpoint, method=method,
|
name=name, category=category, endpoint=endpoint, method=method,
|
||||||
@@ -150,6 +159,10 @@ def _mk(
|
|||||||
bypass_required=bypass_required or None,
|
bypass_required=bypass_required or None,
|
||||||
user_agent_type=user_agent_type,
|
user_agent_type=user_agent_type,
|
||||||
backup_endpoints=backup_endpoints or [],
|
backup_endpoints=backup_endpoints or [],
|
||||||
|
poll_endpoint=poll_endpoint,
|
||||||
|
poll_id_field=poll_id_field,
|
||||||
|
poll_id_param=poll_id_param,
|
||||||
|
poll_json_root=poll_json_root,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -251,10 +264,12 @@ FREE_PUBLIC_SOURCES: List[SourceConfig] = [
|
|||||||
_base("hudsonrock_osint", "breach_data",
|
_base("hudsonrock_osint", "breach_data",
|
||||||
"https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-email?email={target}", "GET",
|
"https://cavalier.hudsonrock.com/api/json/v2/osint-tools/search-by-email?email={target}", "GET",
|
||||||
{"stealers": "$.stealers"},
|
{"stealers": "$.stealers"},
|
||||||
|
rate_limit=5.0,
|
||||||
|
headers={"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"},
|
||||||
input_type="email", output_type=["email", "domain", "username"],
|
input_type="email", output_type=["email", "domain", "username"],
|
||||||
normalization_map={"stealers": "breach_record"},
|
normalization_map={"stealers": "breach_record"},
|
||||||
tags=["passive", "stealth"],
|
tags=["passive", "stealth"],
|
||||||
health_check_url="https://cavalier.hudsonrock.com", reliability_score=4),
|
health_check_url="https://cavalier.hudsonrock.com", reliability_score=3),
|
||||||
|
|
||||||
_base("ipinfo_io", "geolocation",
|
_base("ipinfo_io", "geolocation",
|
||||||
"https://ipinfo.io/{target}/json", "GET",
|
"https://ipinfo.io/{target}/json", "GET",
|
||||||
@@ -459,32 +474,15 @@ FREE_PUBLIC_SOURCES: List[SourceConfig] = [
|
|||||||
tags=["passive"],
|
tags=["passive"],
|
||||||
health_check_url="https://packetstormsecurity.com", reliability_score=4),
|
health_check_url="https://packetstormsecurity.com", reliability_score=4),
|
||||||
|
|
||||||
_base("scylla_sh_search", "breaches",
|
|
||||||
"https://scylla.so/search?q={target}", "GET",
|
|
||||||
{"results": "$.*"},
|
|
||||||
input_type="email", output_type=["email", "domain"],
|
|
||||||
tags=["passive", "stealth"],
|
|
||||||
health_check_url="https://scylla.so", reliability_score=2, is_volatile=True,
|
|
||||||
bypass_required=["cloudflare"], user_agent_type="browser",
|
|
||||||
backup_endpoints=["https://scylla.so/api/search?q={target}"]),
|
|
||||||
|
|
||||||
_base("vigilante_pw", "breaches",
|
|
||||||
"https://vigilante.pw/api/search?q={target}", "GET",
|
|
||||||
{"results": "$.results"},
|
|
||||||
input_type="email", output_type=["email"],
|
|
||||||
tags=["passive", "stealth"],
|
|
||||||
health_check_url="https://vigilante.pw", reliability_score=2, is_volatile=True),
|
|
||||||
|
|
||||||
# ── New free sources (v1.0.1) ─────────────────────────────────────────────
|
|
||||||
|
|
||||||
_base("proxynova_comb", "breaches",
|
_base("proxynova_comb", "breaches",
|
||||||
"https://api.proxynova.com/comb?query={target}", "GET",
|
"https://api.proxynova.com/comb?query={target}", "GET",
|
||||||
{"lines": "$.lines"},
|
{"lines": "$.lines"},
|
||||||
input_type="email", output_type=["email"],
|
input_type="email", output_type=["email"],
|
||||||
normalization_map={"lines": "credential_line"},
|
normalization_map={"lines": "credential_line"},
|
||||||
tags=["passive", "stealth"],
|
tags=["passive", "stealth"],
|
||||||
health_check_url="https://api.proxynova.com",
|
health_check_url="https://api.proxynova.com", reliability_score=3, is_volatile=True),
|
||||||
reliability_score=3, is_volatile=True),
|
|
||||||
|
# ── New free sources (v1.0.1) ─────────────────────────────────────────────
|
||||||
|
|
||||||
_base("shodan_internetdb", "scanners",
|
_base("shodan_internetdb", "scanners",
|
||||||
"https://internetdb.shodan.io/{target}", "GET",
|
"https://internetdb.shodan.io/{target}", "GET",
|
||||||
@@ -854,6 +852,10 @@ AUTHENTICATED_PREMIUM_SOURCES += [
|
|||||||
payload_template={"term": "{target}", "buckets": [], "lookuplevel": 0,
|
payload_template={"term": "{target}", "buckets": [], "lookuplevel": 0,
|
||||||
"maxresults": 100, "timeout": 0, "datefrom": "", "dateto": "",
|
"maxresults": 100, "timeout": 0, "datefrom": "", "dateto": "",
|
||||||
"sort": 4, "media": 0, "terminate": []},
|
"sort": 4, "media": 0, "terminate": []},
|
||||||
|
poll_endpoint="https://2.intelx.io/intelligent/search/result",
|
||||||
|
poll_id_field="id",
|
||||||
|
poll_id_param="id",
|
||||||
|
poll_json_root="records",
|
||||||
tags=["passive", "stealth"],
|
tags=["passive", "stealth"],
|
||||||
health_check_url="https://2.intelx.io", reliability_score=5),
|
health_check_url="https://2.intelx.io", reliability_score=5),
|
||||||
|
|
||||||
@@ -925,23 +927,16 @@ AUTHENTICATED_PREMIUM_SOURCES += [
|
|||||||
tags=["passive", "stealth"],
|
tags=["passive", "stealth"],
|
||||||
health_check_url="https://api.flare.io", reliability_score=4),
|
health_check_url="https://api.flare.io", reliability_score=4),
|
||||||
|
|
||||||
_base("leak_lookup", "breaches",
|
_auth("leak_lookup", "breaches",
|
||||||
"https://leak-lookup.com/api/search", "POST",
|
"https://leak-lookup.com/api/search", "POST",
|
||||||
{"results": "$.message"},
|
{"results": "$.message"},
|
||||||
|
headers={"X-API-Key": "{LEAK_LOOKUP_API_KEY}"},
|
||||||
|
api_key_slots=["{LEAK_LOOKUP_API_KEY}"],
|
||||||
input_type="email", output_type=["email"],
|
input_type="email", output_type=["email"],
|
||||||
payload_template={"query": "{target}", "type": "email_address"},
|
payload_template={"query": "{target}", "type": "email_address"},
|
||||||
tags=["passive", "stealth"],
|
tags=["passive", "stealth"],
|
||||||
health_check_url="https://leak-lookup.com", reliability_score=3, is_volatile=True),
|
health_check_url="https://leak-lookup.com", reliability_score=3, is_volatile=True),
|
||||||
|
|
||||||
_auth("cit0day", "breaches",
|
|
||||||
"https://cit0day.in/api/v1/search?query={target}", "GET",
|
|
||||||
{"results": "$.results"},
|
|
||||||
headers={"Authorization": "Bearer {CIT0DAY_API_KEY}"},
|
|
||||||
api_key_slots=["{CIT0DAY_API_KEY}"],
|
|
||||||
input_type="email", output_type=["email"],
|
|
||||||
tags=["passive", "stealth"],
|
|
||||||
health_check_url="https://cit0day.in", reliability_score=2, is_volatile=True),
|
|
||||||
|
|
||||||
# ── DNS Recon ─────────────────────────────────────────────────────────────
|
# ── DNS Recon ─────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
_auth("securitytrails_sub", "dns_recon",
|
_auth("securitytrails_sub", "dns_recon",
|
||||||
@@ -1154,7 +1149,7 @@ AUTHENTICATED_PREMIUM_SOURCES += [
|
|||||||
api_key_slots=["{TWITTER_BEARER_TOKEN}"],
|
api_key_slots=["{TWITTER_BEARER_TOKEN}"],
|
||||||
input_type="username", output_type=["username"],
|
input_type="username", output_type=["username"],
|
||||||
tags=["passive"],
|
tags=["passive"],
|
||||||
health_check_url="https://api.twitter.com", reliability_score=4),
|
health_check_url="https://api.twitter.com", reliability_score=1),
|
||||||
|
|
||||||
_auth("github_code_search", "code",
|
_auth("github_code_search", "code",
|
||||||
"https://api.github.com/search/code?q={target}", "GET",
|
"https://api.github.com/search/code?q={target}", "GET",
|
||||||
|
|||||||
@@ -2105,12 +2105,8 @@ class ProxyManager:
|
|||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
_PROXY_SOURCES = [
|
_PROXY_SOURCES = [
|
||||||
(
|
"https://api.proxyscrape.com/v3/free-proxy-list/get?request=displayproxies&protocol=http&timeout=5000&proxy_format=protocolipport&format=text",
|
||||||
"https://api.proxyscrape.com/v2/"
|
"https://raw.githubusercontent.com/proxifly/free-proxy-list/main/proxies/protocols/http/data.txt",
|
||||||
"?request=displayproxies&protocol=http&timeout=5000"
|
|
||||||
"&country=all&ssl=all&anonymity=all"
|
|
||||||
),
|
|
||||||
"https://www.proxy-list.download/api/v1/get?type=http&anon=elite",
|
|
||||||
"https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt",
|
"https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt",
|
||||||
]
|
]
|
||||||
|
|
||||||
@@ -2225,6 +2221,7 @@ class DorkingEngine(Src):
|
|||||||
self._dead_proxies: set = set()
|
self._dead_proxies: set = set()
|
||||||
self._proxy_index: int = 0
|
self._proxy_index: int = 0
|
||||||
self.proxies = ProxyManager.get_proxies()
|
self.proxies = ProxyManager.get_proxies()
|
||||||
|
self._dead_instances: set = set()
|
||||||
|
|
||||||
def _get_next_proxy(self) -> Optional[str]:
|
def _get_next_proxy(self) -> Optional[str]:
|
||||||
live = [p for p in self.proxies if p not in self._dead_proxies]
|
live = [p for p in self.proxies if p not in self._dead_proxies]
|
||||||
@@ -2294,7 +2291,11 @@ class DorkingEngine(Src):
|
|||||||
from aiohttp_socks import ProxyConnector as _ProxyConnector
|
from aiohttp_socks import ProxyConnector as _ProxyConnector
|
||||||
except ImportError:
|
except ImportError:
|
||||||
_ProxyConnector = None
|
_ProxyConnector = None
|
||||||
instance = random.choice(_SEARX_INSTANCES)
|
live_instances = [i for i in _SEARX_INSTANCES if i not in self._dead_instances]
|
||||||
|
if not live_instances:
|
||||||
|
self._dead_instances.clear()
|
||||||
|
live_instances = list(_SEARX_INSTANCES)
|
||||||
|
instance = random.choice(live_instances)
|
||||||
url = f"{instance}/search?q={urllib.parse.quote(query)}&format=json&categories=general"
|
url = f"{instance}/search?q={urllib.parse.quote(query)}&format=json&categories=general"
|
||||||
proxy = self._get_next_proxy()
|
proxy = self._get_next_proxy()
|
||||||
try:
|
try:
|
||||||
@@ -2306,6 +2307,7 @@ class DorkingEngine(Src):
|
|||||||
async with sess.get(url, headers=_random_headers(),
|
async with sess.get(url, headers=_random_headers(),
|
||||||
timeout=aiohttp_mod.ClientTimeout(total=12)) as resp:
|
timeout=aiohttp_mod.ClientTimeout(total=12)) as resp:
|
||||||
if resp.status != 200:
|
if resp.status != 200:
|
||||||
|
self._dead_instances.add(instance)
|
||||||
if proxy:
|
if proxy:
|
||||||
self._dead_proxies.add(proxy)
|
self._dead_proxies.add(proxy)
|
||||||
return []
|
return []
|
||||||
@@ -2316,6 +2318,7 @@ class DorkingEngine(Src):
|
|||||||
if r.get("url")
|
if r.get("url")
|
||||||
]
|
]
|
||||||
except Exception:
|
except Exception:
|
||||||
|
self._dead_instances.add(instance)
|
||||||
if proxy:
|
if proxy:
|
||||||
self._dead_proxies.add(proxy)
|
self._dead_proxies.add(proxy)
|
||||||
return []
|
return []
|
||||||
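The hunks above add a `_dead_instances` set so that a SearXNG instance that fails once is skipped on later queries, and the set is cleared when every instance has been marked dead. A minimal standalone sketch of that rotation follows. The `InstancePool` class, its method names, and the example URLs are hypothetical and do not appear in the commit.

```python
import random
from typing import Iterable, List, Optional, Set


class InstancePool:
    """Hypothetical helper: pick a random live instance, reset when all are dead."""

    def __init__(self, instances: Iterable[str], rng: Optional[random.Random] = None):
        self._instances: List[str] = list(instances)
        self.dead: Set[str] = set()
        self._rng = rng or random.Random()

    def pick(self) -> str:
        # Keep only instances that have not failed yet.
        live = [i for i in self._instances if i not in self.dead]
        if not live:
            # Every instance failed: forget the failures and retry the full list,
            # mirroring the clear-and-refill step in the diff.
            self.dead.clear()
            live = list(self._instances)
        # Raises IndexError if the pool was created empty.
        return self._rng.choice(live)

    def mark_dead(self, instance: str) -> None:
        # Called on a non-200 status or on any exception during the request.
        self.dead.add(instance)
```

One consequence of this design is that a single transient error removes an instance until the whole pool is exhausted. A time-based expiry would be gentler, but that is not what the commit does.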
@@ -2452,43 +2455,19 @@ class DorkEngine:
|
|||||||
def _search(self, query: str, engine: str) -> List[dict]:
|
def _search(self, query: str, engine: str) -> List[dict]:
|
||||||
hits = []
|
hits = []
|
||||||
try:
|
try:
|
||||||
urls = {
|
# Direct Google/Bing HTML scraping is blocked by CAPTCHA/consent walls
|
||||||
"google": f"https://www.google.com/search?q={urllib.parse.quote(query)}&num=10",
|
# since 2024. Route all engines through SearXNG JSON API.
|
||||||
"bing": f"https://www.bing.com/search?q={urllib.parse.quote(query)}&count=10",
|
url = f"{random.choice(_SEARX_INSTANCES)}/search?q={urllib.parse.quote(query)}&format=json&categories=general"
|
||||||
"ddg": f"{random.choice(_SEARX_INSTANCES)}/search?q={urllib.parse.quote(query)}&format=json&categories=general",
|
resp = self.s.get(url, timeout=15, use_cloudscraper=False)
|
||||||
}
|
|
||||||
use_cs = engine != "ddg" # SearXNG is a plain JSON API — no cloudscraper needed
|
|
||||||
resp = self.s.get(urls.get(engine, urls["google"]), timeout=15, use_cloudscraper=use_cs)
|
|
||||||
if not resp.ok:
|
if not resp.ok:
|
||||||
return hits
|
return hits
|
||||||
# DDG/SearXNG returns JSON
|
data = resp.json()
|
||||||
if engine == "ddg":
|
for r in data.get("results", [])[:10]:
|
||||||
try:
|
if r.get("url"):
|
||||||
data = resp.json()
|
|
||||||
for r in data.get("results", [])[:10]:
|
|
||||||
if r.get("url"):
|
|
||||||
hits.append({"title": r.get("title", ""), "url": r["url"], "snippet": r.get("content", "")})
|
|
||||||
except Exception:
|
|
||||||
pass
|
|
||||||
return hits
|
|
||||||
if not BeautifulSoup:
|
|
||||||
return hits
|
|
||||||
soup = BeautifulSoup(resp.text, "html.parser")
|
|
||||||
selectors = {
|
|
||||||
"google": ("div.g", "h3", "a[href]", ".VwiC3b"),
|
|
||||||
"bing": ("li.b_algo", "h2", "a", ".b_caption p"),
|
|
||||||
}
|
|
||||||
container, title_sel, link_sel, snippet_sel = selectors.get(engine, selectors["google"])
|
|
||||||
for item in soup.select(container)[:10]:
|
|
||||||
title_el = item.select_one(title_sel)
|
|
||||||
link_el = item.select_one(link_sel)
|
|
||||||
snip_el = item.select_one(snippet_sel)
|
|
||||||
if title_el:
|
|
||||||
url = link_el.get("href","") if link_el else ""
|
|
||||||
hits.append({
|
hits.append({
|
||||||
"title": title_el.get_text().strip(),
|
"title": r.get("title", ""),
|
||||||
"url": url if url.startswith("http") else "",
|
"url": r["url"],
|
||||||
"snippet": snip_el.get_text().strip() if snip_el else "",
|
"snippet": r.get("content", ""),
|
||||||
})
|
})
|
||||||
except Exception:
|
except Exception:
|
||||||
pass
|
pass
|
||||||
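After this change `_search` no longer scrapes Google or Bing HTML. Every engine name is routed to a SearXNG JSON endpoint, and each result is mapped to a `title` / `url` / `snippet` dictionary. The parsing step can be sketched in isolation as below. The function name `parse_searx_results` is hypothetical, and the response shape (a top-level `results` list whose items carry `title`, `url`, and `content`) is taken from the fields the diff reads.

```python
import json
from typing import Dict, List


def parse_searx_results(body: str, limit: int = 10) -> List[Dict[str, str]]:
    """Turn a SearXNG JSON response body into a list of hit dictionaries."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON (for example an HTML error page): treat as no hits.
        return []
    if not isinstance(data, dict):
        return []
    hits: List[Dict[str, str]] = []
    # Slice first, then filter, as the diff does: results without a URL
    # still count toward the limit.
    for r in data.get("results", [])[:limit]:
        if r.get("url"):
            hits.append({
                "title": r.get("title", ""),
                "url": r["url"],
                "snippet": r.get("content", ""),
            })
    return hits
```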
@@ -6409,6 +6388,39 @@ class NoxSourceProvider(FileSystemProvider):
|
|||||||
if status not in range(200, 300) or not text:
|
if status not in range(200, 300) or not text:
|
||||||
return []
|
return []
|
||||||
|
|
||||||
|
# Two-phase poll: if poll_endpoint is defined, treat the first response
|
||||||
|
# as a job submission, extract the job ID via poll_id_field, then poll
|
||||||
|
# poll_endpoint?<poll_id_param>=<id> until results arrive.
|
||||||
|
poll_endpoint = d.get("poll_endpoint", "")
|
||||||
|
if poll_endpoint:
|
||||||
|
try:
|
||||||
|
job_id = json.loads(text).get(d.get("poll_id_field", "id"))
|
||||||
|
except Exception:
|
||||||
|
job_id = None
|
||||||
|
if not job_id:
|
||||||
|
return []
|
||||||
|
poll_param = d.get("poll_id_param", "id")
|
||||||
|
poll_root = d.get("poll_json_root", d.get("json_root", ""))
|
||||||
|
poll_url = f"{poll_endpoint}?{poll_param}={job_id}"
|
||||||
|
delay = 2
|
||||||
|
for _ in range(4):
|
||||||
|
await asyncio.sleep(delay)
|
||||||
|
p_status, p_text, _ = await self._get(session, poll_url, headers=hdrs)
|
||||||
|
if p_status not in range(200, 300) or not p_text:
|
||||||
|
delay = min(delay * 2, 16)
|
||||||
|
continue
|
||||||
|
try:
|
||||||
|
items = json.loads(p_text)
|
||||||
|
for key in (poll_root.split(".") if poll_root else []):
|
||||||
|
if isinstance(items, dict):
|
||||||
|
items = items.get(key, [])
|
||||||
|
if isinstance(items, list) and items:
|
||||||
|
return self._by_json(p_text, poll_root, d.get("field_map", {}))
|
||||||
|
except Exception:
|
||||||
|
pass
|
||||||
|
delay = min(delay * 2, 16)
|
||||||
|
return []
|
||||||
|
|
||||||
regex = d.get("regex_pattern", "")
|
regex = d.get("regex_pattern", "")
|
||||||
if regex:
|
if regex:
|
||||||
return self._by_regex(text, regex)
|
return self._by_regex(text, regex)
|
||||||
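The two-phase poll added above treats the first response as a job submission, reads the job ID from `poll_id_field`, and then requests `poll_endpoint?<poll_id_param>=<id>` up to four times, sleeping 2, 4, 8, and 16 seconds before successive attempts. The sketch below restates that control flow as a standalone coroutine. The names `poll_for_results`, `walk_root`, and the injected `fetch` and `sleep` callables are hypothetical and stand in for the provider's `_get` and `asyncio.sleep`. Unlike the diff, it returns the extracted list directly instead of calling `_by_json`.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable, List, Tuple

# Hypothetical fetcher: takes a URL, returns (status_code, body_text).
Fetch = Callable[[str], Awaitable[Tuple[int, str]]]


def walk_root(data: Any, root: str) -> Any:
    """Follow a dotted path such as "records" into parsed JSON."""
    for key in (root.split(".") if root else []):
        if isinstance(data, dict):
            data = data.get(key, [])
    return data


async def poll_for_results(
    fetch: Fetch,
    submit_body: str,
    poll_endpoint: str,
    id_field: str = "id",
    id_param: str = "id",
    json_root: str = "",
    attempts: int = 4,
    first_delay: float = 2.0,
    max_delay: float = 16.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> List[Any]:
    # Phase 1: the submission response must be a JSON object carrying the job ID.
    try:
        job_id = json.loads(submit_body).get(id_field)
    except (ValueError, AttributeError):
        job_id = None
    if not job_id:
        return []

    # Phase 2: poll with exponential backoff until a non-empty list arrives.
    url = f"{poll_endpoint}?{id_param}={job_id}"
    delay = first_delay
    for _ in range(attempts):
        await sleep(delay)
        status, text = await fetch(url)
        if 200 <= status < 300 and text:
            try:
                items = walk_root(json.loads(text), json_root)
            except ValueError:
                items = None
            if isinstance(items, list) and items:
                return items
        # Bad status, unparsable body, or empty result: back off and retry.
        delay = min(delay * 2, max_delay)
    return []
```

With the defaults the worst case waits 2 + 4 + 8 + 16 = 30 seconds before giving up. As in the diff, the job ID is interpolated into the URL without percent-encoding, which is only safe if the API returns URL-safe identifiers.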
@@ -6528,8 +6540,15 @@ class SourceOrchestrator:
|
|||||||
"payload": raw.get("payload_template") or raw.get("payload") or {},
|
"payload": raw.get("payload_template") or raw.get("payload") or {},
|
||||||
# Pass resolved slot keys so FileSystemProvider can use them
|
# Pass resolved slot keys so FileSystemProvider can use them
|
||||||
"_slot_keys": slot_keys,
|
"_slot_keys": slot_keys,
|
||||||
|
# Two-phase poll support
|
||||||
|
"poll_endpoint": raw.get("poll_endpoint", ""),
|
||||||
|
"poll_id_field": raw.get("poll_id_field", "id"),
|
||||||
|
"poll_id_param": raw.get("poll_id_param", "id"),
|
||||||
|
"poll_json_root": raw.get("poll_json_root", ""),
|
||||||
}
|
}
|
||||||
sources.append(NoxSourceProvider(self._sem, self._db, self._config, defn))
|
inst = NoxSourceProvider(self._sem, self._db, self._config, defn)
|
||||||
|
inst._bypass_required = raw.get("bypass_required") or []
|
||||||
|
sources.append(inst)
|
||||||
logger.debug("SourceOrchestrator: loaded %s", jf.name)
|
logger.debug("SourceOrchestrator: loaded %s", jf.name)
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.warning("SourceOrchestrator: failed %s — %s", jf.name, exc)
|
logger.warning("SourceOrchestrator: failed %s — %s", jf.name, exc)
|
||||||
@@ -6558,8 +6577,14 @@ class SourceOrchestrator:
|
|||||||
def get_sources(self, session: "Session", qtype: str) -> List[AsyncSource]:
|
def get_sources(self, session: "Session", qtype: str) -> List[AsyncSource]:
|
||||||
"""Return plugin sources applicable to qtype, pre-filtered to avoid creating unnecessary tasks."""
|
"""Return plugin sources applicable to qtype, pre-filtered to avoid creating unnecessary tasks."""
|
||||||
self._ensure_loaded()
|
self._ensure_loaded()
|
||||||
|
# curl_cffi presence cached in OPTIONAL after first _try_import call
|
||||||
|
_has_cffi = "curl_cffi" in OPTIONAL or _try_import("curl_cffi") is not None
|
||||||
sources: List[AsyncSource] = []
|
sources: List[AsyncSource] = []
|
||||||
for src in self._nox_sources:
|
for src in self._nox_sources:
|
||||||
|
bypass = getattr(src, "_bypass_required", []) or []
|
||||||
|
if "cloudflare" in bypass and not _has_cffi:
|
||||||
|
logger.debug("Skipping %s — cloudflare bypass required, curl_cffi absent", src.name)
|
||||||
|
continue
|
||||||
input_type = getattr(src, "_input_type", "")
|
input_type = getattr(src, "_input_type", "")
|
||||||
if not input_type or input_type == "any" or not qtype or input_type == qtype:
|
if not input_type or input_type == "any" or not qtype or input_type == qtype:
|
||||||
sources.append(src)
|
sources.append(src)
|
||||||
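The `get_sources` change skips any source whose `bypass_required` list contains `"cloudflare"` when `curl_cffi` cannot be imported, before the usual input-type filter runs. A self-contained approximation of that pre-filter is shown below. The `Source` dataclass and the `has_module` and `applicable_sources` helpers are hypothetical. The commit itself uses the project's `_try_import` and `OPTIONAL` cache, for which `importlib.util.find_spec` is substituted here.

```python
import importlib.util
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    """Hypothetical stand-in for a loaded plugin source."""
    name: str
    input_type: str = ""
    bypass_required: List[str] = field(default_factory=list)


def has_module(name: str) -> bool:
    # True when a top-level module can be located, without importing it.
    return importlib.util.find_spec(name) is not None


def applicable_sources(sources: List[Source], qtype: str, has_cffi: bool) -> List[Source]:
    selected: List[Source] = []
    for src in sources:
        # Sources behind Cloudflare are useless without the bypass dependency.
        if "cloudflare" in src.bypass_required and not has_cffi:
            continue
        # Same input-type rule as the diff: an empty or "any" type, or an
        # empty query type, matches everything.
        if not src.input_type or src.input_type == "any" or not qtype or src.input_type == qtype:
            selected.append(src)
    return selected
```

A caller would typically compute `has_cffi = has_module("curl_cffi")` once and reuse it, which matches the intent of the cached check in the diff.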
|
|||||||
+1
-1
@@ -10,7 +10,7 @@ brotli>=1.1.0 # brotli decompression for aiohttp br responses
|
|||||||
zstandard>=0.23.0 # zstd decompression for aiohttp zstd responses (Cloudflare/Fastly CDNs)
|
zstandard>=0.23.0 # zstd decompression for aiohttp zstd responses (Cloudflare/Fastly CDNs)
|
||||||
|
|
||||||
# ── Intelligence & Scraping ────────────────────────────────────────────
|
# ── Intelligence & Scraping ────────────────────────────────────────────
|
||||||
requests>=2.31.0
|
requests>=2.32.3
|
||||||
certifi>=2024.2.2 # up-to-date CA bundle for SSL verification
|
certifi>=2024.2.2 # up-to-date CA bundle for SSL verification
|
||||||
cloudscraper>=1.2.71 # Cloudflare-protected endpoint bypass
|
cloudscraper>=1.2.71 # Cloudflare-protected endpoint bypass
|
||||||
beautifulsoup4>=4.12.3
|
beautifulsoup4>=4.12.3
|
||||||
|
|||||||
@@ -1,32 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "cit0day",
|
|
||||||
"category": "breaches",
|
|
||||||
"endpoint": "https://cit0day.in/api/v1/search?query={target}",
|
|
||||||
"method": "GET",
|
|
||||||
"requires_auth": true,
|
|
||||||
"selectors": {
|
|
||||||
"results": "$.results"
|
|
||||||
},
|
|
||||||
"rate_limit": 1.0,
|
|
||||||
"headers": {
|
|
||||||
"Authorization": "Bearer {CIT0DAY_API_KEY}"
|
|
||||||
},
|
|
||||||
"api_key_slots": [
|
|
||||||
"{CIT0DAY_API_KEY}"
|
|
||||||
],
|
|
||||||
"input_type": "email",
|
|
||||||
"output_type": [
|
|
||||||
"email"
|
|
||||||
],
|
|
||||||
"normalization_map": {},
|
|
||||||
"tags": [
|
|
||||||
"passive",
|
|
||||||
"stealth"
|
|
||||||
],
|
|
||||||
"health_check_url": "https://cit0day.in",
|
|
||||||
"expected_status": 200,
|
|
||||||
"reliability_score": 2,
|
|
||||||
"is_volatile": true,
|
|
||||||
"backup_endpoints": [],
|
|
||||||
"confidence": 0.55
|
|
||||||
}
|
|
||||||
@@ -94,7 +94,6 @@ SERVICE_REGISTRY: Dict[str, Dict] = {
|
|||||||
"GOOGLE_CX_KEY": {"display": "Google Custom Search (API key)", "public": False},
|
"GOOGLE_CX_KEY": {"display": "Google Custom Search (API key)", "public": False},
|
||||||
"GOOGLE_CX_ID": {"display": "Google Custom Search (CX ID)", "public": False},
|
"GOOGLE_CX_ID": {"display": "Google Custom Search (CX ID)", "public": False},
|
||||||
"GREYNOISE_API_KEY": {"display": "GreyNoise", "public": False},
|
"GREYNOISE_API_KEY": {"display": "GreyNoise", "public": False},
|
||||||
"HASHES_API_KEY": {"display": "Hashes.org", "public": False},
|
|
||||||
"HIBP_API_KEY": {"display": "HaveIBeenPwned", "public": False},
|
"HIBP_API_KEY": {"display": "HaveIBeenPwned", "public": False},
|
||||||
"HIPPO_API_KEY": {"display": "EmailHippo", "public": False},
|
"HIPPO_API_KEY": {"display": "EmailHippo", "public": False},
|
||||||
"HUNTER_API_KEY": {"display": "Hunter.io", "public": False},
|
"HUNTER_API_KEY": {"display": "Hunter.io", "public": False},
|
||||||
@@ -147,6 +146,8 @@ SERVICE_REGISTRY: Dict[str, Dict] = {
|
|||||||
"MALWAREBAZAAR_API_KEY": {"display": "MalwareBazaar (abuse.ch)", "public": False},
|
"MALWAREBAZAAR_API_KEY": {"display": "MalwareBazaar (abuse.ch)", "public": False},
|
||||||
"FULLHUNT_API_KEY": {"display": "FullHunt (attack surface)", "public": False},
|
"FULLHUNT_API_KEY": {"display": "FullHunt (attack surface)", "public": False},
|
||||||
"NETLAS_API_KEY": {"display": "Netlas.io (internet scanner)", "public": False},
|
"NETLAS_API_KEY": {"display": "Netlas.io (internet scanner)", "public": False},
|
||||||
|
# ── Added in v1.0.2 ───────────────────────────────────────────────
|
||||||
|
"LEAK_LOOKUP_API_KEY": {"display": "Leak-Lookup", "public": False},
|
||||||
}
|
}
|
||||||
|
|
||||||
_PRIVATE_KEYS = {k: v for k, v in SERVICE_REGISTRY.items() if not v["public"]}
|
_PRIVATE_KEYS = {k: v for k, v in SERVICE_REGISTRY.items() if not v["public"]}
|
||||||
|
|||||||
@@ -7,8 +7,10 @@
|
|||||||
"selectors": {
|
"selectors": {
|
||||||
"stealers": "$.stealers"
|
"stealers": "$.stealers"
|
||||||
},
|
},
|
||||||
"rate_limit": 1.0,
|
"rate_limit": 5.0,
|
||||||
"headers": {},
|
"headers": {
|
||||||
|
"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36"
|
||||||
|
},
|
||||||
"api_key_slots": [],
|
"api_key_slots": [],
|
||||||
"input_type": "email",
|
"input_type": "email",
|
||||||
"output_type": [
|
"output_type": [
|
||||||
@@ -25,7 +27,7 @@
|
|||||||
],
|
],
|
||||||
"health_check_url": "https://cavalier.hudsonrock.com",
|
"health_check_url": "https://cavalier.hudsonrock.com",
|
||||||
"expected_status": 200,
|
"expected_status": 200,
|
||||||
"reliability_score": 4,
|
"reliability_score": 3,
|
||||||
"backup_endpoints": [],
|
"backup_endpoints": [],
|
||||||
"confidence": 0.85
|
"confidence": 0.7
|
||||||
}
|
}
|
||||||
@@ -40,5 +40,9 @@
|
|||||||
"expected_status": 200,
|
"expected_status": 200,
|
||||||
"reliability_score": 5,
|
"reliability_score": 5,
|
||||||
"backup_endpoints": [],
|
"backup_endpoints": [],
|
||||||
|
"poll_endpoint": "https://2.intelx.io/intelligent/search/result",
|
||||||
|
"poll_id_field": "id",
|
||||||
|
"poll_id_param": "id",
|
||||||
|
"poll_json_root": "records",
|
||||||
"confidence": 1.0
|
"confidence": 1.0
|
||||||
}
|
}
|
||||||
@@ -3,17 +3,21 @@
|
|||||||
"category": "breaches",
|
"category": "breaches",
|
||||||
"endpoint": "https://leak-lookup.com/api/search",
|
"endpoint": "https://leak-lookup.com/api/search",
|
||||||
"method": "POST",
|
"method": "POST",
|
||||||
"requires_auth": false,
|
"requires_auth": true,
|
||||||
"selectors": {
|
"selectors": {
|
||||||
"results": "$.message"
|
"results": "$.message"
|
||||||
},
|
},
|
||||||
"rate_limit": 1.0,
|
"rate_limit": 1.0,
|
||||||
"headers": {},
|
"headers": {
|
||||||
|
"X-API-Key": "{LEAK_LOOKUP_API_KEY}"
|
||||||
|
},
|
||||||
"payload_template": {
|
"payload_template": {
|
||||||
"query": "{target}",
|
"query": "{target}",
|
||||||
"type": "email_address"
|
"type": "email_address"
|
||||||
},
|
},
|
||||||
"api_key_slots": [],
|
"api_key_slots": [
|
||||||
|
"{LEAK_LOOKUP_API_KEY}"
|
||||||
|
],
|
||||||
"input_type": "email",
|
"input_type": "email",
|
||||||
"output_type": [
|
"output_type": [
|
||||||
"email"
|
"email"
|
||||||
|
|||||||
@@ -1,35 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "scylla_sh_search",
|
|
||||||
"category": "breaches",
|
|
||||||
"endpoint": "https://scylla.so/search?q={target}",
|
|
||||||
"method": "GET",
|
|
||||||
"requires_auth": false,
|
|
||||||
"selectors": {
|
|
||||||
"results": "$.*"
|
|
||||||
},
|
|
||||||
"rate_limit": 1.0,
|
|
||||||
"headers": {},
|
|
||||||
"api_key_slots": [],
|
|
||||||
"input_type": "email",
|
|
||||||
"output_type": [
|
|
||||||
"email",
|
|
||||||
"domain"
|
|
||||||
],
|
|
||||||
"normalization_map": {},
|
|
||||||
"tags": [
|
|
||||||
"passive",
|
|
||||||
"stealth"
|
|
||||||
],
|
|
||||||
"health_check_url": "https://scylla.so",
|
|
||||||
"expected_status": 200,
|
|
||||||
"reliability_score": 2,
|
|
||||||
"is_volatile": true,
|
|
||||||
"bypass_required": [
|
|
||||||
"cloudflare"
|
|
||||||
],
|
|
||||||
"user_agent_type": "browser",
|
|
||||||
"backup_endpoints": [
|
|
||||||
"https://scylla.so/api/search?q={target}"
|
|
||||||
],
|
|
||||||
"confidence": 0.55
|
|
||||||
}
|
|
||||||
@@ -24,7 +24,7 @@
|
|||||||
],
|
],
|
||||||
"health_check_url": "https://api.twitter.com",
|
"health_check_url": "https://api.twitter.com",
|
||||||
"expected_status": 200,
|
"expected_status": 200,
|
||||||
"reliability_score": 4,
|
"reliability_score": 1,
|
||||||
"backup_endpoints": [],
|
"backup_endpoints": [],
|
||||||
"confidence": 0.85
|
"confidence": 0.4
|
||||||
}
|
}
|
||||||
@@ -1,28 +0,0 @@
|
|||||||
{
|
|
||||||
"name": "vigilante_pw",
|
|
||||||
"category": "breaches",
|
|
||||||
"endpoint": "https://vigilante.pw/api/search?q={target}",
|
|
||||||
"method": "GET",
|
|
||||||
"requires_auth": false,
|
|
||||||
"selectors": {
|
|
||||||
"results": "$.results"
|
|
||||||
},
|
|
||||||
"rate_limit": 1.0,
|
|
||||||
"headers": {},
|
|
||||||
"api_key_slots": [],
|
|
||||||
"input_type": "email",
|
|
||||||
"output_type": [
|
|
||||||
"email"
|
|
||||||
],
|
|
||||||
"normalization_map": {},
|
|
||||||
"tags": [
|
|
||||||
"passive",
|
|
||||||
"stealth"
|
|
||||||
],
|
|
||||||
"health_check_url": "https://vigilante.pw",
|
|
||||||
"expected_status": 200,
|
|
||||||
"reliability_score": 2,
|
|
||||||
"is_volatile": true,
|
|
||||||
"backup_endpoints": [],
|
|
||||||
"confidence": 0.55
|
|
||||||
}
|
|
||||||