From Noise to Knowledge
The internet is noisy. Every day: 60+ new CVEs published. Hundreds of IOC feeds updated. Geopolitical events shift threat actor priorities. Social media amplifies disinformation. Economic indicators correlate with ransomware surges. All of it is data. Almost none of it is intelligence — until something connects it.
The ninja.ing ecosystem exists to make that connection. Not through dashboards bolted together with REST APIs, but through a unified knowledge graph where a vulnerability node connects to the technique that exploits it, connects to the actor that uses it, connects to the campaign that deployed it, connects to the geopolitical event that motivated it.
Two platforms sit at the centre of this. Signal handles cyber threat intelligence — the adversaries, their tools, their techniques, and the indicators they leave behind. Fusion handles everything else — geopolitics, economics, social intelligence, environmental data, military movements — and fuses it with Signal's cyber picture.
The Intelligence Pipeline
Every piece of intelligence follows the same path, regardless of source:
The critical insight: Claims are the universal adapter. Any feed, any format — whether it's a JSON blob from the NVD, a STIX bundle from MITRE, a CSV from CISA, or a GDELT event stream — gets decomposed into atomic subject/predicate/object triples. A Claim says: "this thing has this relationship to that thing, and here's where I learned it."
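As a rough illustration of that decomposition, the sketch below turns one vulnerability record into subject/predicate/object triples. The record shape, the `Triple` type, the `AFFECTS` predicate, and the `decompose_cve` helper are all assumptions made up for this example; they are not the actual ingester code or the NVD schema.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """Hypothetical minimal triple; the real Claim carries more fields."""
    subject: str
    predicate: str
    obj: str
    source: str


def decompose_cve(record: Dict, source: str) -> List[Triple]:
    """Split one simplified (assumed) vulnerability record into atomic triples."""
    cve_id = record["id"]
    triples: List[Triple] = []
    for product in record.get("affected_products", []):
        triples.append(Triple(cve_id, "AFFECTS", product, source))
    for technique in record.get("techniques", []):
        # A technique exploits the vulnerability, so the technique is the subject.
        triples.append(Triple(technique, "EXPLOITS", cve_id, source))
    return triples


record = {
    "id": "CVE-2024-1234",
    "affected_products": ["ExampleServer 2.1"],
    "techniques": ["T1190"],
}
triples = decompose_cve(record, source="nvd")
for triple in triples:
    print(triple)
```

Each triple stands on its own, which is what lets very different feeds end up in the same shape.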
Signal: The Cyber Layer
20 ingesters pull from the core CTI feeds. NVD for vulnerabilities. MITRE ATT&CK for techniques and threat actors. OTX for IOCs. CISA KEV for known exploited vulnerabilities. Abuse.ch suite (MalwareBazaar, ThreatFox, Feodo, URLhaus) for malware samples and C2 infrastructure. GitHub Advisories. Phishing feeds. Ransomware trackers. CIRCL for intelligence sharing. OpenCTI for STIX import.
Signal's graph holds ~1M nodes across 14 labels: 89K+ indicators, 48K+ software entries, 21K+ infrastructure nodes, 1,900+ vulnerabilities, 1,200+ techniques, 245+ threat actors, and 40+ campaigns. All connected by edges that encode how the threat landscape actually works.
Fusion: The Everything Layer
84 ingesters across eight domains. Not just cyber: geopolitical events (GDELT, ACLED, GTD), economic indicators (FRED, IMF, World Bank), social intelligence (Reddit, Twitter, Mastodon, Telegram, Bluesky), environmental data (NASA FIRMS, GDACS, EM-DAT), military and sanctions data (SIPRI, OFAC SDN, OpenSanctions), health (WHO), and technology (Shodan, OpenSky, AIS maritime).
Why does a threat intelligence platform need economic data? Because ransomware surges correlate with cryptocurrency prices. Because sanctions drive state-sponsored actors to find new funding. Because a geopolitical crisis in one region predicts cyber campaigns against targets in another. Fusion makes those connections visible.
The Claims Engine
Every ingester, every source, every feed — they all produce the same thing: Claims. A Claim is the atomic unit of intelligence. It's a frozen, immutable dataclass that says: "Subject X has Predicate relationship to Object Y, sourced from Z with confidence C."
The Claim Dataclass
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class Claim:
    subject_type: str                      # "ThreatActor", "Vulnerability", ...
    subject_key: str                       # "APT28", "CVE-2024-1234", ...
    predicate: str                         # "USES", "EXPLOITS", "TARGETS", ...
    object_type: Optional[str]             # None for node-only claims
    object_key: Optional[str]
    first_seen: Optional[str]
    last_seen: Optional[str]
    source_name: str = "unknown"
    source_url: Optional[str] = None
    source_item_id: Optional[str] = None
    confidence: float = 0.7
    subject_props: Optional[Dict] = None   # Merge onto subject node
    edge_props: Optional[Dict] = None      # Merge onto relationship
14 fields. Frozen. Immutable. Two Claims from different sources about the same fact will merge into a single graph edge, preserving the earliest first_seen and updating last_seen. Provenance is never lost — the source_name, source_url, and confidence travel with every assertion.
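The merge rule can be sketched in plain Python. This is an in-memory illustration only: `MiniClaim`, `merge_claim`, and the choice to keep one confidence value per source are assumptions for the example, and it relies on timestamps being ISO 8601 strings so that string order matches time order.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class MiniClaim:
    """Cut-down stand-in for Claim, just enough to show the merge rule."""
    subject_key: str
    predicate: str
    object_key: str
    first_seen: Optional[str]  # ISO 8601, so string order == time order
    last_seen: Optional[str]
    source_name: str = "unknown"
    confidence: float = 0.7


EdgeKey = Tuple[str, str, str]


def merge_claim(edges: Dict[EdgeKey, dict], claim: MiniClaim) -> None:
    """Fold a claim into the edge map: earliest first_seen, latest last_seen."""
    key = (claim.subject_key, claim.predicate, claim.object_key)
    edge = edges.setdefault(
        key, {"first_seen": None, "last_seen": None, "sources": {}}
    )
    if claim.first_seen and (
        edge["first_seen"] is None or claim.first_seen < edge["first_seen"]
    ):
        edge["first_seen"] = claim.first_seen
    if claim.last_seen and (
        edge["last_seen"] is None or claim.last_seen > edge["last_seen"]
    ):
        edge["last_seen"] = claim.last_seen
    # Assumption: provenance is kept per source rather than overwritten.
    edge["sources"][claim.source_name] = claim.confidence


edges: Dict[EdgeKey, dict] = {}
merge_claim(edges, MiniClaim("APT28", "USES", "T1566", "2024-03-01", "2024-03-01", "mitre", 0.9))
merge_claim(edges, MiniClaim("APT28", "USES", "T1566", "2023-11-15", "2024-06-20", "otx", 0.6))
print(edges[("APT28", "USES", "T1566")])
```

Two claims about the same fact collapse into one edge, while both sources remain visible on it.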
Graph Merge Strategy
Claims don't INSERT into the graph. They MERGE. The ingestion engine batches Claims in groups of 500, groups them by schema pattern (subject_type, predicate, object_type), and fires a single UNWIND Cypher query per group:
UNWIND $batch AS row
MERGE (s:ThreatActor {name: row.subject_key})
  ON CREATE SET s.first_seen = row.first_seen
  ON MATCH SET s.last_seen = row.last_seen
MERGE (o:Technique {name: row.object_key})
MERGE (s)-[r:USES]->(o)
SET r.confidence = row.confidence,
    r.source = row.source_name
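The batching and grouping step might look roughly like the following. It is a sketch under assumptions: claims are plain dicts here, and `chunked`, `group_by_pattern`, and `build_query` are hypothetical helper names, not the engine's real functions. No database is contacted; the code only shows how one query per schema pattern falls out of each batch.

```python
from collections import defaultdict
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

BATCH_SIZE = 500  # batch size stated in the text

Pattern = Tuple[str, str, str]  # (subject_type, predicate, object_type)


def chunked(items: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def group_by_pattern(batch: List[dict]) -> Dict[Pattern, List[dict]]:
    """Bucket claims so each bucket can share one parameterised query."""
    groups: Dict[Pattern, List[dict]] = defaultdict(list)
    for claim in batch:
        key = (claim["subject_type"], claim["predicate"], claim["object_type"])
        groups[key].append(claim)
    return groups


def build_query(pattern: Pattern) -> str:
    """Render an UNWIND/MERGE template for one pattern.

    Labels and relationship types are interpolated into the query text here,
    so a real engine should validate them against an allow-list first.
    """
    subject_type, predicate, object_type = pattern
    return (
        "UNWIND $batch AS row\n"
        f"MERGE (s:{subject_type} {{name: row.subject_key}})\n"
        f"MERGE (o:{object_type} {{name: row.object_key}})\n"
        f"MERGE (s)-[r:{predicate}]->(o)\n"
        "SET r.confidence = row.confidence, r.source = row.source_name"
    )


claims = [
    {
        "subject_type": "ThreatActor",
        "subject_key": f"actor-{i}",
        "predicate": "USES",
        "object_type": "Technique" if i % 2 else "Software",
        "object_key": f"obj-{i}",
        "confidence": 0.7,
        "source_name": "demo",
    }
    for i in range(1200)
]

for batch in chunked(claims, BATCH_SIZE):
    for pattern, rows in group_by_pattern(batch).items():
        query = build_query(pattern)  # would be executed with {"batch": rows}
        print(pattern, len(rows))
```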
Deadlock retry with linear backoff: 3 attempts, sleeping 2 × the attempt number in seconds between tries. Non-deadlock errors fail immediately. This handles Neo4j's transaction contention when multiple ingesters run concurrently.
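A minimal sketch of that retry policy, assuming a hypothetical `DeadlockError` as a stand-in for whatever the driver raises (with the official Neo4j Python driver, deadlocks typically surface as a transient error). The `sleep` parameter is injectable only so the example runs instantly.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")

MAX_ATTEMPTS = 3


class DeadlockError(Exception):
    """Stand-in for the driver's deadlock error (hypothetical for this sketch)."""


def run_with_deadlock_retry(
    operation: Callable[[], T],
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Run `operation`, retrying only on DeadlockError.

    Any other exception propagates immediately, matching the
    "non-deadlock errors fail immediately" rule.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        try:
            return operation()
        except DeadlockError:
            if attempt == MAX_ATTEMPTS:
                raise  # out of attempts, surface the deadlock
            sleep(2 * attempt)  # 2 x attempt number, in seconds
    raise AssertionError("unreachable")


calls = {"count": 0}
delays: List[float] = []


def flaky_write() -> str:
    """Simulated write that deadlocks twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated contention")
    return "committed"


result = run_with_deadlock_retry(flaky_write, sleep=delays.append)
print(result, delays)
```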
Node Labels
Signal uses 14 node labels. Each represents a first-class entity in the threat intelligence domain:
| Label | Count | Role |
|---|---|---|
| Indicator | ~89,000 | IOCs — hashes, IPs, domains, URLs |
| Software | ~48,000 | Malware families, tools, legitimate software |
| Infrastructure | ~21,000 | C2 servers, hosting providers, ASNs |
| Vulnerability | ~1,900 | CVEs with CVSS, EPSS, KEV status |
| Technique | ~1,200 | MITRE ATT&CK techniques & sub-techniques |
| ThreatActor | ~245 | APT groups, cybercrime orgs, hacktivists |
| Campaign | ~40 | Named operations & attack campaigns |
| Mitigation | — | MITRE mitigations & defensive measures |
| Source | — | Intelligence feed provenance nodes |
| Event | — | Discrete security events |
| EventSummary | — | Aggregated event timelines |
| Alert | — | Generated alerts from detection rules |
| DetectionRule | — | KQL, Sigma, YARA rules |
| TelemetrySource | — | Log sources & data collection points |
Relationship Types
Edges are typed. Each encodes a specific semantic relationship:
Fusion’s Extended Schema
Fusion extends the schema with 20+ node labels to cover cross-domain entities: SocialPost, EconomicIndicator, GeopoliticalEvent, Country, Organization, SanctionedEntity, and more. Cross-domain edges connect a geopolitical crisis to the cyber campaigns it spawns, or an economic shock to the ransomware surge that follows.
Signal — Technical Architecture
Backend: adversary_graph_app.py
One FastAPI application. 225 endpoints — 151 GET, 62 POST, 5 PATCH, 4 DELETE, 3 WebSocket. Organised by domain: graph queries, ML analytics, twins (digital adversary profiles), threat attribution, KQL generation, process mining, causal inference, semantic search, geospatial analysis, CTI extraction, briefing generation, and more.
Key design decision: one file, one process. No microservices splitting. The intelligence domain is deeply interconnected — a risk scoring endpoint needs access to the graph, the ML cache, the twin profiles, and the semantic index. Splitting that into services would add network hops and serialisation overhead for zero architectural benefit.
Core Modules
Signal ships with 32 Python modules in core/. Each handles a distinct analytical capability:
ml.py — Risk propagation, community detection (Louvain), link prediction, centrality analysis, GDS graph projections, anomaly scoring.
gat.py — Graph Attention Networks for node classification.
graphsage.py — GraphSAGE inductive node embeddings.
semantic.py — Hybrid search combining LanceDB vector store with Neo4j fulltext. TF-IDF fallback when embedding models aren't available. Indexes all 1M+ nodes.
geo.py — H3 hexagonal heatmaps. 99 country centroid database. APT actor geographic overlay with threat density calculations.
causal.py — DoWhy-based causal analysis. 4 CTI scenarios: mitigation effectiveness, technique adoption drivers, infrastructure impact, IOC correlation.
extraction.py — LLM-powered (Claude API) entity extraction from unstructured text. Regex fallback. Entity review queue. Direct graph commit.
attribution.py — Multi-source threat actor origin analysis. Infrastructure tracing, temporal clock analysis, TTP fingerprint matching (weighted Jaccard), evidence fusion with Diamond Model output.
adversary_dna.py — 18-dimensional behavioral fingerprinting from access logs. Temporal entropy, velocity, method entropy, IRT stats. Archetyping: scanner, brute forcer, researcher, bot, targeted operator.
bus.py — NATS JetStream pub/sub. 9 subject hierarchies covering Signal, Fusion, Raz0r, V01d, Nexus, Kin0bi, 1D, V0id, and ecosystem-wide events.
federated.py — Federated threat intelligence sharing.
org_twin.py — Organisational digital twin modelling.
causal_rl.py — Reinforcement learning for causal response.
cascade.py — Cascade failure prediction.
neuromorphic.py — Neuromorphic graph processing.
twins.py — Adversary digital twin profiles, Monte Carlo simulation, playbook & wargame generation.
process_mining.py — Attack process flow discovery from event sequences.
kql.py — KQL detection rule generation for Microsoft Sentinel.
graph.py — Neo4j adapter, connection pooling, query helpers.
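The weighted Jaccard comparison mentioned for attribution.py can be illustrated in a few lines. This uses the standard definition (sum of minima over sum of maxima); how the real module derives its weights is not described here, so the technique weights below are invented for the example.

```python
from typing import Dict


def weighted_jaccard(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Weighted Jaccard similarity: sum of minima over sum of maxima."""
    keys = a.keys() | b.keys()
    numerator = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    denominator = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator else 0.0


# Hypothetical technique-usage weights, e.g. share of observed activity.
known_actor = {"T1566": 0.5, "T1059": 0.3, "T1071": 0.2}
unattributed = {"T1566": 0.4, "T1059": 0.3, "T1105": 0.3}

similarity = weighted_jaccard(known_actor, unattributed)
print(round(similarity, 3))
```

Unlike plain set overlap, this rewards two profiles for leaning on the same techniques to a similar degree, not just for sharing them.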
Frontend: 37 Floating Windows
Signal's UI is a Next.js 16 / React 19 application with a custom window manager. Not tabs. Not pages. Floating, draggable, resizable windows — like a desktop OS for threat intelligence. The user can arrange graph views, ML dashboards, twin profiles, and causal analysis side by side.
| Category | Windows |
|---|---|
| Graph & Visualisation | Graph, Theatre (3D), Galaxy, Heatmap |
| ML & Analytics | Risk, Communities, Predict, Centrality, Emergent |
| Intelligence | Twins, Wargame, Attribution, Causal, DNA, Diff, Briefing |
| Search & Extraction | Search (Spektr), Extract, Workbench |
| Detection | KQL, SIEM, Hunting, Emulation, Process Mining |
| Infrastructure | Traffic, Event Bus, Telemetry, DataLab |
| Admin | Admin, Settings, Users, Audit |
Auth Model
File-based JSON user store. Passwords hashed with bcrypt. JWTs signed with jose. Middleware enforcement in middleware.ts — every route is protected except explicit public paths. MFA gate available. Role-based admin access. SSO token exchange for cross-app authentication.
Sample API Endpoints
# Graph Intelligence
GET /graph/stats # Node/edge counts by label
GET /graph/actors # All threat actors with metrics
POST /search/semantic # Hybrid vector + fulltext search
# ML Pipeline
GET /ml/risk # Risk-propagated scores
GET /ml/communities # Louvain community detection
GET /ml/predict # Link prediction (future edges)
# Adversary Profiling
GET /twins/profile/{name} # Digital twin behavioural model
POST /twins/wargame # Monte Carlo actor vs. defence sim
GET /attribution/{actor} # ORIGAMI multi-source attribution
# Real-time
WS /ws/threats # Live threat feed stream
WS /ws/chat # Niko AI assistant
# Generation
GET /kql/generate # KQL detection rules
GET /briefing/generate # CISO briefing document
POST /extract # LLM CTI entity extraction
Fusion — Technical Architecture
Backend: fusion_app.py
FastAPI with 107 endpoints. Where Signal is deep on cyber, Fusion is wide across domains. The same Claims engine, the same graph store, but pointing at a much broader universe of data — and a set of analytical modules designed to find the connections between domains that no single-domain tool would ever surface.
84 Ingesters Across 8 Domains
NVD CVEs, CISA KEV, OTX, EPSS, ThreatFox, URLhaus, Feodo, MalwareBazaar, CrowdSec, Phishtank, OpenPhish, CIRCL MISP, GitHub Advisories, Exploit-DB
GDELT, ACLED, GTD, SIPRI arms transfers, GPI, INFORM Risk, OpenSanctions, OFAC SDN, ReliefWeb, FEWS NET, ND-GAIN, GPR Index, World Bank WGI, UNHCR, V-Dem, RSS (geopolitical)
FRED (macro + GSCPI), IMF WEO, World Bank, WTO trade, commodity prices, ILOSTAT, UN Comtrade
Twitter/X, Reddit, Mastodon, Telegram, Bluesky, RSS, Google Trends, Mastodon Trending
NASA FIRMS (wildfires), GDACS, EM-DAT, natural disasters, Safecast (radiation), WHO outbreaks, WHO GHO
Shodan intel, AIS maritime (AISstream), OpenSky flights, software registries (NPM, PyPI, GitHub)
MITRE ATT&CK TAXII, NIST CPE dictionary, EPSS probability scores
Core Modules
15 modules in core/, with analytical capabilities tuned for cross-domain fusion:
Narrative Clustering
Fusion's signature analytical capability. narrative.py ingests social posts from all platforms, vectorises them with TF-IDF (5,000 features, bigram support), clusters with DBSCAN, and runs coordination scoring to detect information operations. An LLM labels each cluster's theme, then the engine compares narratives against GDELT ground-truth events for a reality divergence score — how far is the online narrative drifting from what's actually happening?
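To make the vectorise-then-cluster step concrete, here is a small dependency-free sketch of TF-IDF over unigrams and bigrams followed by DBSCAN on cosine distance. It is not narrative.py: the 5,000-feature cap, coordination scoring, LLM labelling, and the divergence score are left out, and the `eps` and `min_samples` values and sample posts are assumptions chosen for the toy data.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional


def tokens(text: str) -> List[str]:
    """Lowercased unigrams plus adjacent-word bigrams."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return words + [" ".join(pair) for pair in zip(words, words[1:])]


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """L2-normalised TF-IDF vectors with smoothed IDF."""
    tokenised = [tokens(d) for d in docs]
    df = Counter(term for toks in tokenised for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenised:
        vec = {
            t: count * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, count in Counter(toks).items()
        }
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({t: v / norm for t, v in vec.items()})
    return vectors


def cosine_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """1 - cosine similarity; inputs are already unit length."""
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())


def dbscan(vectors: List[Dict[str, float]], eps: float, min_samples: int) -> List[int]:
    """Plain DBSCAN; returns a cluster id per point, -1 for noise."""
    n = len(vectors)
    labels: List[Optional[int]] = [None] * n  # None = not visited yet

    def neighbours(i: int) -> List[int]:
        return [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)
    return [label if label is not None else -1 for label in labels]


posts = [
    "power grid attack blamed on foreign hackers",
    "foreign hackers blamed for power grid attack",
    "hackers blamed for the power grid attack say officials",
    "new pasta recipe with fresh basil",
    "central bank raises interest rates again",
]
labels = dbscan(tfidf_vectors(posts), eps=0.6, min_samples=2)
print(labels)
```

DBSCAN suits this job because the number of narratives is not known in advance and most posts belong to no narrative at all, which it reports as noise.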
Security Scanner
Built-in web scanner orchestrating Nuclei (template-based vulnerability scanning), testssl.sh (TLS assessment), and httpx (HTTP probing). Findings write back to the graph as SecurityFinding nodes linked to Domain and Vulnerability nodes. Single-scan queueing with thread-safe locking and cancellation support.
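One way to get single-scan behaviour with locking and cancellation is a small gate object like the one below. This is a sketch, not the scanner's code: `SingleScanGate` is a hypothetical name, the steps are plain callables standing in for the Nuclei, testssl.sh, and httpx invocations, and a second request is rejected as "busy" rather than queued.

```python
import threading
from typing import Callable, List, Optional, Tuple


class SingleScanGate:
    """Allow one scan at a time and let callers cancel the running one."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._cancel = threading.Event()
        self._active_target: Optional[str] = None

    def try_start(self, target: str) -> bool:
        """Claim the scan slot; return False if a scan is already running."""
        with self._lock:
            if self._active_target is not None:
                return False
            self._active_target = target
            self._cancel.clear()
            return True

    def cancel(self) -> None:
        """Ask the running scan to stop at its next checkpoint."""
        self._cancel.set()

    def finish(self) -> None:
        """Release the scan slot."""
        with self._lock:
            self._active_target = None

    def run(self, target: str, steps: List[Callable[[str], None]]) -> str:
        """Run the steps in order, checking for cancellation between them."""
        if not self.try_start(target):
            return "busy"
        try:
            for step in steps:
                if self._cancel.is_set():
                    return "cancelled"
                step(target)
            return "done"
        finally:
            self.finish()


gate = SingleScanGate()
log: List[Tuple[str, str]] = []
steps = [
    lambda target: log.append(("probe", target)),
    lambda target: gate.cancel(),  # simulate a user cancelling mid-scan
    lambda target: log.append(("scan", target)),
]
first = gate.run("example.org", steps)
second = gate.run("example.org", steps[:1])
print(first, second, log)
```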
Frontend: Cross-Domain Dashboard
Next.js 16 / React 19 again, but with a different approach from Signal's window manager. Fusion uses an AppShell with sidebar navigation: globe view, threat dashboard, social intelligence, narrative analysis, DataLab, scanner, twins, forecasting. Day/night mode toggle (warm paper palette in day, dark stealth in night). No server-side middleware: auth is handled client-side via the AppShell component.
Sample API Endpoints
# Cross-Domain Analysis
GET /ml/narrative-clusters # Social narrative clustering
GET /ml/narrative-divergence # Reality vs narrative drift
GET /ml/cross-domain-anomalies # Cross-domain anomaly detection
GET /ml/hidden-connections # Latent graph relationships
GET /ml/mega-risks # Compound multi-domain risks
# Intelligence Products
GET /intel/sitrep # Situation report
GET /briefing # Daily intelligence briefing
GET /actor/{name}/dossier # Full actor dossier
GET /globe/data # 3D globe risk overlay
# Social Intelligence
GET /social/feed # Cross-platform social feed
GET /social/entity/{type}/{key} # Entity social footprint
POST /niko/chat # Niko AI analyst assistant
# Scanner
POST /scanner/scan # Launch web security scan
GET /scanner/results # Scan findings
# Forecasting
GET /forecast/{horizon} # Multi-horizon threat forecast
POST /forecast/scenario # Scenario planning
The Stealth Stack
Container Topology
Each app follows the same Docker Compose pattern: Neo4j 5 + FastAPI backend + Next.js frontend. Signal adds Caddy (reverse proxy for all domains) and NATS (event bus). Some apps add Redis for real-time features.
[Diagram: container topology. Caddy: 11 domains, security shield. Next.js: auth middleware, API routes. FastAPI: 225+ endpoints, WebSocket. Neo4j: Bolt protocol, APOC & GDS. NATS: 9 subjects, cross-app pub/sub.]
Caddy Routing
One Caddy instance in Signal's Docker Compose handles HTTPS termination for all 11 production domains. The routing logic is subtle and ordering matters:
# 1. Next.js API routes go to UI container
handle /api/auth/* → ui:3000
handle /api/admin/* → ui:3000
handle /api/sso/* → ui:3000
# 2. All other /api/* go to Python backend
handle_path /api/* → api:18011 # strips /api prefix
# 3. Everything else goes to Next.js
handle /* → ui:3000
Critical detail: New Next.js API routes must be added to the Caddy config before the handle_path /api/* catch-all, or they'll be incorrectly routed to the Python backend.
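The ordering pitfall is easy to see with a toy first-match router. This is a simplified model of the listing above, not Caddy's actual matcher, and `/api/reports/*` is a made-up route used only to show the failure mode.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# (pattern, upstream, prefix to strip), evaluated top to bottom.
Route = Tuple[str, str, str]

ROUTES: List[Route] = [
    ("/api/auth/*", "ui:3000", ""),
    ("/api/admin/*", "ui:3000", ""),
    ("/api/sso/*", "ui:3000", ""),
    ("/api/*", "api:18011", "/api"),  # handle_path strips the /api prefix
    ("/*", "ui:3000", ""),
]


def resolve(path: str, routes: List[Route] = ROUTES) -> Tuple[str, str]:
    """Return (upstream, forwarded path) for the first matching route."""
    for pattern, upstream, strip in routes:
        if fnmatchcase(path, pattern):
            forwarded = path[len(strip):] if strip else path
            return upstream, forwarded or "/"
    raise LookupError(f"no route for {path}")


print(resolve("/api/auth/login"))
print(resolve("/api/graph/stats"))

# A new Next.js API route that was never added to the config
# falls through to the Python backend:
print(resolve("/api/reports/export"))

# Listing it before the catch-all fixes the routing:
fixed = ROUTES[:3] + [("/api/reports/*", "ui:3000", "")] + ROUTES[3:]
print(resolve("/api/reports/export", fixed))
```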
Security Shield
All 11 domains import a shared Caddy snippet named security_shield:
# Block common attack patterns
.git/* → 404 # VCS probe
*.php → 404 # PHP probe
wp-* → 404 # WordPress probe
Empty UA → abort # Drop connection
# Response headers
Content-Security-Policy: default-src 'self' ...
Permissions-Policy: camera=(), microphone=() ...
X-Content-Type-Options: nosniff
Request body limit: 10MB
fail2ban
Three jails watching Caddy access logs:
| Jail | Trigger | Ban Duration |
|---|---|---|
| caddy-scanner | 5 blocked paths in 10 min | 24 hours |
| caddy-auth | 10 auth failures in 5 min | 1 hour |
| caddy-aggressive | 50 404s in 5 min | 12 hours |
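The thresholds in the table are sliding-window counters. The sketch below models that logic in Python to show how the numbers interact; it is not fail2ban itself, and `BanTracker` and its methods are hypothetical names. Timestamps are passed in explicitly so the example is deterministic.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, NamedTuple, Tuple


class Jail(NamedTuple):
    max_hits: int   # hits needed to trigger a ban
    window_s: int   # look-back window in seconds
    ban_s: int      # ban duration in seconds


JAILS: Dict[str, Jail] = {
    "caddy-scanner": Jail(5, 10 * 60, 24 * 3600),
    "caddy-auth": Jail(10, 5 * 60, 1 * 3600),
    "caddy-aggressive": Jail(50, 5 * 60, 12 * 3600),
}


class BanTracker:
    """Count matching log lines per (jail, IP) and ban on threshold."""

    def __init__(self, jails: Dict[str, Jail]) -> None:
        self._jails = jails
        self._hits: Dict[Tuple[str, str], Deque[float]] = defaultdict(deque)
        self._banned_until: Dict[str, float] = {}

    def record(self, jail_name: str, ip: str, now: float) -> bool:
        """Record one matching log line; return True if it triggers a ban."""
        jail = self._jails[jail_name]
        hits = self._hits[(jail_name, ip)]
        hits.append(now)
        # Drop hits that have aged out of the window.
        while hits and hits[0] <= now - jail.window_s:
            hits.popleft()
        if len(hits) >= jail.max_hits:
            until = now + jail.ban_s
            self._banned_until[ip] = max(self._banned_until.get(ip, 0.0), until)
            hits.clear()
            return True
        return False

    def is_banned(self, ip: str, now: float) -> bool:
        return now < self._banned_until.get(ip, 0.0)


tracker = BanTracker(JAILS)
scanner_ip = "203.0.113.7"
# Five blocked-path hits one minute apart: the fifth triggers the ban.
results = [tracker.record("caddy-scanner", scanner_ip, t) for t in (0, 60, 120, 180, 240)]
print(results, tracker.is_banned(scanner_ip, now=300))
```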
Network Architecture
All apps run on a single Hetzner dedicated server. Containers outside RTM (rapid-threat-modeler, the Compose project that hosts Caddy) join RTM's Docker network (rapid-threat-modeler_default) to access the shared Caddy proxy. Each app has its own Neo4j instance on a unique Bolt port. In production, Neo4j and API ports are not exposed to the host; only Caddy's 443 is public.
# API Ports (internal only in prod)
Signal:   18011   Fusion: 18012   Raz0r:  18013
ANTOS:    18014   Kin0bi: 18015   Nexus:  18016
1D:       18017   V01d:   18018   V0id:   18019
Range:    18020   Knox:   18021   Social: 18022
War Room: 18023
# Neo4j Bolt Ports (internal only in prod)
Signal:   17687   Fusion: 17688   Raz0r:  17689
Kin0bi:   17690   Nexus:  17691   1D:     17692
V01d:     17693   V0id:   17694   Range:  17695
Social:   17696   War Room: 17697
# Public (Caddy)
HTTPS: 443
Deployment Pattern
Every deploy follows the same sequence. No CI/CD pipeline — deliberate simplicity:
# 1. Push code
git push
# 2. SSH, pull, rebuild
ssh root@server
cd /opt/{app}
git pull
docker compose -f docker-compose.yml \
    -f docker-compose.prod.yml \
    up --build -d api ui
# 3. If non-RTM app: restart Caddy from RTM dir
cd /opt/rapid-threat-modeler
docker compose -f docker-compose.yml \
    -f docker-compose.prod.yml \
    restart caddy
The Full Arsenal
15 systems. One SSO. One event bus. One graph mindset. Each built for a specific intelligence domain, all designed to share context through the graph and NATS.
| # | System | Domain | One-liner |
|---|---|---|---|
| 1 | Signal | CTI | Threat graph, 225 endpoints, 32 core modules, 37 windows |
| 2 | Fusion | Cross-domain | 84 ingesters across 8 domains, narrative clustering, scanner |
| 3 | Raz0r | SIEM | Rust EDR agent, ransomware predictor, cross-node correlator |
| 4 | ANTOS | UX | Embedded at /antos — unified analyst desktop |
| 5 | Kin0bi | Financial | Real-time crypto/stocks/forex, anomaly detection, portfolio risk |
| 6 | Nexus | OSINT | Suspicion propagation, money flow, UBO resolution, sanctions |
| 7 | 1D | Identity | BloodHound/LDAP/Azure AD graph, attack paths, kerberoastable |
| 8 | V01d | Sentiment | GDELT/RSS/Reddit/FRED pipeline, Oracle score, ninjaTONE |
| 9 | V0id | Agents | 3 autonomous agents (Sentinel, Warden, Spectre), IR playbooks |
| 10 | Los Alamos | Wargaming | Red vs Blue agentic range, LLM-driven adversaries, ELO scoring |
| 11 | Knox | Secrets | Vault, crypto toolkit, privacy engine, TOTP authenticator |
| 12 | Social | Collaboration | TI messaging, IOC auto-detect, encrypted channels, NATS feed |
| 13 | War Room | IR | LiveKit video, shared timelines, IOC panel, breach tracker |
| 14 | NinjaClaw | CLI | Hardened CLI agent, 10 scanners, CIS rules, Signal intel link |
| 15 | GITAIR | DevSecOps | Git security scanning & air-gapped repository management |
SSO: One Identity Everywhere
Every app implements the same SSO handshake. When a user is authenticated in Signal and clicks through to Fusion, the flow is:
Each app has its own cookie name to avoid conflicts. The JWT payload is verified server-side. No shared session store — just cryptographic trust.
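Server-side verification without a shared session store comes down to checking a signature and an expiry. The sketch below shows that idea with the standard library only. It assumes HS256 with a secret shared between apps, which is an assumption made for this example rather than a statement about the actual handshake; `sign_token` and `verify_token` are hypothetical helpers, and production code should use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (header.payload.signature)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if signature and expiry check out, else None."""
    current = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if header.get("alg") != "HS256":
            return None  # never trust the token to pick its own algorithm
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
        if payload.get("exp", 0) <= current:
            return None  # expired or missing expiry
        return payload
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token


secret = b"demo-shared-secret"  # placeholder; never hard-code real secrets
token = sign_token({"sub": "analyst@example.org", "role": "admin", "exp": 2000}, secret)
print(verify_token(token, secret, now=1000))
print(verify_token(token, b"wrong-secret", now=1000))
print(verify_token(token, secret, now=3000))
```

The receiving app needs nothing but the key to decide whether to trust the token, which is what makes a shared session store unnecessary.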
NATS Event Bus
JetStream provides durable, at-least-once delivery across all apps. 9 subject hierarchies:
signal.* # Signal threat events
fusion.* # Fusion cross-domain events
razor.* # SIEM detections & alerts
v01d.* # Sentiment & Oracle updates
nexus.* # OSINT investigation events
kin0bi.* # Financial anomalies
id.* # Identity exposure events
v0id.* # Agent actions & findings
ecosystem.* # System-wide coordination
When Raz0r detects a suspicious process, it publishes to razor.alert. V0id agents subscribe and auto-triage. Signal enriches the IOC. Fusion correlates with geopolitical context. All without any app knowing about the others — just messages on a bus.
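The decoupling comes from subject matching: publishers name a subject, subscribers name a pattern. The in-memory sketch below imitates NATS wildcard rules (`*` matches exactly one token, `>` matches one or more trailing tokens) to show the fan-out. `MiniBus` is a hypothetical stand-in with no durability or network, and the subscriber names and message fields are invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, Dict], None]


def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style matching: '*' is one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens) or (p != "*" and p != s_tokens[i]):
            return False
    return len(p_tokens) == len(s_tokens)


class MiniBus:
    """Synchronous in-memory pub/sub, for illustration only."""

    def __init__(self) -> None:
        self._subscriptions: List[Tuple[str, Handler]] = []

    def subscribe(self, pattern: str, handler: Handler) -> None:
        self._subscriptions.append((pattern, handler))

    def publish(self, subject: str, message: Dict) -> int:
        """Deliver to every matching subscriber; return the delivery count."""
        delivered = 0
        for pattern, handler in self._subscriptions:
            if subject_matches(pattern, subject):
                handler(subject, message)
                delivered += 1
        return delivered


bus = MiniBus()
seen: List[Tuple[str, str]] = []
bus.subscribe("razor.*", lambda s, m: seen.append(("agent-triage", m["ioc"])))
bus.subscribe("razor.alert", lambda s, m: seen.append(("ioc-enrichment", m["ioc"])))
bus.subscribe("fusion.*", lambda s, m: seen.append(("unrelated", m["ioc"])))

count = bus.publish("razor.alert", {"ioc": "198.51.100.23", "process": "powershell.exe"})
print(count, seen)
```

Note that under these rules `razor.*` does not match a deeper subject such as `razor.alert.high`; a subscriber that wants the whole hierarchy would use `razor.>`.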
Galaxy Visualization
Signal's /galaxy/data endpoint samples ~8,000 nodes from the ML graph, groups them by label, and computes a 3D radial cluster layout. V01d's ninjaTONE page renders this as a Three.js point cloud — a galaxy of threats you can fly through, click, and explore. It's not just pretty. It's the entire threat landscape in one view.
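One plausible reading of "3D radial cluster layout" is: place one cluster centre per label on a ring, then scatter each label's nodes in a sphere around its centre. The sketch below implements that reading. The function name, radii, seeding, and node shape are assumptions for the example and may differ from what the endpoint actually computes.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def radial_cluster_layout(
    nodes: List[dict],
    sample_size: int = 8000,
    ring_radius: float = 100.0,
    cluster_radius: float = 20.0,
    seed: int = 0,
) -> Dict[int, Point]:
    """Sample nodes, group them by label, and assign 3D positions."""
    rng = random.Random(seed)
    sampled = nodes if len(nodes) <= sample_size else rng.sample(nodes, sample_size)

    by_label: Dict[str, List[dict]] = defaultdict(list)
    for node in sampled:
        by_label[node["label"]].append(node)

    positions: Dict[int, Point] = {}
    labels = sorted(by_label)
    for index, label in enumerate(labels):
        # Cluster centres sit evenly spaced on a ring in the z = 0 plane.
        angle = 2 * math.pi * index / len(labels)
        cx, cy = ring_radius * math.cos(angle), ring_radius * math.sin(angle)
        for node in by_label[label]:
            # Uniform random point inside a sphere around the centre.
            dx, dy, dz = (rng.gauss(0, 1) for _ in range(3))
            norm = math.sqrt(dx * dx + dy * dy + dz * dz) or 1.0
            r = cluster_radius * rng.random() ** (1 / 3)
            positions[node["id"]] = (
                cx + r * dx / norm,
                cy + r * dy / norm,
                r * dz / norm,
            )
    return positions


label_names = ["Indicator", "Software", "ThreatActor"]
nodes = [{"id": i, "label": label_names[i % 3]} for i in range(300)]
layout = radial_cluster_layout(nodes)
print(len(layout), layout[0])
```

The resulting id-to-coordinate map is the kind of payload a point-cloud renderer can consume directly.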
Subject to Change
This document describes the architecture as of March 2026. It will change. The ecosystem is alive — new ingesters, new ML modules, new analytical capabilities ship regularly. What won't change is the core philosophy: one graph, atomic claims, domain fusion.
The hardest problem in security isn't detection. It's connection. Every tool in this ecosystem exists to make one more connection visible — between an IOC and an actor, between an actor and a campaign, between a campaign and the geopolitical event that triggered it. When all those connections live in one graph, you stop reacting to alerts and start understanding adversaries.
The graph doesn't give you answers. It gives you the right questions — and the traversals to find them.