The Master Blueprint

Section I

From Noise to Knowledge

概念構造 — Conceptual Architecture

The internet is noisy. Every day: 60+ new CVEs published. Hundreds of IOC feeds updated. Geopolitical events shift threat actor priorities. Social media amplifies disinformation. Economic indicators correlate with ransomware surges. All of it is data. Almost none of it is intelligence — until something connects it.

The ninja.ing ecosystem exists to make that connection. Not through dashboards bolted together with REST APIs, but through a unified knowledge graph where a vulnerability node connects to the technique that exploits it, connects to the actor that uses it, connects to the campaign that deployed it, connects to the geopolitical event that motivated it.

Two platforms sit at the centre of this. Signal handles cyber threat intelligence — the adversaries, their tools, their techniques, and the indicators they leave behind. Fusion handles everything else — geopolitics, economics, social intelligence, environmental data, military movements — and fuses it with Signal's cyber picture.

The Intelligence Pipeline

Every piece of intelligence follows the same path, regardless of source:

Wild (raw feeds) → Ingest (20 + 84 sources) → Claim (atomic triples) → Graph (Neo4j MERGE) → Enrich (ML & analytics) → Present (83 windows)

The critical insight: Claims are the universal adapter. Any feed, any format — whether it's a JSON blob from the NVD, a STIX bundle from MITRE, a CSV from CISA, or a GDELT event stream — gets decomposed into atomic subject/predicate/object triples. A Claim says: "this thing has this relationship to that thing, and here's where I learned it."
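To make the decomposition concrete, here is a minimal sketch of one feed record being split into atomic triples. The record shape, the `Triple` type, the `decompose_report` helper, the `EXISTS` predicate for node-only assertions, and the `ExampleRAT` name are all assumptions made for illustration; only the subject/predicate/object/source idea comes from the text.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str           # "<Label>:<key>", e.g. "ThreatActor:APT28"
    predicate: str         # relationship name, e.g. "USES"
    obj: Optional[str]     # "<Label>:<key>", or None for a node-only assertion
    source: str            # where the assertion was learned


def decompose_report(report: dict, source: str) -> list[Triple]:
    """Split one hypothetical feed record into atomic triples."""
    actor = f"ThreatActor:{report['actor']}"
    # Node-only assertion: the actor exists, with no object attached.
    triples = [Triple(actor, "EXISTS", None, source)]
    for technique in report.get("techniques", []):
        triples.append(Triple(actor, "USES", f"Technique:{technique}", source))
    malware = report.get("malware")
    for cve in report.get("cves", []):
        if malware:
            triples.append(
                Triple(f"Software:{malware}", "EXPLOITS", f"Vulnerability:{cve}", source)
            )
    return triples


# Illustrative record only; not real feed data.
report = {
    "actor": "APT28",
    "techniques": ["T1566", "T1059"],
    "malware": "ExampleRAT",
    "cves": ["CVE-2024-1234"],
}
triples = decompose_report(report, source="example_feed")
for t in triples:
    print(t)
```

Whatever the input format, the output has one shape, which is what lets every downstream stage ignore where the data came from.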

Signal: The Cyber Layer

20 ingesters pull from the core CTI feeds. NVD for vulnerabilities. MITRE ATT&CK for techniques and threat actors. OTX for IOCs. CISA KEV for known exploited vulnerabilities. Abuse.ch suite (MalwareBazaar, ThreatFox, Feodo, URLhaus) for malware samples and C2 infrastructure. GitHub Advisories. Phishing feeds. Ransomware trackers. CIRCL for intelligence sharing. OpenCTI for STIX import.

Signal's graph holds ~1M nodes across 14 labels: 89K+ indicators, 48K+ software entries, 21K+ infrastructure nodes, 1,900+ vulnerabilities, 1,200+ techniques, 245+ threat actors, and 40+ campaigns. All connected by edges that encode how the threat landscape actually works.

Fusion: The Everything Layer

84 ingesters across eight domains. Not just cyber — geopolitical events (GDELT, ACLED, GTD), economic indicators (FRED, IMF, World Bank), social intelligence (Reddit, Twitter, Mastodon, Telegram, Bluesky), environmental data (NASA FIRMS, GDACS, EMDAT), military and sanctions data (SIPRI, OFAC SDN, OpenSanctions), health (WHO), and technology (Shodan, OpenSky, AIS maritime).

Why does a threat intelligence platform need economic data? Because ransomware surges correlate with cryptocurrency prices. Because sanctions drive state-sponsored actors to find new funding. Because a geopolitical crisis in one region predicts cyber campaigns against targets in another. Fusion makes those connections visible.

Section II

The Claims Engine

論理構造 — Logical Architecture

Every ingester, every source, every feed — they all produce the same thing: Claims. A Claim is the atomic unit of intelligence. It's a frozen, immutable dataclass that says: "Subject X has Predicate relationship to Object Y, sourced from Z with confidence C."

The Claim Dataclass

core/claims.py
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass(frozen=True)
class Claim:
    subject_type:   str           # "ThreatActor", "Vulnerability", ...
    subject_key:    str           # "APT28", "CVE-2024-1234", ...
    predicate:      str           # "USES", "EXPLOITS", "TARGETS", ...

    object_type:    Optional[str] # None for node-only claims
    object_key:     Optional[str]

    first_seen:     Optional[str]
    last_seen:      Optional[str]

    source_name:    str   = "unknown"
    source_url:     Optional[str] = None
    source_item_id: Optional[str] = None
    confidence:     float = 0.7

    subject_props:  Optional[Dict] = None  # Merge onto subject node
    edge_props:     Optional[Dict] = None  # Merge onto relationship

14 fields. Frozen. Immutable. Two Claims from different sources about the same fact will merge into a single graph edge, preserving the earliest first_seen and updating last_seen. Provenance is never lost — the source_name, source_url, and confidence travel with every assertion.
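The merge behaviour described above can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's code: `MiniClaim` is a trimmed-down stand-in for `Claim`, `merge_claims` is a hypothetical helper, timestamps are assumed to be ISO-8601 strings in one format (so string comparison is chronological), and keeping the highest confidence is a choice made here, not something the text specifies.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class MiniClaim:
    """Trimmed-down stand-in for Claim; only the fields needed here."""
    subject_key: str
    predicate: str
    object_key: str
    first_seen: Optional[str]
    last_seen: Optional[str]
    source_name: str = "unknown"
    confidence: float = 0.7


def merge_claims(claims: list[MiniClaim]) -> dict:
    """Fold several claims about the same fact into one edge record."""
    firsts = [c.first_seen for c in claims if c.first_seen]
    lasts = [c.last_seen for c in claims if c.last_seen]
    return {
        "edge": (claims[0].subject_key, claims[0].predicate, claims[0].object_key),
        "first_seen": min(firsts) if firsts else None,   # earliest sighting wins
        "last_seen": max(lasts) if lasts else None,      # latest sighting wins
        # Keep every source instead of overwriting, so provenance survives.
        "sources": sorted({c.source_name for c in claims}),
        "confidence": max(c.confidence for c in claims),  # assumption, see lead-in
    }


# Illustrative dates and sources only.
a = MiniClaim("APT28", "USES", "T1566", "2023-05-01", "2023-06-01", "mitre_attack", 0.9)
b = MiniClaim("APT28", "USES", "T1566", "2022-11-15", "2024-01-10", "otx", 0.6)
edge = merge_claims([a, b])
print(edge)

try:
    a.confidence = 1.0   # frozen dataclass: assignment is rejected
except FrozenInstanceError:
    print("claims are immutable")
```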

Graph Merge Strategy

Claims don't INSERT into the graph. They MERGE. The ingestion engine batches Claims in groups of 500, groups them by schema pattern (subject_type, predicate, object_type), and fires a single UNWIND Cypher query per group:

Neo4j Merge Pattern
UNWIND $batch AS row
MERGE (s:ThreatActor {name: row.subject_key})
  ON CREATE SET s.first_seen = row.first_seen
  ON MATCH SET  s.last_seen  = row.last_seen
MERGE (o:Technique {name: row.object_key})
MERGE (s)-[r:USES]->(o)
  SET r.confidence = row.confidence,
      r.source     = row.source_name
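The batching and grouping step can be sketched as follows, assuming claims arrive as plain dict rows. The helper names (`batched`, `group_by_pattern`, `merge_query`) are hypothetical, and node-only claims (no object) are left out to keep the sketch short. Labels and relationship types are interpolated into the query text here, so in practice they should come from an allow-list rather than raw feed data.

```python
from collections import defaultdict
from itertools import islice
from typing import Iterable, Iterator

BATCH_SIZE = 500  # batch size stated in the text


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def group_by_pattern(batch: list[dict]) -> dict[tuple, list[dict]]:
    """Group claim rows by (subject_type, predicate, object_type)."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for row in batch:
        groups[(row["subject_type"], row["predicate"], row["object_type"])].append(row)
    return dict(groups)


def merge_query(subject_type: str, predicate: str, object_type: str) -> str:
    """Build the UNWIND/MERGE statement for one schema pattern."""
    return (
        "UNWIND $batch AS row\n"
        f"MERGE (s:{subject_type} {{name: row.subject_key}})\n"
        "  ON CREATE SET s.first_seen = row.first_seen\n"
        "  ON MATCH SET  s.last_seen  = row.last_seen\n"
        f"MERGE (o:{object_type} {{name: row.object_key}})\n"
        f"MERGE (s)-[r:{predicate}]->(o)\n"
        "  SET r.confidence = row.confidence,\n"
        "      r.source     = row.source_name"
    )


# 700 illustrative rows: 600 of one pattern, 100 of another.
rows = [
    {"subject_type": "ThreatActor", "predicate": "USES", "object_type": "Technique",
     "subject_key": f"actor-{i}", "object_key": "T1566"}
    for i in range(600)
] + [
    {"subject_type": "Software", "predicate": "EXPLOITS", "object_type": "Vulnerability",
     "subject_key": f"tool-{i}", "object_key": "CVE-2024-1234"}
    for i in range(100)
]

# One (query, rows) pair per schema pattern per batch.
queries = [
    (merge_query(*pattern), group)
    for batch in batched(rows, BATCH_SIZE)
    for pattern, group in group_by_pattern(batch).items()
]
print(len(queries))
```

Each pair would then be sent as one parameterised query with the group's rows bound to `$batch`.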

Deadlocks are retried with backoff: up to 3 attempts, sleeping 2 × the attempt number in seconds between tries (2 s, then 4 s). That schedule is linear rather than exponential. Non-deadlock errors fail immediately. This handles Neo4j's transaction contention when multiple ingesters run concurrently.
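A minimal sketch of that retry policy, under a few assumptions: `DeadlockError` is a hypothetical stand-in for whatever exception the driver raises on a deadlock, `run_with_deadlock_retry` is an illustrative name, and no sleep happens after the final failed attempt (the text does not say either way). The `sleep` parameter is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock exception."""


def run_with_deadlock_retry(
    work: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `work`, retrying only on DeadlockError.

    Sleeps base_delay * attempt seconds between tries (2 s, then 4 s with the
    defaults). Any other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return work()
        except DeadlockError:
            if attempt == attempts:
                raise                      # out of attempts: surface the error
            sleep(base_delay * attempt)
    raise RuntimeError("attempts must be >= 1")


# Simulate a write that deadlocks twice and then succeeds.
calls = {"n": 0}


def flaky_merge() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError("simulated deadlock")
    return "merged"


delays: list[float] = []
result = run_with_deadlock_retry(flaky_merge, sleep=delays.append)
print(result, delays)
```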

Node Labels

Signal uses 14 node labels. Each represents a first-class entity in the threat intelligence domain:

Label	Count	Role
`Indicator`	~89,000	IOCs — hashes, IPs, domains, URLs
`Software`	~48,000	Malware families, tools, legitimate software
`Infrastructure`	~21,000	C2 servers, hosting providers, ASNs
`Vulnerability`	~1,900	CVEs with CVSS, EPSS, KEV status
`Technique`	~1,200	MITRE ATT&CK techniques & sub-techniques
`ThreatActor`	~245	APT groups, cybercrime orgs, hacktivists
`Campaign`	~40	Named operations & attack campaigns
`Mitigation`	—	MITRE mitigations & defensive measures
`Source`	—	Intelligence feed provenance nodes
`Event`	—	Discrete security events
`EventSummary`	—	Aggregated event timelines
`Alert`	—	Generated alerts from detection rules
`DetectionRule`	—	KQL, Sigma, YARA rules
`TelemetrySource`	—	Log sources & data collection points

Relationship Types

Edges are typed. Each encodes a specific semantic relationship:

ThreatActor —USES→ Technique

ThreatActor —ATTRIBUTED_TO→ Campaign

Software —EXPLOITS→ Vulnerability

Technique —TARGETS→ Software

Indicator —INDICATES→ Software

Mitigation —MITIGATES→ Technique

Campaign —USES→ Infrastructure
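Because edges are typed, a claim can be checked against the schema before it reaches the graph. The sketch below encodes exactly the seven relationships listed above; the `ALLOWED_EDGES` set and both helper functions are hypothetical names, and whether Signal validates claims this way is an assumption.

```python
# The seven typed relationships listed above, as (subject, predicate, object).
ALLOWED_EDGES = {
    ("ThreatActor", "USES", "Technique"),
    ("ThreatActor", "ATTRIBUTED_TO", "Campaign"),
    ("Software", "EXPLOITS", "Vulnerability"),
    ("Technique", "TARGETS", "Software"),
    ("Indicator", "INDICATES", "Software"),
    ("Mitigation", "MITIGATES", "Technique"),
    ("Campaign", "USES", "Infrastructure"),
}


def is_valid_edge(subject_type: str, predicate: str, object_type: str) -> bool:
    """True if the triple matches one of the known schema patterns."""
    return (subject_type, predicate, object_type) in ALLOWED_EDGES


def predicates_from(subject_type: str) -> list[tuple[str, str]]:
    """List the (predicate, object_type) pairs that may leave a given label."""
    return sorted((p, o) for s, p, o in ALLOWED_EDGES if s == subject_type)


print(is_valid_edge("ThreatActor", "USES", "Technique"))   # known pattern
print(is_valid_edge("Indicator", "USES", "Technique"))     # not in the schema
print(predicates_from("ThreatActor"))
```

Note that `USES` appears with two different endpoint pairs, so the predicate alone does not identify a pattern; the full triple does.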

Fusion’s Extended Schema

Fusion extends the schema with 20+ node labels to cover cross-domain entities: SocialPost, EconomicIndicator, GeopoliticalEvent, Country, Organization, SanctionedEntity, and more. Cross-domain edges connect a geopolitical crisis to the cyber campaigns it spawns, or an economic shock to the ransomware surge that follows.

Section III

Signal — Technical Architecture

信号 — Cyber Threat Intelligence

Backend: `adversary_graph_app.py`

One FastAPI application. 225 endpoints — 151 GET, 62 POST, 5 PATCH, 4 DELETE, 3 WebSocket. Organised by domain: graph queries, ML analytics, twins (digital adversary profiles), threat attribution, KQL generation, process mining, causal inference, semantic search, geospatial analysis, CTI extraction, briefing generation, and more.

Key design decision: one file, one process. No microservices splitting. The intelligence domain is deeply interconnected — a risk scoring endpoint needs access to the graph, the ML cache, the twin profiles, and the semantic index. Splitting that into services would add network hops and serialisation overhead for zero architectural benefit.

Core Modules

Signal ships with 32 Python modules in core/. Each handles a distinct analytical capability:

ML Engine (core)

ml.py — Risk propagation, community detection (Louvain), link prediction, centrality analysis, GDS graph projections, anomaly scoring.

gat.py — Graph Attention Networks for node classification.

graphsage.py — GraphSAGE inductive node embeddings.

Semantic Search (A-tier)

semantic.py — Hybrid search combining LanceDB vector store with Neo4j fulltext. TF-IDF fallback when embedding models aren't available. Indexes all 1M+ nodes.

Geospatial (A-tier)

geo.py — H3 hexagonal heatmaps. 99 country centroid database. APT actor geographic overlay with threat density calculations.

Causal Inference (A-tier)

causal.py — DoWhy-based causal analysis. 4 CTI scenarios: mitigation effectiveness, technique adoption drivers, infrastructure impact, IOC correlation.

CTI Extraction (A-tier)

extraction.py — LLM-powered (Claude API) entity extraction from unstructured text. Regex fallback. Entity review queue. Direct graph commit.

ORIGAMI Attribution (A-tier)

attribution.py — Multi-source threat actor origin analysis. Infrastructure tracing, temporal clock analysis, TTP fingerprint matching (weighted Jaccard), evidence fusion with Diamond Model output.
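Weighted Jaccard is the standard generalisation of set overlap to weighted items: the sum of per-item minimums divided by the sum of per-item maximums. The sketch below applies it to two TTP profiles. The technique weights and the `weighted_jaccard` helper are illustrative assumptions; how attribution.py actually weights techniques is not stated here.

```python
def weighted_jaccard(a: dict[str, float], b: dict[str, float]) -> float:
    """Weighted Jaccard similarity: sum of minimums over sum of maximums.

    Missing keys count as weight 0. Returns 0.0 when both profiles are empty.
    """
    keys = a.keys() | b.keys()
    numerator = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    denominator = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return numerator / denominator if denominator else 0.0


# Illustrative weights only: technique ID -> importance in the profile.
observed = {"T1566": 1.0, "T1059": 0.5, "T1071": 0.5}
known_actor = {"T1566": 1.0, "T1059": 1.0}

score = weighted_jaccard(observed, known_actor)
print(round(score, 3))
```

A score of 1.0 means identical weighted profiles and 0.0 means no shared techniques, so scores are comparable across candidate actors.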

Adversary DNA (A-tier)

adversary_dna.py — 18-dimensional behavioral fingerprinting from access logs. Temporal entropy, velocity, method entropy, IRT stats. Archetyping: scanner, brute forcer, researcher, bot, targeted operator.
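Entropy features such as method entropy can be computed with plain Shannon entropy over the observed values. This is a sketch under assumptions: the exact feature definitions in adversary_dna.py are not given here, and `shannon_entropy` is an illustrative helper.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_entropy(values: Iterable[Hashable]) -> float:
    """Shannon entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1/p) over observed values.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# A client that only ever issues GET requests: no variety at all.
single_method = shannon_entropy(["GET"] * 8)

# A client that spreads requests evenly over four methods.
mixed_methods = shannon_entropy(["GET", "POST", "PUT", "DELETE"] * 2)

print(single_method, mixed_methods)
```

The same function works for a temporal feature if the inputs are, for example, hour-of-day buckets of request timestamps.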

Event Bus (infra)

bus.py — NATS JetStream pub/sub. 9 subject hierarchies covering Signal, Fusion, Raz0r, V01d, Nexus, Kin0bi, 1D, V0id, and ecosystem-wide events.

S-Tier Modules

federated.py — Federated threat intelligence sharing.

org_twin.py — Organisational digital twin modelling.

causal_rl.py — Reinforcement learning for causal response.

cascade.py — Cascade failure prediction.

neuromorphic.py — Neuromorphic graph processing.

Supporting Modules

twins.py — Adversary digital twin profiles, Monte Carlo simulation, playbook & wargame generation.

process_mining.py — Attack process flow discovery from event sequences.

kql.py — KQL detection rule generation for Microsoft Sentinel.

graph.py — Neo4j adapter, connection pooling, query helpers.

Frontend: 37 Floating Windows

Signal's UI is a Next.js 16 / React 19 application with a custom window manager. Not tabs. Not pages. Floating, draggable, resizable windows — like a desktop OS for threat intelligence. The user can arrange graph views, ML dashboards, twin profiles, and causal analysis side by side.

Category	Windows
Graph & Visualisation	Graph, Theatre (3D), Galaxy, Heatmap
ML & Analytics	Risk, Communities, Predict, Centrality, Emergent
Intelligence	Twins, Wargame, Attribution, Causal, DNA, Diff, Briefing
Search & Extraction	Search (Spektr), Extract, Workbench
Detection	KQL, SIEM, Hunting, Emulation, Process Mining
Infrastructure	Traffic, Event Bus, Telemetry, DataLab
Admin	Admin, Settings, Users, Audit

Auth Model

File-based JSON user store. Passwords hashed with bcrypt. JWTs signed with jose. Middleware enforcement in middleware.ts — every route is protected except explicit public paths. MFA gate available. Role-based admin access. SSO token exchange for cross-app authentication.

Sample API Endpoints

Selected Endpoints (225 total)
# Graph Intelligence
GET  /graph/stats          # Node/edge counts by label
GET  /graph/actors         # All threat actors with metrics
POST /search/semantic      # Hybrid vector + fulltext search

# ML Pipeline
GET  /ml/risk              # Risk-propagated scores
GET  /ml/communities       # Louvain community detection
GET  /ml/predict           # Link prediction (future edges)

# Adversary Profiling
GET  /twins/profile/{name} # Digital twin behavioural model
POST /twins/wargame        # Monte Carlo actor vs. defence sim
GET  /attribution/{actor}  # ORIGAMI multi-source attribution

# Real-time
WS   /ws/threats           # Live threat feed stream
WS   /ws/chat              # Niko AI assistant

# Generation
GET  /kql/generate         # KQL detection rules
GET  /briefing/generate    # CISO briefing document
POST /extract              # LLM CTI entity extraction

Section IV

Fusion — Technical Architecture

融合 — Cross-Domain Intelligence

Backend: `fusion_app.py`

FastAPI with 107 endpoints. Where Signal is deep on cyber, Fusion is wide across domains. The same Claims engine, the same graph store, but pointing at a much broader universe of data — and a set of analytical modules designed to find the connections between domains that no single-domain tool would ever surface.

84 Ingesters Across 8 Domains

Cyber & Vulnerability (14)

NVD CVEs, CISA KEV, OTX, EPSS, ThreatFox, URLhaus, Feodo, MalwareBazaar, CrowdSec, Phishtank, OpenPhish, CIRCL MISP, GitHub Advisories, Exploit-DB

Geopolitical (18)

GDELT, ACLED, GTD, SIPRI arms transfers, GPI, INFORM Risk, OpenSanctions, OFAC SDN, ReliefWeb, FEWS NET, ND-GAIN, GPR Index, World Bank WGI, UNHCR, V-Dem, RSS (geopolitical)

Economic (9)

FRED (macro + GSCPI), IMF WEO, World Bank, WTO trade, commodity prices, ILOSTAT, UN Comtrade

Social Intelligence (8)

Twitter/X, Reddit, Mastodon, Telegram, Bluesky, RSS, Google Trends, Mastodon Trending

Environmental & Health (7)

NASA FIRMS (wildfires), GDACS, EM-DAT, natural disasters, Safecast (radiation), WHO outbreaks, WHO GHO

Military & Specialty (6)

Shodan intel, AIS maritime (AISstream), OpenSky flights, software registries (NPM, PyPI, GitHub)

Detection & Standards (3)

MITRE ATT&CK TAXII, NIST CPE dictionary, EPSS probability scores

Core Modules

15 modules in core/, with analytical capabilities tuned for cross-domain fusion:

narrative.py: TF-IDF + DBSCAN clustering, coordination scoring, LLM labels, reality divergence

fusion_ml.py: Cross-domain ML algorithms, hidden connections, signal correlations

twins.py: Adversary digital twins, Monte Carlo kill-chain simulation, wargaming

emergent.py: 7 detectors: TTP convergence, infra overlap, community drift, velocity anomaly, cascade, cross-domain bridges, prediction materialisation

datalab.py: Self-service graph CRUD, bulk import/export, saved queries, audit logging

forecast.py: Multi-horizon forecasting, scenario planning, sector analysis

process_mining.py: Attack flow discovery, conformance checking, dwell time analysis

ioc_extract.py: IOC pattern extraction from unstructured text

predictions.py: Prediction engine with historical tracking & verification

schema.py: Neo4j schema definitions & constraint management

Narrative Clustering

Fusion's signature analytical capability. narrative.py ingests social posts from all platforms, vectorises them with TF-IDF (5,000 features, bigram support), clusters with DBSCAN, and runs coordination scoring to detect information operations. An LLM labels each cluster's theme, then the engine compares narratives against GDELT ground-truth events for a reality divergence score — how far is the online narrative drifting from what's actually happening?

Security Scanner

Built-in web scanner orchestrating Nuclei (template-based vulnerability scanning), testssl.sh (TLS assessment), and httpx (HTTP probing). Findings write back to the graph as SecurityFinding nodes linked to Domain and Vulnerability nodes. Single-scan queueing with thread-safe locking and cancellation support.

Frontend: Cross-Domain Dashboard

Next.js 16 / React 19, but with a different approach from Signal's window manager. Fusion uses an AppShell with sidebar navigation: globe view, threat dashboard, social intelligence, narrative analysis, DataLab, scanner, twins, forecasting. Day/night mode toggle (warm paper palette in day, dark stealth in night). There is no server-side middleware; auth is handled client-side via the AppShell component.

Sample API Endpoints

Selected Endpoints (107 total)
# Cross-Domain Analysis
GET  /ml/narrative-clusters      # Social narrative clustering
GET  /ml/narrative-divergence     # Reality vs narrative drift
GET  /ml/cross-domain-anomalies   # Cross-domain anomaly detection
GET  /ml/hidden-connections       # Latent graph relationships
GET  /ml/mega-risks              # Compound multi-domain risks

# Intelligence Products
GET  /intel/sitrep               # Situation report
GET  /briefing                   # Daily intelligence briefing
GET  /actor/{name}/dossier       # Full actor dossier
GET  /globe/data                 # 3D globe risk overlay

# Social Intelligence
GET  /social/feed                # Cross-platform social feed
GET  /social/entity/{type}/{key} # Entity social footprint
POST /niko/chat                  # Niko AI analyst assistant

# Scanner
POST /scanner/scan               # Launch web security scan
GET  /scanner/results            # Scan findings

# Forecasting
GET  /forecast/{horizon}         # Multi-horizon threat forecast
POST /forecast/scenario          # Scenario planning

Section V

The Stealth Stack

基盤 — Infrastructure

Container Topology

Each app follows the same Docker Compose pattern: Neo4j 5 + FastAPI backend + Next.js frontend. Signal adds Caddy (reverse proxy for all domains) and NATS (event bus). Some apps add Redis for real-time features.

Container	Role	Details
Caddy	Reverse Proxy	HTTPS termination, 11 domains, security shield
Next.js	UI Container	React 19 SSR, auth middleware, API routes
FastAPI	API Container	Python 3.14, 225+ endpoints, WebSocket
Neo4j 5	Graph Database	1M+ nodes, Bolt protocol, APOC & GDS
NATS	Event Bus	JetStream, 9 subjects, cross-app pub/sub

Caddy Routing

One Caddy instance in Signal's Docker Compose handles HTTPS termination for all 11 production domains. The routing logic is subtle and ordering matters:

Caddy Routing Logic (simplified)
# 1. Next.js API routes go to UI container
handle /api/auth/*     → ui:3000
handle /api/admin/*    → ui:3000
handle /api/sso/*      → ui:3000

# 2. All other /api/* go to Python backend
handle_path /api/*     → api:18011  # strips /api prefix

# 3. Everything else goes to Next.js
handle /*              → ui:3000

Critical detail: New Next.js API routes must be added to the Caddy config before the handle_path /api/* catch-all, or they'll be incorrectly routed to the Python backend.
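The ordering problem is easy to see in a small model of the simplified listing. This is not Caddy's implementation, only a first-match-wins sketch of the rules as written above; the `ROUTES` table and `route` function are hypothetical names, and `/api/reports/weekly` is a made-up route.

```python
# (path prefix, upstream, strip prefix?) in the order given in the listing.
ROUTES = [
    ("/api/auth/", "ui:3000", False),
    ("/api/admin/", "ui:3000", False),
    ("/api/sso/", "ui:3000", False),
    ("/api/", "api:18011", True),    # handle_path: the /api prefix is stripped
    ("/", "ui:3000", False),
]


def route(path: str) -> tuple[str, str]:
    """Return (upstream, forwarded path) for the first matching rule."""
    for prefix, upstream, strip in ROUTES:
        if path.startswith(prefix):
            if strip:
                path = "/" + path[len(prefix):]
            return upstream, path
    raise ValueError(f"no route for {path!r}")


print(route("/api/auth/login"))      # Next.js API route stays on the UI
print(route("/api/graph/stats"))     # backend route, prefix stripped
print(route("/dashboard"))           # everything else goes to the UI

# A new Next.js API route that nobody added ahead of the catch-all:
print(route("/api/reports/weekly"))  # lands on the Python backend
```

The last call shows the failure mode: without its own rule above the catch-all, the new route is sent to the backend with its prefix removed.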

Security Shield

All 11 domains import a shared Caddy (security_shield) snippet:

Security Shield Rules
# Block common attack patterns
.git/*         → 404     # VCS probe
*.php          → 404     # PHP probe
wp-*           → 404     # WordPress probe
Empty UA       → abort   # Drop connection

# Response headers
Content-Security-Policy: default-src 'self' ...
Permissions-Policy: camera=(), microphone=() ...
X-Content-Type-Options: nosniff
Request body limit: 10MB

fail2ban

Three jails watching Caddy access logs:

Jail	Trigger	Ban Duration
`caddy-scanner`	5 blocked paths in 10 min	24 hours
`caddy-auth`	10 auth failures in 5 min	1 hour
`caddy-aggressive`	50 404s in 5 min	12 hours
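The thresholds in the table amount to a sliding-window counter per client IP. The sketch below illustrates those numbers only; it is not how fail2ban is implemented, and the `Jail` and `JailTracker` names are assumptions. Timestamps are plain seconds so the behaviour is deterministic.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Jail:
    name: str
    max_hits: int    # matching log lines needed to trigger a ban
    window_s: int    # look-back window in seconds
    ban_s: int       # ban duration in seconds


# Values taken from the table above.
JAILS = {
    "caddy-scanner": Jail("caddy-scanner", 5, 10 * 60, 24 * 3600),
    "caddy-auth": Jail("caddy-auth", 10, 5 * 60, 1 * 3600),
    "caddy-aggressive": Jail("caddy-aggressive", 50, 5 * 60, 12 * 3600),
}


class JailTracker:
    """Tracks hits per IP for one jail and decides when to ban."""

    def __init__(self, jail: Jail) -> None:
        self.jail = jail
        self.hits: dict[str, deque] = defaultdict(deque)
        self.banned_until: dict[str, float] = {}

    def record(self, ip: str, ts: float) -> bool:
        """Record one matching log line; return True if `ip` is banned at `ts`."""
        if self.banned_until.get(ip, 0.0) > ts:
            return True
        window = self.hits[ip]
        window.append(ts)
        # Drop hits that have fallen out of the look-back window.
        while window and window[0] <= ts - self.jail.window_s:
            window.popleft()
        if len(window) >= self.jail.max_hits:
            self.banned_until[ip] = ts + self.jail.ban_s
            window.clear()
            return True
        return False


tracker = JailTracker(JAILS["caddy-scanner"])
# Five blocked-path hits, one per minute, from a documentation-range IP.
results = [tracker.record("203.0.113.7", t) for t in (0, 60, 120, 180, 240)]
print(results)
print(tracker.record("203.0.113.7", 3600))            # still inside the 24 h ban
print(tracker.record("203.0.113.7", 240 + 86400 + 1))  # ban has expired
```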

Network Architecture

All apps run on a single Hetzner dedicated server. Containers outside RTM (rapid-threat-modeler, the Compose project that runs the shared Caddy proxy) join its Docker network, rapid-threat-modeler_default, to reach that proxy. Each app has its own Neo4j instance on a unique Bolt port. In production, Neo4j and API ports are not exposed to the host; only Caddy's 443 is public.

Port Allocation
# API Ports (internal only in prod)
Signal:  18011    Fusion:  18012    Raz0r:   18013
ANTOS:   18014    Kin0bi:  18015    Nexus:   18016
1D:      18017    V01d:    18018    V0id:    18019
Range:   18020    Knox:    18021    Social:  18022
War Room:18023

# Neo4j Bolt Ports (internal only in prod)
Signal:  17687    Fusion:  17688    Raz0r:   17689
Kin0bi:  17690    Nexus:   17691    1D:      17692
V01d:    17693    V0id:    17694    Range:   17695
Social:  17696    War Room:17697

# Public (Caddy)
HTTPS:   443

Deployment Pattern

Every deploy follows the same sequence. No CI/CD pipeline — deliberate simplicity:

Deploy Sequence
# 1. Push code
git push

# 2. SSH, pull, rebuild
ssh root@server
cd /opt/{app}
git pull
docker compose -f docker-compose.yml \
  -f docker-compose.prod.yml \
  up --build -d api ui

# 3. If non-RTM app: restart Caddy from RTM dir
cd /opt/rapid-threat-modeler
docker compose -f docker-compose.yml \
  -f docker-compose.prod.yml \
  restart caddy

Section VI

The Full Arsenal

全体系 — Ecosystem

15 systems. One SSO. One event bus. One graph mindset. Each built for a specific intelligence domain, all designed to share context through the graph and NATS.

#	System	Domain	One-liner
1	Signal	CTI	Threat graph, 225 endpoints, 32 ML modules, 83 windows
2	Fusion	Cross-domain	84 ingesters across 8 domains, narrative clustering, scanner
3	Raz0r	SIEM	Rust EDR agent, ransomware predictor, cross-node correlator
4	ANTOS	UX	Embedded at /antos — unified analyst desktop
5	Kin0bi	Financial	Real-time crypto/stocks/forex, anomaly detection, portfolio risk
6	Nexus	OSINT	Suspicion propagation, money flow, UBO resolution, sanctions
7	1D	Identity	BloodHound/LDAP/Azure AD graph, attack paths, kerberoastable
8	V01d	Sentiment	GDELT/RSS/Reddit/FRED pipeline, Oracle score, ninjaTONE
9	V0id	Agents	3 autonomous agents (Sentinel, Warden, Spectre), IR playbooks
10	Los Alamos	Wargaming	Red vs Blue agentic range, LLM-driven adversaries, ELO scoring
11	Knox	Secrets	Vault, crypto toolkit, privacy engine, TOTP authenticator
12	Social	Collaboration	TI messaging, IOC auto-detect, encrypted channels, NATS feed
13	War Room	IR	LiveKit video, shared timelines, IOC panel, breach tracker
14	NinjaClaw	CLI	Hardened CLI agent, 10 scanners, CIS rules, Signal intel link
15	GITAIR	DevSecOps	Git security scanning & air-gapped repository management

SSO: One Identity Everywhere

Every app implements the same SSO handshake. When a user is authenticated in Signal and clicks through to Fusion, the flow is:

Signal JWT cookie → /api/auth/sso (generate token) → Redirect ?sso_token=... → Fusion /api/auth/sso-exchange → Set cookie ninja-fusion-token

Each app has its own cookie name to avoid conflicts. The JWT payload is verified server-side. No shared session store — just cryptographic trust.
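The exchange can be sketched with nothing but the standard library. Everything specific here is an assumption made for illustration: the shared HMAC secret, the 60-second lifetime, the payload fields, the `issue_sso_token` and `exchange_sso_token` helpers, and the `ninja-<app>-token` naming pattern (only `ninja-fusion-token` appears in the text). The real apps sign JWTs with jose; this sketch only shows the shape of "sign on one side, verify on the other, no shared session store".

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-shared-secret"  # assumption: both apps hold the same key


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode(), hashlib.sha256).digest())


def issue_sso_token(user: str, audience: str, now: Optional[float] = None,
                    ttl_s: int = 60) -> str:
    """Source app: mint a short-lived token for one target app."""
    now = time.time() if now is None else now
    claims = {"sub": user, "aud": audience, "exp": now + ttl_s}
    payload = _b64(json.dumps(claims).encode())
    return f"{payload}.{_sign(payload)}"


def exchange_sso_token(token: str, audience: str,
                       now: Optional[float] = None) -> dict:
    """Target app: verify the token and describe the cookie to set."""
    now = time.time() if now is None else now
    payload, _, signature = token.partition(".")
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        raise ValueError("bad signature")
    claims = json.loads(_unb64(payload))
    if claims["aud"] != audience:
        raise ValueError("token was issued for another app")
    if claims["exp"] < now:
        raise ValueError("token expired")
    return {"cookie_name": f"ninja-{audience}-token", "sub": claims["sub"]}


token = issue_sso_token("analyst", "fusion", now=1000.0)
session = exchange_sso_token(token, "fusion", now=1010.0)
print(session)
```

A production version would also make tokens single-use; that is left out here.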

NATS Event Bus

JetStream provides durable, at-least-once delivery across all apps. 9 subject hierarchies:

NATS Subjects
signal.*     # Signal threat events
fusion.*     # Fusion cross-domain events
razor.*      # SIEM detections & alerts
v01d.*       # Sentiment & Oracle updates
nexus.*      # OSINT investigation events
kin0bi.*     # Financial anomalies
id.*         # Identity exposure events
v0id.*       # Agent actions & findings
ecosystem.*  # System-wide coordination

When Raz0r detects a suspicious process, it publishes to razor.alert. V0id agents subscribe and auto-triage. Signal enriches the IOC. Fusion correlates with geopolitical context. All without any app knowing about the others — just messages on a bus.
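The decoupling rests on subject matching. In NATS, `*` matches exactly one dot-separated token and `>` matches one or more trailing tokens. The sketch below implements that matching and a tiny in-memory bus to show the fan-out; `MiniBus` is a hypothetical stand-in with no durability or delivery guarantees, and the subscriber names are illustrative.

```python
from typing import Callable

Handler = Callable[[str, dict], None]


def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match: '*' is one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            return len(s_tokens) > i     # needs at least one remaining token
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


class MiniBus:
    """In-memory stand-in for a pub/sub bus."""

    def __init__(self) -> None:
        self._subs: list[tuple[str, Handler]] = []

    def subscribe(self, pattern: str, handler: Handler) -> None:
        self._subs.append((pattern, handler))

    def publish(self, subject: str, message: dict) -> None:
        for pattern, handler in self._subs:
            if subject_matches(pattern, subject):
                handler(subject, message)


log: list[tuple[str, str]] = []
bus = MiniBus()
bus.subscribe("razor.*", lambda s, m: log.append(("v0id-triage", s)))
bus.subscribe("razor.alert", lambda s, m: log.append(("signal-enrich", s)))
bus.subscribe("fusion.*", lambda s, m: log.append(("fusion-only", s)))

# The publisher knows the subject, not the subscribers.
bus.publish("razor.alert", {"process": "example.exe"})
print(log)
```

One consequence worth knowing: `razor.*` does not match `razor.alert.high`; a subscriber that wants deeper hierarchies needs `razor.>`.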

Galaxy Visualization

Signal's /galaxy/data endpoint samples ~8,000 nodes from the ML graph, groups them by label, and computes a 3D radial cluster layout. V01d's ninjaTONE page renders this as a Three.js point cloud — a galaxy of threats you can fly through, click, and explore. It's not just pretty. It's the entire threat landscape in one view.

Section VII

Subject to Change

進化 — Evolution

This document describes the architecture as of March 2026. It will change. The ecosystem is alive — new ingesters, new ML modules, new analytical capabilities ship regularly. What won't change is the core philosophy: one graph, atomic claims, domain fusion.

The hardest problem in security isn't detection. It's connection. Every tool in this ecosystem exists to make one more connection visible — between an IOC and an actor, between an actor and a campaign, between a campaign and the geopolitical event that triggered it. When all those connections live in one graph, you stop reacting to alerts and start understanding adversaries.

The graph doesn't give you answers. It gives you the right questions — and the traversals to find them.
