Insight³ — Technical Deep Dive — March 2026

Intelligence Mesh: Value Engineering Assessment

Technical architecture, algorithm inventory, data pipeline analysis, and IP differentiation assessment for the ninja.ing intelligence ecosystem. Prepared for prospective partners, acquirers, and technical evaluators.

1. Ecosystem Overview

The ninja.ing ecosystem comprises 17 operational intelligence platforms sharing a unified Neo4j graph database containing 1,000,000+ nodes and their relationships. The platforms span six intelligence domains: cyber threat intelligence, geopolitical intelligence, OSINT investigation, financial intelligence, identity security, and autonomous incident response.

All platforms are production-deployed across dedicated domains, serving authenticated users with real-time data ingestion, ML-driven analysis, and WebSocket-based live updates.

Platform	Deployment	Intelligence Domain	Key Metric
Ninja Signal	ninjasignal.ninja	Cyber Threat Intelligence	89K+ IOCs, 245+ actors
Ninja Fusion	ninjafusion.ninja	Geopolitical Intelligence	80+ intelligence sources
Raz0r	ninjaraz0r.ninja	EDR & SIEM	33 rules, Rust agent
Ninja Nexus	ninjanexus.ninja	OSINT Investigation	Sanctions + ownership
Kin0bi	ninjaken0bi.ninja	Financial Intelligence	Crypto/equity/forex ML
Ninja 1D	1d.ninja.ing	Identity Security	AD/Entra attack paths
ANTOS	antos.ninja.ing	DevSecOps Pipeline	8 stages, 17 tools
Agentic V0id	ninjav0id.io	Autonomous IR	3 LLM agents, 8 playbooks
V01d	ninjav0id.io	Sentiment Intelligence	Oracle score, 18 feeds
Los Alamos	—	Red vs Blue Wargaming	LLM adversaries, ELO scoring
Knox	ninjav0id.io/knox	Secrets & Crypto	Post-quantum crypto, TOTP
Social	ninjasocial.ninja	TI Collaboration	Encrypted messaging, IOC detect
War Room	warroom.ninja	Incident Response	LiveKit video, breach tracker
Sabaki	ninja.ing/ninjasabaki	Vulnerability Triage	Multi-scanner, ServiceNow
Depth	ninja.ing/depth	Supply Chain Security	SBOM, dependency audit
NinjaClaw	PyPI	CLI Security Agent	10 scanners, CIS rules
GITAIR	gitair.ninja	DevSecOps	Air-gapped git intelligence

2. Technical Architecture

Stack

Backend: Python 3.14, FastAPI (async), uvicorn. One FastAPI app per platform.
Database: Neo4j 5.x with GDS 2.22.0 (Graph Data Science library). Shared graph in production, isolated per-platform in development.
Frontend: Next.js 16, React 19, Tailwind CSS v4. Floating-window UI paradigm per platform.
Reverse Proxy: Caddy (automatic TLS, HTTP/2). Single Caddyfile routes all 17 platforms + subdomains.
EDR Agent: Rust (Raz0r agent). ETW hooking, AMSI monitoring, memory scanning. ~2MB release binary.
Deployment: Docker Compose (dev + prod overlays). Two Hetzner dedicated servers (WireGuard tunnel), Cloudflare DNS (proxied for .ninja domains, Let’s Encrypt direct for .ninja.ing).
Real-time: WebSocket per platform for live data streaming.

Cross-Platform Integration Pattern

The critical architectural differentiator is the shared graph. In production, all 17 platforms' FastAPI backends connect to the same Neo4j instance. This means:

An IOC ingested by Signal is immediately queryable by Raz0r's correlator
A threat actor in Signal connects to techniques that connect to Raz0r detection rules
An OSINT entity in Nexus can be linked to identities in 1D and financial entities in Kin0bi
Fusion's geopolitical events provide context for Signal's threat actor activity
V0id's autonomous agents can traverse the entire graph for incident investigation

This is not API-level integration. It's graph-level integration — the relationships are native Neo4j edges, traversable in a single Cypher query.
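As a rough sketch of what such a single-query traversal might look like: the Indicator, ThreatActor, and Technique labels come from this document, but the relationship types (INDICATES, USES, DETECTS), the DetectionRule label, and the property names are assumptions made for illustration. The helper accepts any object with a run(query, parameters) method, which is the shape of a session in the official Neo4j Python driver.

```python
# Hypothetical schema: relationship types, the DetectionRule label, and
# property names below are illustrative assumptions, not the production model.
CROSS_DOMAIN_QUERY = """
MATCH (i:Indicator {value: $ioc})-[:INDICATES]->(a:ThreatActor)
MATCH (a)-[:USES]->(t:Technique)<-[:DETECTS]-(r:DetectionRule)
RETURN a.name AS actor, t.name AS technique, collect(DISTINCT r.name) AS rules
"""


def rules_covering_ioc(session, ioc_value):
    """Run the traversal on any object exposing run(query, parameters).

    With the official Neo4j Python driver this would be a driver session.
    """
    result = session.run(CROSS_DOMAIN_QUERY, {"ioc": ioc_value})
    return [dict(record) for record in result]
```

The point of the sketch is that the hop from a Signal indicator to a Raz0r rule is one query against one database, with no intermediate service call.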

3. Platform Inventory

Ninja Signal — Cyber Threat Intelligence

The foundational CTI platform. Ingests from NVD, MITRE ATT&CK, AlienVault OTX, CISA KEV, VirusTotal, AbuseIPDB, and 15+ additional feeds. 89,000+ Indicator nodes, 48,000+ Software nodes, 21,000+ Infrastructure nodes, 1,900+ Vulnerability nodes, 1,200+ Technique nodes, 245+ ThreatActor nodes.

Key Features: Risk propagation, community detection, KEV prediction, adversary digital twins (Monte Carlo campaign simulation), hidden connections, cross-domain analysis, activity forecasting, GraphSAGE risk learning.

~3,500-line main application file. 15+ ingesters. OpenCTI bidirectional sync (STIX 2.1 import/export).

Ninja Fusion — Geopolitical Intelligence

Cross-domain intelligence fusion platform. 68 sources including GDELT GKG, RSS feeds (BBC, Reuters, AP, Al Jazeera), FRED economic data, Reddit, academic papers, humanitarian alerts, prediction markets. VADER sentiment scoring, Monte Carlo simulation with conditional kill-chain probabilities, MAD-based anomaly detection.

Key Features: Adversary digital twins, emergent behavior detection (7 detectors), public SITREP page (SEO-optimized, free access), social features.

Raz0r — Graph-Native SIEM & EDR

Full SIEM implementation with custom Rust EDR agent.

Rust Agent: 25 source files, dual build (EXE + DLL). ETW hooking, AMSI monitoring, memory scanning, behavioral heuristics. Reports to SIEM via REST API.

Detection: 33 detection rules. 25 event types. Ransomware predictor (5-phase kill chain tracker, 24 indicators, SEV1 at Phase 4). Cross-node correlator for distributed campaign detection. Auto-rule generator from threat intelligence data.

Integration: AssetMapper links SIEM events to Indicator/Technique/ThreatActor/Software/Vulnerability nodes in Signal. BFS blast radius analysis from any event.

Ninja Nexus — OSINT Investigation

Investigative intelligence platform. 12 entity types: Person, Company, Property, BankAccount, Transaction, Document, Sanction, Address, Vessel, Domain, Jurisdiction, Investigation. Ingesters for OpenSanctions, OFAC SDN, ICIJ Offshore Leaks, UK Companies House, SEC EDGAR, OpenCorporates.

Key Features: Suspicion propagation with temporal decay and confidence intervals, money flow tracing, UBO (Ultimate Beneficial Owner) resolution, 7 emergent behavior detectors with false positive tracking and persistence.

Kin0bi — Financial Intelligence

Market intelligence platform. Data sources: Binance WebSocket (crypto), Finnhub (stocks), ECB (forex), FRED (macro), ApeWisdom + Reddit (sentiment). Real-time pipeline: pollers → async queue → batch writer → Neo4j.

Key Features: Log-return-based correlation (not raw prices), EMA+momentum prediction, Isolation Forest on residuals, historical VaR (no Gaussian assumption), streaming anomaly detection (Half-Space Trees).
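A minimal sketch of historical VaR on log returns, assuming a plain list of prices as input. The function names, the 95% confidence default, and the floor-based quantile rule are illustrative choices, not details taken from the Kin0bi codebase.

```python
import math


def log_returns(prices):
    """Convert a price series to log returns: ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def historical_var(returns, confidence=0.95):
    """Empirical VaR: the loss at the (1 - confidence) quantile of returns.

    Uses the sorted sample directly, so no distribution is assumed.
    Returned as a positive number (a loss).
    """
    if not returns:
        raise ValueError("need at least one return")
    ordered = sorted(returns)
    # Lower-tail index (floor rule, no interpolation); the small epsilon
    # guards against floating-point error in (1 - confidence) * n.
    idx = int((1.0 - confidence) * len(ordered) + 1e-9)
    idx = min(idx, len(ordered) - 1)
    return -ordered[idx]
```

Because the quantile is read from the observed sample, fat-tailed return distributions are reflected as they occurred rather than being smoothed by a Gaussian fit.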

Ninja 1D — Identity Intelligence

AD/Entra ID attack surface analysis. Ingesters for BloodHound 4.x JSON, LDAP/LDAPS, Azure AD (MS Graph OAuth2), CSV. 8 labels, 20+ relationship types covering all standard ACE types.

Key Features: BFS attack path analysis, Kerberoasting/AS-REP roastable detection, shadow admin identification, unconstrained delegation detection, group nesting analysis, identity risk scoring.

ANTOS — AI-Orchestrated DevSecOps

Static site (Next.js) documenting an 8-stage security pipeline with 17 tools. Claude AI triage layer. SAST, SCA, secrets detection, container scanning, DAST, compliance, IaC scanning.

Agentic V0id — Autonomous Defensive Agents

Three LLM-powered autonomous agents. Sentinel (triage), Warden (containment), Spectre (threat hunting). Connectors to Raz0r SIEM, Signal CTI, Azure AD. 8 IR playbooks (ransomware, BEC, data exfil, insider threat, DDoS, supply chain, credential compromise, zero-day). Detection engineering agents (coverage mapper, rule tuner, signature generator, correlation builder). Forensic collection agents (memory, disk, network, evidence packager) with chain-of-custody.

ninjaV01d — Predictive Sentiment Intelligence

Companion to V0id, deployed at ninjav0id.io (root). 16 data source pollers (GDELT, RSS, FRED, Reddit, HackerNews, Crypto Fear & Greed, USGS, WHO, ReliefWeb, Wikipedia, arXiv, Polymarket, NewsAPI, Finnhub News, GDELT Doc, EventRegistry). V01d Oracle composite scoring engine.

ML Phases: Cross-source consensus, LSTM forecasting, Hawkes cascade detection, graph embeddings (Node2Vec), narrative detection, geospatial sentiment diffusion, Granger causality (economic → sentiment), streaming anomaly detection.

4. Algorithm & ML Inventory

This section documents every ML algorithm deployed across the ecosystem, with implementation details, parameters, and rationale.

4.1 Graph Analytics

Risk Propagation (BFS with Temporal Decay)

Propagates risk scores through the graph from high-risk seed nodes (threat actors, known-exploited vulnerabilities). Each hop attenuates the score by a configurable damping factor. Edge age modulates propagation weight via temporal decay: exp(-0.001 * age_days).

Platforms: Signal, Fusion | Decay: 1yr=69%, 2yr=48%, 3yr=33% | Damping: configurable via ML_RISK_DAMPING
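The propagation described above can be sketched as follows. The decay formula is the one stated in this section; the adjacency-list shape, the damping default of 0.5, the hop limit, and the "keep the highest score" rule are assumptions for illustration.

```python
import math
from collections import deque

DECAY_RATE = 0.001  # per day, from the formula exp(-0.001 * age_days)


def temporal_decay(age_days, rate=DECAY_RATE):
    """Weight applied to an edge based on its age in days."""
    return math.exp(-rate * age_days)


def propagate_risk(graph, seeds, damping=0.5, max_hops=3):
    """Breadth-first propagation of seed risk scores.

    graph: {node: [(neighbor, edge_age_days), ...]}  (hypothetical shape)
    seeds: {node: initial_risk}
    Each hop multiplies the score by `damping` and the edge's temporal decay.
    A node keeps the highest score that reaches it.
    """
    risk = dict(seeds)
    queue = deque((node, score, 0) for node, score in seeds.items())
    while queue:
        node, score, hops = queue.popleft()
        if hops >= max_hops:
            continue
        for neighbor, age_days in graph.get(node, []):
            candidate = score * damping * temporal_decay(age_days)
            # Only re-queue on strict improvement, which also stops cycles.
            if candidate > risk.get(neighbor, 0.0):
                risk[neighbor] = candidate
                queue.append((neighbor, candidate, hops + 1))
    return risk
```

Evaluating temporal_decay at 365, 730, and 1,095 days reproduces the 69%, 48%, and 33% figures quoted above.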

GraphSAGE Risk Propagation

2-layer neural network that learns per-edge-type risk propagation weights rather than using uniform damping. Aggregates neighbor features through mean-pooling, producing learned risk embeddings. Falls back to standard BFS propagation when training data is insufficient.

Platform: Signal | Layers: 2 | Aggregation: mean-pool | Training: supervised on known-risk labels

Suspicion Propagation (OSINT)

BFS propagation specific to OSINT investigation. Seeds from sanctioned entities, PEPs, and high-risk jurisdiction connections. Temporal decay at exp(-0.002 * age_days) (faster decay than CTI due to OSINT data volatility). Returns a confidence score that grows with the number of supporting relationships: 1 - exp(-0.1 * num_relationships).

Platform: Nexus | Seed types: Sanction, PEP, jurisdiction | Confidence: 5 edges=0.39, 10=0.63, 20=0.86
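The two formulas in this subsection are small enough to write out directly. The function and parameter names below are illustrative; the constants 0.002 and 0.1 are the ones stated above.

```python
import math


def suspicion_decay(age_days, rate=0.002):
    """Temporal decay used for OSINT edges: exp(-0.002 * age_days)."""
    return math.exp(-rate * age_days)


def relationship_confidence(num_relationships, k=0.1):
    """Confidence that saturates toward 1 as supporting edges accumulate."""
    return 1.0 - math.exp(-k * num_relationships)
```

At this rate a one-year-old OSINT edge retains about 48% of its weight, compared with about 69% for a CTI edge of the same age.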

Community Detection (Louvain)

Standard Louvain modularity optimization for clustering graph nodes into communities. Used to identify related threat clusters, actor groups, and financial networks. When Neo4j GDS is available (ML_USE_GDS=true), delegates to native GDS Louvain for performance.

Platforms: Signal, Fusion, Nexus, Kin0bi | GDS toggle: ML_USE_GDS | Drill-down: /ml/communities/{id}

Node2Vec Graph Embeddings

Random walk-based graph embedding. Generates vector representations of nodes by performing biased random walks, then applying Skip-gram (Word2Vec). Used for hidden connection detection and similarity analysis.

Platforms: Signal, V01d | walk_length=20, num_walks=20, window=5, embedding_dim=64
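A minimal sketch of the walk-generation half of Node2Vec, assuming an undirected, unweighted adjacency map. walk_length and num_walks use the values listed above; the return and in-out parameters p and q are standard Node2Vec knobs whose values are not given in this document, so the defaults of 1.0 are assumptions. The Skip-gram training step that consumes these walks is omitted.

```python
import random


def biased_walk(adj, start, walk_length=20, p=1.0, q=1.0, rng=random):
    """One second-order random walk in the Node2Vec style.

    adj: {node: set_of_neighbors} for an undirected, unweighted graph.
    p controls the chance of stepping back; q controls moving outward.
    """
    walk = [start]
    while len(walk) < walk_length:
        current = walk[-1]
        neighbors = sorted(adj.get(current, ()))
        if not neighbors:
            break
        if len(walk) == 1:
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # return to the previous node
            elif candidate in adj.get(previous, ()):
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move farther away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


def generate_walks(adj, num_walks=20, walk_length=20, seed=0):
    """Produce num_walks walks per node; these would feed a Skip-gram model."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(biased_walk(adj, node, walk_length, rng=rng))
    return walks
```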

GCN/GAT (Graph Attention Network)

Simplified single-layer linear attention mechanism for node classification. Pure-numpy implementation by default, optional PyTorch backend for GPU acceleration. Used for predicting node properties from neighborhood structure.

Platform: Signal | Implementation: core/gat.py | Fallback: numpy (no PyTorch required)

4.2 Prediction & Forecasting

KEV Predictor (Random Forest)

Predicts whether a CVE will be added to CISA's Known Exploited Vulnerabilities catalog. Features: CVSS score, EPSS probability, vendor, CWE, existing exploit references, attack complexity. Uses class_weight='balanced' to handle severe class imbalance (98% of CVEs are not in KEV). Evaluated with 5-fold stratified cross-validation.

Platform: Signal | Algorithm: RandomForestClassifier | CV: StratifiedKFold(5) | Balance: class_weight='balanced'

EMA + Momentum Price Prediction

Short-term trend prediction using Exponential Moving Average with momentum confirmation. Replaced naive linear regression on raw prices. Operates on log returns to prevent spurious correlation artifacts. Linear regression retained as fallback when EMA data is insufficient.

Platform: Kin0bi | Primary: EMA+momentum | Fallback: linear regression | Input: log returns
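One way to read "EMA with momentum confirmation" is a fast/slow EMA gap on log returns that is only acted on when recent momentum agrees in sign. The sketch below takes that reading; the span lengths, the momentum window, and the three-way output are assumptions, not Kin0bi's actual parameters.

```python
import math


def log_returns(prices):
    """Log returns: ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def ema(values, span):
    """Exponential moving average with smoothing alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    smoothed = [values[0]]
    for value in values[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


def trend_signal(prices, fast=5, slow=20, momentum_window=3):
    """Return 'up', 'down', or 'flat' from an EMA gap plus momentum.

    A direction is reported only when the fast/slow EMA gap and the summed
    recent log returns agree in sign.
    """
    returns = log_returns(prices)
    if len(returns) < slow:
        # Too little data; the platform is described as falling back to
        # linear regression in this case, which this sketch does not model.
        return "flat"
    gap = ema(returns, fast)[-1] - ema(returns, slow)[-1]
    momentum = sum(returns[-momentum_window:])
    if gap > 0 and momentum > 0:
        return "up"
    if gap < 0 and momentum < 0:
        return "down"
    return "flat"
```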

Hawkes Process Activity Forecasting

Self-exciting point process model for predicting future threat activity. Threat events are modeled as a temporal process where past events increase the probability of future events (clustering effect). Exponential decay kernel.

Platforms: Signal, V01d | Decay: configurable via ML_HAWKES_DECAY | Window: ML_HAWKES_WINDOW_HOURS
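For reference, the conditional intensity of a Hawkes process with an exponential kernel can be written in a few lines. The parameter names mu, alpha, and beta follow common textbook notation, and the default values are placeholders rather than the values behind ML_HAWKES_DECAY.

```python
import math


def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i))
    mu: baseline rate; alpha: jump per event; beta: decay rate.
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in event_times if t_i < t
    )
    return mu + excitation
```

Immediately after a burst of events the intensity is elevated, then relaxes back toward the baseline mu, which is the clustering effect described above.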

LSTM Sentiment Forecasting

Long Short-Term Memory network for predicting sentiment trajectory. Lookback window of 48 hours, forecast horizon of 24 hours. Uses entity-level and region-level aggregated sentiment as input features.

Platform: V01d | Hidden: 32 | Lookback: 48h | Forecast: 24h | Configurable via ML_LSTM_*

Granger Causality Testing

Statistical test for whether one time series helps predict another beyond the target's own history. Pure-NumPy implementation using F-tests on restricted vs. unrestricted autoregressive models. Tests both directions (x→y and y→x). Used in V01d Oracle to determine whether economic indicators actually predict sentiment shifts.

Platform: V01d | Max lag: ML_GRANGER_MAX_LAG (default 5) | Significance: F > 2.5 | Days: ML_GRANGER_DAYS (default 30)

Monte Carlo Campaign Simulation

Simulates threat actor campaign progression through kill chain phases. 1,000+ simulations per run. Uses conditional probability boosts for phase transitions (e.g., initial-access→execution gets 1.3x boost if prior phase succeeded). Actor-specific technique probabilities derived from historical data.

Platforms: Signal, Fusion (Adversary Digital Twins) | Simulations: 1000+ | Phase boosts: 1.15x-1.4x | Cap: 0.95
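A simplified sketch of the simulation loop, assuming a four-phase chain and caller-supplied probabilities. Only "initial-access" and "execution" are named in this document; the other phase names, the probability values, and the function names are hypothetical. The 0.95 cap and the idea of a multiplicative boost after a successful prior phase come from the description above.

```python
import random

# Only the first two phase names appear in the source; the rest are assumed.
PHASES = ["initial-access", "execution", "persistence", "exfiltration"]


def simulate_campaign(base_probs, boosts, rng, cap=0.95):
    """Walk the kill chain once; return the number of phases completed."""
    completed = 0
    for index, phase in enumerate(PHASES):
        # The boost applies only when the prior phase succeeded; because the
        # walk stops at the first failure, that is every phase after the first.
        boost = boosts.get(phase, 1.0) if index > 0 else 1.0
        prob = min(base_probs[phase] * boost, cap)
        if rng.random() >= prob:
            break
        completed += 1
    return completed


def completion_rate(base_probs, boosts, runs=1000, seed=42):
    """Fraction of simulated campaigns that complete every phase."""
    rng = random.Random(seed)
    full = sum(
        simulate_campaign(base_probs, boosts, rng) == len(PHASES)
        for _ in range(runs)
    )
    return full / runs
```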

4.3 Anomaly Detection

MAD-Based Anomaly Detection

Median Absolute Deviation replaces z-scores across the ecosystem. modified_z = 0.6745 * (value - median) / MAD. Robust to outliers and fat-tailed distributions that invalidate the Gaussian assumption required by z-scores. Threshold: 3.5 (configurable).

Platforms: Signal, Fusion | Consistency factor: 0.6745 | Replaced: z-score (threshold 2.0)
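The modified z-score above translates directly into code. This sketch uses only the standard library; the function names and the handling of a zero MAD are illustrative choices.

```python
from statistics import median


def modified_z_scores(values):
    """Modified z-score per value: 0.6745 * (x - median) / MAD."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return [0.0 for _ in values]  # degenerate case: no spread to measure
    return [0.6745 * (v - center) / mad for v in values]


def find_anomalies(values, threshold=3.5):
    """Indices whose absolute modified z-score exceeds the threshold."""
    scores = modified_z_scores(values)
    return [i for i, s in enumerate(scores) if abs(s) > threshold]
```

For the series 10, 11, 9, 10, 12, 10, 11, 50, only the final value is flagged: the median and MAD are barely moved by the outlier, whereas a mean and standard deviation would be.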

Isolation Forest (Residual-Based)

Anomaly detection on EMA residuals of log returns, not raw prices. This prevents trend and seasonality from triggering false anomalies. Contamination parameter: 5%.

Platforms: Kin0bi, V01d | Input: log return residuals | Contamination: 0.05 | sklearn.ensemble.IsolationForest

Half-Space Trees (Streaming)

Online anomaly detection for streaming data. No batch retraining required — the model updates incrementally with each new observation. Used for real-time anomaly scoring on financial market data and sentiment streams.

Platforms: V01d, Kin0bi | Type: streaming/online | No retraining | core/streaming_anomaly.py

Ransomware Kill Chain Predictor

5-phase behavioral model: reconnaissance, weaponization, delivery, exploitation, actions-on-objectives. 24 behavioral indicators tracked per host. When Phase 4 is reached, triggers SEV1 MAJOR alert. Cross-node correlator detects distributed campaigns via phase alignment and entropy convergence.

Platform: Raz0r | Phases: 5 | Indicators: 24 | Alert threshold: Phase 4 | Cross-node: entropy convergence
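The per-host phase tracking can be pictured as a small state machine. The phase names and the Phase 4 alert threshold are from this section; the indicator names and their phase assignments are invented for illustration, since the 24 real indicators are not listed here.

```python
PHASES = [
    "reconnaissance",
    "weaponization",
    "delivery",
    "exploitation",
    "actions-on-objectives",
]
ALERT_PHASE = 4  # SEV1 is raised once a host reaches Phase 4

# Hypothetical indicator-to-phase mapping; the real indicators are not listed.
INDICATOR_PHASE = {
    "network_share_enumeration": 1,
    "suspicious_script_dropped": 2,
    "macro_document_opened": 3,
    "shadow_copy_deletion": 4,
    "mass_file_rename": 5,
}


class HostTracker:
    """Tracks the highest kill-chain phase observed on each host."""

    def __init__(self):
        self.highest_phase = {}

    def observe(self, host, indicator):
        """Record an indicator; return 'SEV1' the first time a host reaches
        Phase 4 or later, otherwise None."""
        phase = INDICATOR_PHASE.get(indicator)
        if phase is None:
            return None
        previous = self.highest_phase.get(host, 0)
        self.highest_phase[host] = max(previous, phase)
        if previous < ALERT_PHASE <= phase:
            return "SEV1"
        return None
```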

4.4 Composite Scoring

V01d Oracle

Flagship composite intelligence score (0–100). Weighted components: 30% tone (VADER sentiment aggregate), 25% velocity (rate of change), 20% anomaly (Isolation Forest anomaly ratio), 15% topic heat (topic clustering density), 10% economic (Granger causality with economic indicators). Normalization calibrated to use full 0–100 range.

Platform: V01d | Components: 5 | Weights: configurable via V01D_ORACLE_WEIGHTS | Cache: 900s TTL
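The weighting scheme reduces to a weighted sum. In this sketch the weights are the ones listed above, while the component key names and the assumption that each component is already normalized to the range 0 to 1 are illustrative.

```python
ORACLE_WEIGHTS = {
    "tone": 0.30,
    "velocity": 0.25,
    "anomaly": 0.20,
    "topic_heat": 0.15,
    "economic": 0.10,
}


def oracle_score(components, weights=ORACLE_WEIGHTS):
    """Weighted sum of components (each pre-normalized to 0..1), scaled to 0..100."""
    total_weight = sum(weights.values())
    raw = sum(weights[name] * components[name] for name in weights)
    # Dividing by the total keeps the scale correct if weights are overridden.
    return round(100.0 * raw / total_weight, 1)
```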

4.5 Emergent Behavior Detection

Emergent Detectors (7 per platform)

Pattern-matching detectors that identify emergent behaviors from graph structure changes. Examples: sanctions network mutation, jurisdiction hopping, shell company emergence (Nexus); velocity anomalies, bridge anomalies, community shifts (Signal/Fusion). Each signal includes a stable pattern hash for deduplication and false positive tracking with Neo4j persistence.

Platforms: Signal, Fusion, Nexus | FP tracking: SHA-256 pattern hash | Persistence: FalsePositive label in Neo4j
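A stable pattern hash can be built by hashing a canonical serialization of the signal. The sketch below shows one way to do that with SHA-256; the choice of fields (detector name, entity IDs, attributes) is an assumption about what identifies a pattern.

```python
import hashlib
import json


def pattern_hash(detector, entity_ids, attributes=None):
    """SHA-256 over a canonical form so the same pattern always hashes alike.

    Entity IDs are sorted and JSON keys are ordered, so discovery order
    does not change the hash.
    """
    canonical = json.dumps(
        {
            "detector": detector,
            "entities": sorted(entity_ids),
            "attributes": attributes or {},
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_known_false_positive(signal_hash, false_positive_hashes):
    """Suppress a signal whose hash was previously marked as a false positive."""
    return signal_hash in false_positive_hashes
```

Because the hash ignores discovery order, a pattern that an analyst has marked as a false positive stays suppressed when the detector rediscovers it on a later run.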

4.6 Autonomous Agents

LLM-Powered Decision Agents

Three autonomous agents (Sentinel, Warden, Spectre) using Claude as the reasoning engine. Each agent has access to the full graph for context. Decision steps include confidence thresholds — below-threshold decisions are escalated to human operators. Every action is logged and reversible.

Platform: V0id | Agents: 3 | Playbooks: 8 (YAML-driven) | Step types: reason, action, manual, wait

5. Data Pipeline Architecture

Ingestion Pattern

All platforms follow the same async pipeline pattern:

Pollers: Async tasks on configurable intervals (15s to 24h) that fetch from external sources
Queue: AsyncEventQueue (capacity: 50,000) with backpressure (80% pause, 50% resume)
Batch Writer: UNWIND CREATE for new data (3–5x faster than MERGE), MERGE for aggregations
Neo4j: Graph storage with indexes on key lookup fields per label
Post-write hooks: Correlator, AlertEngine, ML cache warming
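Two pieces of the pipeline above lend themselves to a short sketch: the pause/resume hysteresis on the queue and the batched UNWIND write. The class name, the Event label, and the property handling are hypothetical; the 50,000 capacity and the 80%/50% watermarks are from the list above.

```python
class BackpressureGate:
    """Hysteresis gate: pause intake at a high watermark, resume at a low one."""

    def __init__(self, capacity=50_000, pause_ratio=0.8, resume_ratio=0.5):
        self.pause_at = round(capacity * pause_ratio)
        self.resume_at = round(capacity * resume_ratio)
        self.paused = False

    def update(self, queue_depth):
        """Return True if pollers may enqueue given the current queue depth."""
        if not self.paused and queue_depth >= self.pause_at:
            self.paused = True
        elif self.paused and queue_depth <= self.resume_at:
            self.paused = False
        return not self.paused


# Batched insert: one round trip creates many nodes from a list parameter.
# The Event label is illustrative.
BATCH_CREATE = """
UNWIND $rows AS row
CREATE (e:Event)
SET e += row
"""
```

The gap between the two watermarks prevents the pollers from flapping between paused and running when the queue depth hovers near a single threshold.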

Data Sources (95+)

Category	Sources	Platform
CTI Feeds	NVD, MITRE ATT&CK, OTX, CISA KEV, VirusTotal, AbuseIPDB, MalwareBazaar, URLhaus, Shodan, ThreatFox, PhishTank, SpamhausDBL, C2IntelFeeds, FeodoTracker, OpenCTI	Signal
News/OSINT	GDELT GKG, GDELT Doc, BBC/Reuters/AP RSS, Reddit, HackerNews, Wikipedia, NewsAPI, Finnhub News, EventRegistry	Fusion, V01d
Economic	FRED (Federal Reserve), ECB (forex), Polymarket (predictions), Crypto Fear & Greed	V01d, Kin0bi
Markets	Binance WebSocket (crypto), Finnhub (equities), ApeWisdom (social sentiment)	Kin0bi
Humanitarian	USGS (earthquakes), WHO (disease alerts), ReliefWeb (crises)	V01d
Academic	arXiv (preprints)	V01d
Sanctions	OpenSanctions, OFAC SDN, ICIJ Offshore Leaks	Nexus
Corporate	UK Companies House, SEC EDGAR, OpenCorporates	Nexus
Identity	BloodHound JSON, LDAP/LDAPS, Azure AD (MS Graph)	1D
Endpoint	ETW, AMSI, memory scanning, behavioral heuristics (Rust agent)	Raz0r

Retention

Configurable per severity per platform. Default: critical=365d, high=90d, medium=30d, low=7d. Raw events are purged with a batched DETACH DELETE every 6 hours. Aggregated EventSummary nodes persist beyond raw event retention, so the purge does not erase the historical record.
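A sketch of how the per-severity windows could be turned into purge cutoffs. The default day counts are the ones stated above; the function names, the Event label, and the timestamp and severity properties in the Cypher text are assumptions.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = {"critical": 365, "high": 90, "medium": 30, "low": 7}


def retention_cutoffs(now=None, retention=RETENTION_DAYS):
    """Map each severity to the timestamp before which raw events expire."""
    now = now or datetime.now(timezone.utc)
    return {sev: now - timedelta(days=days) for sev, days in retention.items()}


def is_expired(event_time, severity, now=None):
    """True when an event is older than its severity's retention window."""
    return event_time < retention_cutoffs(now)[severity]


# Batched deletion in Cypher; label and property names are illustrative.
# A caller would repeat this until `deleted` comes back as 0.
PURGE_BATCH = """
MATCH (e:Event {severity: $severity})
WHERE e.timestamp < $cutoff
WITH e LIMIT $batch_size
DETACH DELETE e
RETURN count(*) AS deleted
"""
```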

6. Technical Differentiation

Unified Graph Architecture

The single most significant differentiator. All 17 platforms share one Neo4j graph. Cross-domain traversal (SIEM event → threat actor → OSINT entity → identity → financial entity) is a native graph operation, not an API integration layer. No comparable product exists in the market.

Mathematically Sound ML

Every algorithm was chosen for a specific mathematical reason. Log returns prevent spurious correlation. MAD handles fat tails. Temporal decay reflects real-world intelligence aging. Granger causality tests for actual predictive relationships, not just correlation. Class weighting counters bias toward the majority class, and stratified cross-validation keeps the evaluation honest under that imbalance. This level of methodological rigor is rare in security tooling.

Full-Stack Vertical Integration

From Rust EDR agent (memory-level) through graph database (storage) to Next.js UI (presentation) to LLM agents (autonomous response). No external dependencies on third-party security platforms. The entire intelligence pipeline is self-contained and architecturally coherent.

Autonomous Incident Response

V0id's LLM agents with full graph access represent the next generation of IR. Unlike rule-based SOAR, each decision step involves reasoning over the entire threat graph. 8 IR playbooks covering the most common incident types. Every action reversible, every decision auditable.

Runtime Configurability

All ML parameters, polling intervals, queue sizes, thresholds, and feature flags are configurable via environment variables at runtime. No code changes required to tune the system. This enables rapid deployment customization for different operational contexts.
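A minimal sketch of environment-driven configuration. The variable names ML_RISK_DAMPING, ML_GRANGER_MAX_LAG, ML_GRANGER_DAYS, and ML_USE_GDS appear in this document, as do the Granger defaults of 5 and 30; the helper functions, the damping default of 0.5, and the accepted boolean spellings are assumptions.

```python
import os


def env_float(name, default):
    """Read a float from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return float(raw) if raw not in (None, "") else default


def env_int(name, default):
    """Read an integer from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw not in (None, "") else default


def env_bool(name, default=False):
    """Treat 1/true/yes/on (any case) as True."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_ml_settings():
    """Collect tunables at call time so new values are picked up on reload."""
    return {
        "risk_damping": env_float("ML_RISK_DAMPING", 0.5),  # default assumed
        "granger_max_lag": env_int("ML_GRANGER_MAX_LAG", 5),
        "granger_days": env_int("ML_GRANGER_DAYS", 30),
        "use_gds": env_bool("ML_USE_GDS"),
    }
```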

7. Intellectual Property Summary

Codebase Metrics

Component	Files	Primary Language	Notable
Signal (RTM)	~80+	Python + TypeScript	3,500+ line main app, 15+ ingesters
Fusion	~70+	Python + TypeScript	68-source fusion engine
Raz0r	~60+	Python + Rust + TypeScript	25 Rust source files (EDR agent)
Nexus	~65	Python + TypeScript	6 OSINT ingesters, 7 emergent detectors
Kin0bi	~70	Python + TypeScript	Real-time market pipeline
1D	~37	Python + TypeScript	4 identity ingesters
ANTOS	~20	TypeScript	Static analysis documentation
V0id + V01d	~120+	Python + TypeScript	LLM agents + 16 pollers + Oracle

Proprietary Algorithms

V01d Oracle: 5-component composite scoring with Granger causality economic integration
Cross-Node Ransomware Correlator: Phase alignment + entropy convergence for distributed campaign detection
GraphSAGE Risk Propagation: Per-edge-type learned risk weights (not published)
Adversary Digital Twins: Probabilistic actor behavioral models with Monte Carlo conditional simulation
Intelligence Mesh: The architectural pattern of 17 platforms across six intelligence domains sharing one graph

Data Assets

1,000,000+ curated nodes with typed relationships
89,000+ IOCs with source provenance
245+ threat actor profiles with technique mappings
Sanctions/OSINT entity network (OpenSanctions, OFAC, ICIJ)
Historical sentiment and economic correlation data

8. Technical Risks & Mitigations

Key-Person Dependency

The ecosystem was built by a single engineer. Knowledge concentration is high. Mitigation: Consistent architecture pattern across all 17 platforms (FastAPI + Neo4j + Next.js + Caddy). Well-structured codebases. This technical guide, CLAUDE.md files, and dev diary serve as documentation. The uniformity of the stack means onboarding is learning one pattern, not seventeen.

Scalability

Current deployment is a two-server architecture (Hetzner dedicated: Ryzen 9 7950X3D/128GB for Signal+Fusion, Ryzen 5 3600/64GB for all other apps, connected via WireGuard). The graph's 1M+ nodes are well within single-instance capacity; a single Neo4j instance can hold graphs far larger than this. Horizontal scaling would require Neo4j clustering (supported in Enterprise Edition) and container orchestration (Kubernetes). Mitigation: Docker Compose architecture maps cleanly to K8s. No hardcoded single-instance assumptions.

Third-Party Data Dependencies

Several ingesters depend on free-tier APIs (NVD, GDELT, OTX, OpenSanctions). Rate limits or API changes could disrupt ingestion. Mitigation: Modular ingester design — each ingester is a standalone module. Historical data is preserved in the graph regardless of API availability. Premium API keys can be added via environment variables.

LLM Dependency (V0id Agents)

Autonomous agents depend on Claude API availability and cost. Mitigation: Agents degrade gracefully to manual mode. All playbook steps have manual approval fallback. LLM is used for reasoning, not critical-path execution.

9. Market Positioning

Competitive Landscape

Competitor	Coverage	Graph-Native	Cross-Domain	Autonomous IR
CrowdStrike	EDR + TI	No	Limited	No (human MDR)
Palo Alto (XSIAM)	SIEM + SOAR	No	Limited	Rule-based
Recorded Future	CTI	Partial	CTI only	No
Splunk (Cisco)	SIEM + SOAR	No	Via apps	Rule-based
OpenCTI	CTI	Yes (Neo4j)	CTI only	No
Maltego	OSINT	Yes	OSINT only	No
ninja.ing	All 6 domains	Yes (Neo4j)	Full mesh	LLM agents

Value Proposition

No existing product covers all six intelligence domains in a single graph. Enterprises currently assemble this capability from 10–15 vendors at $500K–$2M/year in licensing, plus integration costs. The ninja.ing mesh provides equivalent or superior coverage with native graph integration and autonomous response, deployable as a single Docker Compose stack.

Go-to-Market Options

SaaS: Multi-tenant deployment with per-platform or full-mesh pricing
On-Premise: Single Docker Compose stack for air-gapped / regulated environments
OEM / White-Label: Individual platforms (e.g., Raz0r SIEM or Nexus OSINT) licensed to security vendors
Managed Intelligence: Hosted mesh with curated data and analyst support
