Insight³ — Technical Deep Dive — March 2026

Intelligence Mesh: Value Engineering Assessment

Technical architecture, algorithm inventory, data pipeline analysis, and IP differentiation assessment for the ninja.ing intelligence ecosystem. Prepared for prospective partners, acquirers, and technical evaluators.

1. Ecosystem Overview

The ninja.ing ecosystem comprises 17 operational intelligence platforms sharing a unified Neo4j graph database containing 1,000,000+ nodes and their relationships. The platforms span six intelligence domains: cyber threat intelligence, geopolitical intelligence, OSINT investigation, financial intelligence, identity security, and autonomous incident response.

All platforms are production-deployed across dedicated domains, serving authenticated users with real-time data ingestion, ML-driven analysis, and WebSocket-based live updates.

Platform | Domain | Intelligence Domain | Key Metric
Ninja Signal | ninjasignal.ninja | Cyber Threat Intelligence | 89K+ IOCs, 245+ actors
Ninja Fusion | ninjafusion.ninja | Geopolitical Intelligence | 80+ intelligence sources
Raz0r | ninjaraz0r.ninja | EDR & SIEM | 33 rules, Rust agent
Ninja Nexus | ninjanexus.ninja | OSINT Investigation | Sanctions + ownership
Kin0bi | ninjaken0bi.ninja | Financial Intelligence | Crypto/equity/forex ML
Ninja 1D | 1d.ninja.ing | Identity Security | AD/Entra attack paths
ANTOS | antos.ninja.ing | DevSecOps Pipeline | 8 stages, 17 tools
Agentic V0id | ninjav0id.io | Autonomous IR | 3 LLM agents, 8 playbooks
V01d | ninjav0id.io | Sentiment Intelligence | Oracle score, 18 feeds
Los Alamos | - | Red vs Blue Wargaming | LLM adversaries, ELO scoring
Knox | ninjav0id.io/knox | Secrets & Crypto | Post-quantum crypto, TOTP
Social | ninjasocial.ninja | TI Collaboration | Encrypted messaging, IOC detect
War Room | warroom.ninja | Incident Response | LiveKit video, breach tracker
Sabaki | ninja.ing/ninjasabaki | Vulnerability Triage | Multi-scanner, ServiceNow
Depth | ninja.ing/depth | Supply Chain Security | SBOM, dependency audit
NinjaClaw | PyPI | CLI Security Agent | 10 scanners, CIS rules
GITAIR | gitair.ninja | DevSecOps | Air-gapped git intelligence

2. Technical Architecture

Stack

Cross-Platform Integration Pattern

The critical architectural differentiator is the shared graph. In production, all 17 platforms' FastAPI backends connect to the same Neo4j instance.

This is not API-level integration. It's graph-level integration — the relationships are native Neo4j edges, traversable in a single Cypher query.

3. Platform Inventory

Ninja Signal — Cyber Threat Intelligence

The foundational CTI platform. Ingests from NVD, MITRE ATT&CK, AlienVault OTX, CISA KEV, VirusTotal, AbuseIPDB, and 15+ additional feeds. 89,000+ Indicator nodes, 48,000+ Software nodes, 21,000+ Infrastructure nodes, 1,900+ Vulnerability nodes, 1,200+ Technique nodes, 245+ ThreatActor nodes.

Key Features: Risk propagation, community detection, KEV prediction, adversary digital twins (Monte Carlo campaign simulation), hidden connections, cross-domain analysis, activity forecasting, GraphSAGE risk learning.

~3,500-line main application file. 15+ ingesters. OpenCTI bidirectional sync (STIX 2.1 import/export).

Ninja Fusion — Geopolitical Intelligence

Cross-domain intelligence fusion platform. 68 sources including GDELT GKG, RSS feeds (BBC, Reuters, AP, Al Jazeera), FRED economic data, Reddit, academic papers, humanitarian alerts, prediction markets. VADER sentiment scoring, Monte Carlo simulation with conditional kill-chain probabilities, MAD-based anomaly detection.

Key Features: Adversary digital twins, emergent behavior detection (7 detectors), public SITREP page (SEO-optimized, free access), social features.

Raz0r — Graph-Native SIEM & EDR

Full SIEM implementation with custom Rust EDR agent.

Rust Agent: 25 source files, dual build (EXE + DLL). ETW hooking, AMSI monitoring, memory scanning, behavioral heuristics. Reports to SIEM via REST API.

Detection: 33 detection rules. 25 event types. Ransomware predictor (5-phase kill chain tracker, 24 indicators, SEV1 at Phase 4). Cross-node correlator for distributed campaign detection. Auto-rule generator from threat intelligence data.

Integration: AssetMapper links SIEM events to Indicator/Technique/ThreatActor/Software/Vulnerability nodes in Signal. BFS blast radius analysis from any event.

Ninja Nexus — OSINT Investigation

Investigative intelligence platform. 12 entity types: Person, Company, Property, BankAccount, Transaction, Document, Sanction, Address, Vessel, Domain, Jurisdiction, Investigation. Ingesters for OpenSanctions, OFAC SDN, ICIJ Offshore Leaks, UK Companies House, SEC EDGAR, OpenCorporates.

Key Features: Suspicion propagation with temporal decay and confidence intervals, money flow tracing, UBO (Ultimate Beneficial Owner) resolution, 7 emergent behavior detectors with false positive tracking and persistence.

Kin0bi — Financial Intelligence

Market intelligence platform. Data sources: Binance WebSocket (crypto), Finnhub (stocks), ECB (forex), FRED (macro), ApeWisdom + Reddit (sentiment). Real-time pipeline: pollers → async queue → batch writer → Neo4j.

Key Features: Log-return-based correlation (not raw prices), EMA+momentum prediction, Isolation Forest on residuals, historical VaR (no Gaussian assumption), streaming anomaly detection (Half-Space Trees).

Ninja 1D — Identity Intelligence

AD/Entra ID attack surface analysis. Ingesters for BloodHound 4.x JSON, LDAP/LDAPS, Azure AD (MS Graph OAuth2), CSV. 8 labels, 20+ relationship types covering all standard ACE types.

Key Features: BFS attack path analysis, Kerberoasting/AS-REP roastable detection, shadow admin identification, unconstrained delegation detection, group nesting analysis, identity risk scoring.
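
As an illustration of BFS attack path analysis, here is a minimal sketch that finds the shortest chain of relationships from a starting principal to a target group. The function name and the in-memory edge list are assumptions made for the example; the platform itself runs against Neo4j, and its actual implementation is not shown in this document. The relationship names (MemberOf, AdminTo, HasSession) follow BloodHound's conventions.

```python
from collections import deque

def shortest_attack_path(edges, start, target):
    """Return the shortest list of (source, relationship, destination) hops, or None.

    edges is a list of (source, relationship, destination) tuples. This is a
    hypothetical in-memory stand-in for a graph query.
    """
    adjacency = {}
    for src, rel, dst in edges:
        adjacency.setdefault(src, []).append((rel, dst))

    # parents maps each discovered node to the (previous node, relationship) that reached it.
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for rel, nxt in adjacency.get(node, []):
            if nxt not in parents:
                parents[nxt] = (node, rel)
                queue.append(nxt)

    if target not in parents:
        return None

    # Walk back from the target to the start, then reverse.
    path = []
    node = target
    while parents[node] is not None:
        prev, rel = parents[node]
        path.append((prev, rel, node))
        node = prev
    return list(reversed(path))

# Illustrative data only.
example_edges = [
    ("alice", "MemberOf", "helpdesk"),
    ("helpdesk", "AdminTo", "ws01"),
    ("ws01", "HasSession", "da_admin"),
    ("da_admin", "MemberOf", "Domain Admins"),
    ("alice", "MemberOf", "staff"),
]
example_path = shortest_attack_path(example_edges, "alice", "Domain Admins")
```

Because BFS explores nodes in order of hop count, the first path that reaches the target is also a shortest one, which is usually what an analyst wants to remediate first.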

ANTOS — AI-Orchestrated DevSecOps

Static site (Next.js) documenting an 8-stage security pipeline with 17 tools. Claude AI triage layer. SAST, SCA, secrets detection, container scanning, DAST, compliance, IaC scanning.

Agentic V0id — Autonomous Defensive Agents

Three LLM-powered autonomous agents. Sentinel (triage), Warden (containment), Spectre (threat hunting). Connectors to Raz0r SIEM, Signal CTI, Azure AD. 8 IR playbooks (ransomware, BEC, data exfil, insider threat, DDoS, supply chain, credential compromise, zero-day). Detection engineering agents (coverage mapper, rule tuner, signature generator, correlation builder). Forensic collection agents (memory, disk, network, evidence packager) with chain-of-custody.

ninjaV01d — Predictive Sentiment Intelligence

Companion to V0id, deployed at ninjav0id.io (root). 16 data source pollers (GDELT, RSS, FRED, Reddit, HackerNews, Crypto Fear & Greed, USGS, WHO, ReliefWeb, Wikipedia, arXiv, Polymarket, NewsAPI, Finnhub News, GDELT Doc, EventRegistry). V01d Oracle composite scoring engine.

ML Phases: Cross-source consensus, LSTM forecasting, Hawkes cascade detection, graph embeddings (Node2Vec), narrative detection, geospatial sentiment diffusion, Granger causality (economic → sentiment), streaming anomaly detection.

4. Algorithm & ML Inventory

This section documents every ML algorithm deployed across the ecosystem, including implementation details, parameters, and rationale.

4.1 Graph Analytics

Risk Propagation (BFS with Temporal Decay)

Propagates risk scores through the graph from high-risk seed nodes (threat actors, known-exploited vulnerabilities). Each hop attenuates the score by a configurable damping factor. Edge age modulates propagation weight via temporal decay: exp(-0.001 * age_days).

Platforms: Signal, Fusion | Decay: 1yr=69%, 2yr=48%, 3yr=33% | Damping: configurable via ML_RISK_DAMPING
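
The propagation described above can be sketched in a few lines. This is a minimal illustration, not the platform's code: the graph representation, the function names, the hop limit, and the damping value of 0.5 are all assumptions (the source only states that damping is configurable via ML_RISK_DAMPING). The decay constant 0.001 comes from the text.

```python
import math
from collections import deque

DECAY_RATE = 0.001  # per-day constant from the text
DAMPING = 0.5       # hypothetical value; the real default is not stated

def temporal_weight(age_days, rate=DECAY_RATE):
    """Edge weight exp(-rate * age_days)."""
    return math.exp(-rate * age_days)

def propagate_risk(graph, seeds, damping=DAMPING, max_hops=3):
    """Breadth-first risk propagation.

    graph: {node: [(neighbor, edge_age_days), ...]}
    seeds: {node: initial_risk}
    Each node keeps the highest risk that reaches it.
    """
    risk = dict(seeds)
    queue = deque((node, score, 0) for node, score in seeds.items())
    while queue:
        node, score, hops = queue.popleft()
        if hops == max_hops:
            continue
        for neighbor, age_days in graph.get(node, []):
            passed = score * damping * temporal_weight(age_days)
            # Only re-enqueue when the neighbor's risk actually improves.
            if passed > risk.get(neighbor, 0.0):
                risk[neighbor] = passed
                queue.append((neighbor, passed, hops + 1))
    return risk

# Illustrative data only: a fresh edge and a one-year-old edge.
example_graph = {"actor": [("malware", 0)], "malware": [("ip", 365)]}
example_risk = propagate_risk(example_graph, {"actor": 1.0})
```

With these assumed values the one-year-old edge passes on about 69% of what a fresh edge would, which matches the decay figures listed above.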

GraphSAGE Risk Propagation

2-layer neural network that learns per-edge-type risk propagation weights rather than using uniform damping. Aggregates neighbor features through mean-pooling, producing learned risk embeddings. Falls back to standard BFS propagation when training data is insufficient.

Platform: Signal | Layers: 2 | Aggregation: mean-pool | Training: supervised on known-risk labels

Suspicion Propagation (OSINT)

BFS propagation specific to OSINT investigation. Seeds from sanctioned entities, politically exposed persons (PEPs), and high-risk jurisdiction connections. Temporal decay is exp(-0.002 * age_days), faster than the CTI rate because OSINT data is more volatile. Returns a confidence value alongside each score: 1 - exp(-0.1 * num_relationships).

Platform: Nexus | Seed types: Sanction, PEP, jurisdiction | Confidence: 5 edges=0.39, 10=0.63, 20=0.86
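
The two formulas above are easy to check numerically. The sketch below only evaluates them; the function names are invented for the example.

```python
import math

def suspicion_decay(age_days):
    """Temporal decay used for OSINT edges: exp(-0.002 * age_days)."""
    return math.exp(-0.002 * age_days)

def relationship_confidence(num_relationships):
    """Confidence grows with supporting relationships: 1 - exp(-0.1 * n)."""
    return 1.0 - math.exp(-0.1 * num_relationships)

# Confidence at the relationship counts quoted in the text.
confidence_table = {n: round(relationship_confidence(n), 2) for n in (5, 10, 20)}
```

The confidence curve saturates: each additional relationship adds less than the one before, so an entity with 20 supporting edges is not treated as four times as certain as one with 5.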

Community Detection (Louvain)

Standard Louvain modularity optimization for clustering graph nodes into communities. Used to identify related threat clusters, actor groups, and financial networks. When Neo4j GDS is available (ML_USE_GDS=true), delegates to native GDS Louvain for performance.

Platforms: Signal, Fusion, Nexus, Kin0bi | GDS toggle: ML_USE_GDS | Drill-down: /ml/communities/{id}

Node2Vec Graph Embeddings

Random walk-based graph embedding. Generates vector representations of nodes by performing biased random walks, then applying Skip-gram (Word2Vec). Used for hidden connection detection and similarity analysis.

Platforms: Signal, V01d | walk_length=20, num_walks=20, window=5, embedding_dim=64

GCN/GAT (Graph Attention Network)

Simplified single-layer linear attention mechanism for node classification. Pure-numpy implementation by default, optional PyTorch backend for GPU acceleration. Used for predicting node properties from neighborhood structure.

Platform: Signal | Implementation: core/gat.py | Fallback: numpy (no PyTorch required)

4.2 Prediction & Forecasting

KEV Predictor (Random Forest)

Predicts whether a CVE will be added to CISA's Known Exploited Vulnerabilities catalog. Features: CVSS score, EPSS probability, vendor, CWE, existing exploit references, attack complexity. Uses class_weight='balanced' to handle severe class imbalance (98% of CVEs are not in KEV). Evaluated with 5-fold stratified cross-validation.

Platform: Signal | Algorithm: RandomForestClassifier | CV: StratifiedKFold(5) | Balance: class_weight='balanced'

EMA + Momentum Price Prediction

Short-term trend prediction using Exponential Moving Average with momentum confirmation. Replaced naive linear regression on raw prices. Operates on log returns to prevent spurious correlation artifacts. Linear regression retained as fallback when EMA data is insufficient.

Platform: Kin0bi | Primary: EMA+momentum | Fallback: linear regression | Input: log returns
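
One way to combine an EMA with momentum confirmation on log returns is shown below. The source does not give the spans, the momentum window, or the exact decision rule, so every parameter and function name here is an assumption chosen for illustration.

```python
import math

def log_returns(prices):
    """Convert a price series to log returns: ln(p_t / p_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def ema(values, span):
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out

def predict_direction(prices, fast=5, slow=20, momentum_window=3):
    """Return 'up', 'down', 'neutral', or None when there is too little data.

    Hypothetical rule: the fast EMA must sit on the same side of the slow EMA
    as the summed recent returns (the momentum confirmation).
    """
    returns = log_returns(prices)
    if len(returns) < slow:
        return None  # a caller could fall back to linear regression here
    fast_ema = ema(returns, fast)[-1]
    slow_ema = ema(returns, slow)[-1]
    momentum = sum(returns[-momentum_window:])
    if fast_ema > slow_ema and momentum > 0:
        return "up"
    if fast_ema < slow_ema and momentum < 0:
        return "down"
    return "neutral"
```

Returning None on short histories keeps the fallback decision with the caller, which mirrors the primary/fallback split described above.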

Hawkes Process Activity Forecasting

Self-exciting point process model for predicting future threat activity. Threat events are modeled as a temporal process where past events increase the probability of future events (clustering effect). Exponential decay kernel.

Platforms: Signal, V01d | Decay: configurable via ML_HAWKES_DECAY | Window: ML_HAWKES_WINDOW_HOURS
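
For reference, the conditional intensity of a Hawkes process with an exponential kernel is the baseline rate plus a decaying contribution from every past event. The sketch below evaluates that intensity; the parameter values are placeholders, and mapping beta to ML_HAWKES_DECAY is an assumption rather than something the source states.

```python
import math

def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity: mu + sum(alpha * exp(-beta * (t - t_i))) over t_i < t.

    mu    baseline event rate
    alpha jump in intensity caused by each past event
    beta  exponential decay rate of that excitation
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return mu + excitation

# Illustrative data only: a burst of three events shortly before t = 10.
burst = [9.0, 9.5, 9.8]
intensity_after_burst = hawkes_intensity(10.0, burst)
intensity_much_later = hawkes_intensity(20.0, burst)
```

Right after a burst the intensity is well above baseline, and it relaxes back toward mu as the events age, which is the clustering effect described above.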

LSTM Sentiment Forecasting

Long Short-Term Memory network for predicting sentiment trajectory. Lookback window of 48 hours, forecast horizon of 24 hours. Uses entity-level and region-level aggregated sentiment as input features.

Platform: V01d | Hidden: 32 | Lookback: 48h | Forecast: 24h | Configurable via ML_LSTM_*

Granger Causality Testing

Statistical test for whether one time series helps predict another beyond the target's own history. Pure-NumPy implementation using F-tests on restricted vs. unrestricted autoregressive models. Tests both directions (x→y and y→x). Used in V01d Oracle to determine whether economic indicators actually predict sentiment shifts.

Platform: V01d | Max lag: ML_GRANGER_MAX_LAG (default 5) | Significance: F > 2.5 | Days: ML_GRANGER_DAYS (default 30)

Monte Carlo Campaign Simulation

Simulates threat actor campaign progression through kill chain phases. 1,000+ simulations per run. Uses conditional probability boosts for phase transitions (e.g., initial-access→execution gets 1.3x boost if prior phase succeeded). Actor-specific technique probabilities derived from historical data.

Platforms: Signal, Fusion (Adversary Digital Twins) | Simulations: 1000+ | Phase boosts: 1.15x-1.4x | Cap: 0.95
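
A stripped-down version of this kind of simulation is sketched below. The phase list, base probabilities, and boost values are hypothetical; only the 0.95 cap, the idea of a multiplicative boost after a successful prior phase, and the 1,000-run scale come from the text.

```python
import random

# Hypothetical phase list; the source does not enumerate the phases it simulates.
PHASES = ["initial-access", "execution", "persistence", "exfiltration"]

def simulate_once(base_probs, boosts, rng, cap=0.95):
    """Run one campaign and return how many phases completed."""
    completed = 0
    for index, phase in enumerate(PHASES):
        p = base_probs[phase]
        if index > 0:
            # Reaching this phase means the prior phase succeeded, so the boost applies.
            p *= boosts.get(phase, 1.0)
        p = min(p, cap)  # no phase is ever treated as certain
        if rng.random() >= p:
            break
        completed += 1
    return completed

def completion_rate(base_probs, boosts, simulations=1000, seed=0):
    """Fraction of simulated campaigns that complete every phase."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    full = sum(
        simulate_once(base_probs, boosts, rng) == len(PHASES)
        for _ in range(simulations)
    )
    return full / simulations

# Illustrative numbers only.
example_probs = {"initial-access": 0.6, "execution": 0.5, "persistence": 0.4, "exfiltration": 0.3}
example_boosts = {"execution": 1.3, "persistence": 1.2, "exfiltration": 1.15}
example_rate = completion_rate(example_probs, example_boosts)
```

Capping each phase probability keeps boosted values from reaching 1.0, so even a highly capable actor fails some simulated campaigns.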

4.3 Anomaly Detection

MAD-Based Anomaly Detection

Median Absolute Deviation (MAD) replaces z-scores across the ecosystem. The modified z-score is modified_z = 0.6745 * (value - median) / MAD. It is robust to outliers and fat-tailed distributions, which distort the mean and standard deviation that ordinary z-scores rely on. Threshold: 3.5 (configurable).

Platforms: Signal, Fusion | Consistency factor: 0.6745 | Replaced: z-score (threshold 2.0)
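
The modified z-score is short enough to show in full. The sketch below uses only the standard library; the function names and the handling of a zero MAD are choices made for the example, not details taken from the platform.

```python
from statistics import median

def modified_z_scores(values):
    """Modified z-score per value: 0.6745 * (value - median) / MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values equal the median; scores are undefined,
        # so this sketch reports no deviation at all.
        return [0.0] * len(values)
    return [0.6745 * (v - med) / mad for v in values]

def mad_anomalies(values, threshold=3.5):
    """Return the values whose absolute modified z-score exceeds the threshold."""
    scores = modified_z_scores(values)
    return [v for v, z in zip(values, scores) if abs(z) > threshold]

# Illustrative data only: one obvious spike among stable readings.
readings = [10, 11, 9, 10, 12, 10, 11, 95]
flagged = mad_anomalies(readings)
```

Here the median is 10.5 and the MAD is 0.5, so the spike scores about 114 while the next largest reading scores about 2.0 and stays under the 3.5 threshold.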

Isolation Forest (Residual-Based)

Anomaly detection on EMA residuals of log returns, not raw prices. This prevents trend and seasonality from triggering false anomalies. Contamination parameter: 5%.

Platforms: Kin0bi, V01d | Input: log return residuals | Contamination: 0.05 | sklearn.ensemble.IsolationForest

Half-Space Trees (Streaming)

Online anomaly detection for streaming data. No batch retraining required — the model updates incrementally with each new observation. Used for real-time anomaly scoring on financial market data and sentiment streams.

Platforms: V01d, Kin0bi | Type: streaming/online | No retraining | core/streaming_anomaly.py

Ransomware Kill Chain Predictor

5-phase behavioral model: reconnaissance, weaponization, delivery, exploitation, actions-on-objectives. 24 behavioral indicators are tracked per host. Reaching Phase 4 triggers a SEV1 MAJOR alert. A cross-node correlator detects distributed campaigns via phase alignment and entropy convergence.

Platform: Raz0r | Phases: 5 | Indicators: 24 | Alert threshold: Phase 4 | Cross-node: entropy convergence

4.4 Composite Scoring

V01d Oracle

Flagship composite intelligence score (0–100). Weighted components: 30% tone (VADER sentiment aggregate), 25% velocity (rate of change), 20% anomaly (Isolation Forest anomaly ratio), 15% topic heat (topic clustering density), 10% economic (Granger causality with economic indicators). Normalization calibrated to use full 0–100 range.

Platform: V01d | Components: 5 | Weights: configurable via V01D_ORACLE_WEIGHTS | Cache: 900s TTL
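
The weighting above amounts to a weighted sum scaled to 0-100. The sketch below assumes each component has already been normalized to the range 0 to 1, which the source does not specify; the dictionary keys and function name are likewise invented for the example.

```python
# Weights from the text; the key names are hypothetical.
ORACLE_WEIGHTS = {
    "tone": 0.30,
    "velocity": 0.25,
    "anomaly": 0.20,
    "topic_heat": 0.15,
    "economic": 0.10,
}

def oracle_score(components, weights=ORACLE_WEIGHTS):
    """Weighted composite on a 0-100 scale.

    components maps each key to a value assumed to be normalized to [0, 1];
    out-of-range inputs are clamped.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    total = sum(
        weights[name] * min(max(components[name], 0.0), 1.0) for name in weights
    )
    return round(100.0 * total, 1)

# Illustrative inputs only.
example_score = oracle_score(
    {"tone": 0.5, "velocity": 0.2, "anomaly": 0.1, "topic_heat": 0.4, "economic": 0.0}
)
```

Validating that the weights sum to 1.0 matters when they are overridden through configuration, since a bad override would otherwise silently shrink or inflate the score range.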

4.5 Emergent Behavior Detection

Emergent Detectors (7 per platform)

Pattern-matching detectors that identify emergent behaviors from graph structure changes. Examples: sanctions network mutation, jurisdiction hopping, shell company emergence (Nexus); velocity anomalies, bridge anomalies, community shifts (Signal/Fusion). Each signal includes a stable pattern hash for deduplication and false positive tracking with Neo4j persistence.

Platforms: Signal, Fusion, Nexus | FP tracking: SHA-256 pattern hash | Persistence: FalsePositive label in Neo4j
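
A stable pattern hash can be built by serializing the defining fields of a signal in a canonical order and hashing the result with SHA-256. The sketch below is one way to do that; the choice of fields, the function name, and the in-memory filter (standing in for the FalsePositive label in Neo4j) are assumptions.

```python
import hashlib
import json

def pattern_hash(detector, pattern_type, node_ids):
    """SHA-256 over a canonical JSON encoding of the signal's defining fields.

    Sorting node_ids and the JSON keys makes the hash independent of the
    order in which the detector happened to visit the nodes.
    """
    payload = json.dumps(
        {"detector": detector, "pattern": pattern_type, "nodes": sorted(node_ids)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class FalsePositiveFilter:
    """In-memory stand-in for false positive persistence."""

    def __init__(self):
        self._suppressed = set()

    def mark_false_positive(self, signal_hash):
        self._suppressed.add(signal_hash)

    def should_emit(self, signal_hash):
        return signal_hash not in self._suppressed
```

Once an analyst marks a hash as a false positive, any later detection that produces the same hash can be suppressed without re-evaluating the pattern.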

4.6 Autonomous Agents

LLM-Powered Decision Agents

Three autonomous agents (Sentinel, Warden, Spectre) using Claude as the reasoning engine. Each agent has access to the full graph for context. Decision steps include confidence thresholds — below-threshold decisions are escalated to human operators. Every action is logged and reversible.

Platform: V0id | Agents: 3 | Playbooks: 8 (YAML-driven) | Step types: reason, action, manual, wait
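
The confidence gate described above can be reduced to a small routing function. This is a hedged sketch: the 0.8 threshold, the class names, and the shape of the audit record are assumptions, and no LLM call is made.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Decision:
    agent: str         # e.g. "Sentinel"
    action: str        # hypothetical action name
    confidence: float  # model-reported confidence in [0, 1]

@dataclass
class AuditLog:
    entries: List[dict] = field(default_factory=list)

    def record(self, decision, outcome):
        self.entries.append(
            {
                "agent": decision.agent,
                "action": decision.action,
                "confidence": decision.confidence,
                "outcome": outcome,
            }
        )

def route_decision(decision, log, threshold=0.8):
    """Execute confident decisions; escalate the rest to a human operator."""
    outcome = "execute" if decision.confidence >= threshold else "escalate"
    log.record(decision, outcome)  # every decision is logged, whichever way it goes
    return outcome
```

Logging before returning means escalated decisions leave the same audit trail as executed ones.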

5. Data Pipeline Architecture

Ingestion Pattern

All platforms follow the same async pipeline pattern:

  1. Pollers: Async tasks on configurable intervals (15s to 24h) that fetch from external sources
  2. Queue: AsyncEventQueue (capacity: 50,000) with backpressure (80% pause, 50% resume)
  3. Batch Writer: UNWIND CREATE for new data (3–5x faster than MERGE), MERGE for aggregations
  4. Neo4j: Graph storage with indexes on key lookup fields per label
  5. Post-write hooks: Correlator, AlertEngine, ML cache warming
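
The backpressure rule in step 2 is a hysteresis: pause intake at 80% full and resume only after the queue drains to 50%. The sketch below shows that logic with a plain deque; the class and method names are invented, and the async wiring of the real AsyncEventQueue is deliberately left out.

```python
from collections import deque

class BackpressureQueue:
    """Bounded queue that pauses intake at a high-water mark and resumes at a low one."""

    def __init__(self, capacity=50_000, pause_ratio=0.8, resume_ratio=0.5):
        self._items = deque()
        self._capacity = capacity
        self._pause_at = int(capacity * pause_ratio)
        self._resume_at = int(capacity * resume_ratio)
        self.paused = False

    def _update_state(self):
        size = len(self._items)
        if not self.paused and size >= self._pause_at:
            self.paused = True
        elif self.paused and size <= self._resume_at:
            self.paused = False

    def offer(self, event):
        """Try to enqueue; returns False when pollers should back off."""
        if self.paused or len(self._items) >= self._capacity:
            return False
        self._items.append(event)
        self._update_state()
        return True

    def drain(self, max_batch):
        """Remove up to max_batch events for the batch writer."""
        batch = []
        while self._items and len(batch) < max_batch:
            batch.append(self._items.popleft())
        self._update_state()
        return batch
```

Using two thresholds instead of one prevents the queue from flapping between paused and running when it hovers near a single limit.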

Data Sources (95+)

Category | Sources | Platform
CTI Feeds | NVD, MITRE ATT&CK, OTX, CISA KEV, VirusTotal, AbuseIPDB, MalwareBazaar, URLhaus, Shodan, ThreatFox, PhishTank, Spamhaus DBL, C2IntelFeeds, FeodoTracker, OpenCTI | Signal
News/OSINT | GDELT GKG, GDELT Doc, BBC/Reuters/AP RSS, Reddit, HackerNews, Wikipedia, NewsAPI, Finnhub News, EventRegistry | Fusion, V01d
Economic | FRED (Federal Reserve), ECB (forex), Polymarket (predictions), Crypto Fear & Greed | V01d, Kin0bi
Markets | Binance WebSocket (crypto), Finnhub (equities), ApeWisdom (social sentiment) | Kin0bi
Humanitarian | USGS (earthquakes), WHO (disease alerts), ReliefWeb (crises) | V01d
Academic | arXiv (preprints) | V01d
Sanctions | OpenSanctions, OFAC SDN, ICIJ Offshore Leaks | Nexus
Corporate | UK Companies House, SEC EDGAR, OpenCorporates | Nexus
Identity | BloodHound JSON, LDAP/LDAPS, Azure AD (MS Graph) | 1D
Endpoint | ETW, AMSI, memory scanning, behavioral heuristics (Rust agent) | Raz0r

Retention

Configurable per severity per platform. Default: critical=365d, high=90d, medium=30d, low=7d. Expired raw events are removed by a batched DETACH DELETE every 6 hours. Summaries survive the purge: aggregated EventSummary nodes persist beyond raw event retention.
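
The retention defaults translate directly into per-severity cutoff timestamps. The sketch below computes them and pairs them with an example purge statement; the Event label, the property names, and the batch parameter are hypothetical, since the source does not show the actual query.

```python
from datetime import datetime, timedelta, timezone

# Defaults from the text.
DEFAULT_RETENTION_DAYS = {"critical": 365, "high": 90, "medium": 30, "low": 7}

# Hypothetical Cypher; label and property names are assumptions.
PURGE_QUERY = """
MATCH (e:Event)
WHERE e.severity = $severity AND e.created_at < $cutoff
WITH e LIMIT $batch_size
DETACH DELETE e
"""

def retention_cutoffs(now, retention=DEFAULT_RETENTION_DAYS):
    """Map each severity to the oldest timestamp that is still retained."""
    return {sev: now - timedelta(days=days) for sev, days in retention.items()}

def is_expired(severity, created_at, now, retention=DEFAULT_RETENTION_DAYS):
    """True when an event of this severity is older than its retention window."""
    return created_at < retention_cutoffs(now, retention)[severity]

# Illustrative timestamp only.
example_now = datetime(2026, 3, 1, tzinfo=timezone.utc)
example_cutoffs = retention_cutoffs(example_now)
```

Limiting each delete to a batch keeps individual transactions small, which is the usual reason for purging in batches rather than in one statement.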

6. Technical Differentiation

Unified Graph Architecture

The single most significant differentiator. All 17 platforms share one Neo4j graph. Cross-domain traversal (SIEM event → threat actor → OSINT entity → identity → financial entity) is a native graph operation, not an API integration layer. No comparable product exists in the market.

Mathematically Sound ML

Every algorithm was chosen for a specific mathematical reason. Log returns prevent spurious correlation. MAD handles fat tails. Temporal decay reflects real-world intelligence aging. Granger causality tests for actual predictive relationships, not just correlation. Class weighting combined with stratified cross-validation counters bias toward the majority class. This level of methodological rigor is rare in security tooling.

Full-Stack Vertical Integration

From Rust EDR agent (memory-level) through graph database (storage) to Next.js UI (presentation) to LLM agents (autonomous response). No external dependencies on third-party security platforms. The entire intelligence pipeline is self-contained and architecturally coherent.

Autonomous Incident Response

V0id's LLM agents with full graph access represent the next generation of IR. Unlike rule-based SOAR, each decision step involves reasoning over the entire threat graph. 8 IR playbooks covering the most common incident types. Every action reversible, every decision auditable.

Runtime Configurability

All ML parameters, polling intervals, queue sizes, thresholds, and feature flags are configurable via environment variables at runtime. No code changes required to tune the system. This enables rapid deployment customization for different operational contexts.

7. Intellectual Property Summary

Codebase Metrics

Component | Files | Primary Language | Notable
Signal (RTM) | ~80+ | Python + TypeScript | 3,500+ line main app, 15+ ingesters
Fusion | ~70+ | Python + TypeScript | 68-source fusion engine
Raz0r | ~60+ | Python + Rust + TypeScript | 25 Rust source files (EDR agent)
Nexus | ~65 | Python + TypeScript | 6 OSINT ingesters, 7 emergent detectors
Kin0bi | ~70 | Python + TypeScript | Real-time market pipeline
1D | ~37 | Python + TypeScript | 4 identity ingesters
ANTOS | ~20 | TypeScript | Static analysis documentation
V0id + V01d | ~120+ | Python + TypeScript | LLM agents + 16 pollers + Oracle

Proprietary Algorithms

Data Assets

8. Technical Risks & Mitigations

Key-Person Dependency

The ecosystem was built by a single engineer, so knowledge concentration is high. Mitigation: Consistent architecture pattern across all 17 platforms (FastAPI + Neo4j + Next.js + Caddy). Well-structured codebases. This technical guide, CLAUDE.md files, and dev diary serve as documentation. The uniformity of the stack means onboarding is learning one pattern, not seventeen.

Scalability

Current deployment is a two-server architecture (Hetzner dedicated: Ryzen 9 7950X3D/128GB for Signal+Fusion, Ryzen 5 3600/64GB for all other apps, connected via WireGuard). The graph's 1M+ nodes are well within the capacity of a single Neo4j instance. Horizontal scaling would require Neo4j clustering (supported in Enterprise Edition) and container orchestration (Kubernetes). Mitigation: Docker Compose architecture maps cleanly to K8s. No hardcoded single-instance assumptions.

Third-Party Data Dependencies

Several ingesters depend on free-tier APIs (NVD, GDELT, OTX, OpenSanctions). Rate limits or API changes could disrupt ingestion. Mitigation: Modular ingester design — each ingester is a standalone module. Historical data is preserved in the graph regardless of API availability. Premium API keys can be added via environment variables.

LLM Dependency (V0id Agents)

Autonomous agents depend on Claude API availability and cost. Mitigation: Agents degrade gracefully to manual mode. All playbook steps have manual approval fallback. LLM is used for reasoning, not critical-path execution.

9. Market Positioning

Competitive Landscape

Competitor | Coverage | Graph-Native | Cross-Domain | Autonomous IR
CrowdStrike | EDR + TI | No | Limited | No (human MDR)
Palo Alto (XSIAM) | SIEM + SOAR | No | Limited | Rule-based
Recorded Future | CTI | Partial | CTI only | No
Splunk (Cisco) | SIEM + SOAR | No | Via apps | Rule-based
OpenCTI | CTI | Yes (Neo4j) | CTI only | No
Maltego | OSINT | Yes | OSINT only | No
ninja.ing | All 6 domains | Yes (Neo4j) | Full mesh | LLM agents

Value Proposition

No existing product covers all six intelligence domains in a single graph. Enterprises currently assemble this capability from 10–15 vendors at $500K–$2M/year in licensing, plus integration costs. The ninja.ing mesh provides equivalent or superior coverage with native graph integration and autonomous response, deployable as a single Docker Compose stack.

Go-to-Market Options
