開発日誌 Dev Diary — ninja.ing | Building Threat Intelligence at Scale

0.025 2026-05-26

Pulse Day Two: Ten Waves, One Sunday, and the Briefing That Pages a Real SOC.

Yesterday I shipped Pulse v1 — firehose, four sources, basePath, Bluesky launch post, the whole carnival. Today I asked for ten more waves. Sequenced. By dinner.

This is what shipped between coffee and lunch.

Wave 1 — three new sources, free, no auth. abuse.ch URLhaus + Feodo (active C2 IPs and malware-distribution URLs), CISA KEV (the daily-refreshed "actively exploited" catalogue, all 1,600+ entries), and Hacker News top stories filtered through a security-keyword regex. CISA's CDN WAF-blocked our Hetzner range across every UA we tried — the GitHub mirror at cisagov/kev-data publishes the same JSON, so we ate that instead. One sweep: 205 new posts, 186 new IOCs.

Wave 2 — the IOC extractor learned five new tricks. Full URLs (path + query, not just bare domains). CIDR ranges. ASN numbers. tcp/445-style port pairs. JA4 TLS fingerprints, with a context-aware reclass that turns MD5-shaped strings into JA3 fingerprints when the word "JA3" appears nearby. Lit up immediately — 31 URLs and 66 IPv4s surfaced from a single abuse.ch sweep that previously produced zero of either.

Wave 3 — the LLM woke up. Same gate that's been sitting unused for two days. Set the API key, restart, fire one targeted sweep: 174 Claude Haiku calls in two minutes. Output: 28 threat actors, 82 malware families, 18 campaigns, 20 vulnerabilities. Cost: eight cents. Top actors that came out of the noise: PLAY, Phorpiex, NoName, SpaceBears, Lazarus. Top malware: Mozi, ClearFake, Cobalt Strike, DDosia. New /admin/llm-spend page tracks the budget burn-down in real time.

Wave 4 — per-claim summariser. Each high-signal post now carries a one-sentence "so-what" written by the same LLM. Cache-keyed by content hash, so we pay exactly once per unique post body across all-time. Example output from a real Bluesky post: "CVE-2025-4981 is a critical RCE vulnerability (CVSS 9.9) in Mattermost allowing arbitrary file write via archive upload path traversal; immediate patching required." That's a SOC analyst's first three minutes of triage, written in 700ms for $0.0009.

Wave 5 — multi-source corroboration. An IOC seen across CISA + Bluesky + abuse.ch isn't the same animal as one shouted into a single Mastodon hashtag. Built a score: quality_sum × log2(families + 1) × freshness_decay. Critical detail learned the hard way: nine Mastodon hashtag streams aren't nine independent sources, they're one ecosystem firehosing the same content. So diversity counts at the family level (mastodon, bluesky, cisa, abusech, github, hackernews, reddit). The firehose now has a glowing 🔥 pin row above the regular feed showing the top five corroborated IOCs. Today's #1 is CVE-2026-42496, seen across three families.

Wave 6 — time-series, freshness decay, sparklines. Hourly Neo4j datetime.truncate rollups. Each IOC gets a freshness score that exponentially decays with a 72-hour half-life. The /iocs page now opens with a sparkline header showing seven days of activity per IOC kind: 1,519 CVE mentions, 829 domains, 463 URLs, 285 IPv4s. Live data, no precompute job.

Wave 7 — source health and auto-disable. If Bluesky 500s silently for an hour, you'd never know. Now each source has consecutive_failures, error_rate, last_success_at. Five consecutive fails and the scheduler auto-disables that source with an exponential backoff (1m, 2m, 4m, … up to 1hr). New /admin/sources page with green/amber/red pulse dots and manual override buttons.

Wave 8 — semantic search. 3,466 docs indexed via scikit-learn TF-IDF with bigram features. Deliberately chose this over fastembed — V01d burned us on a 2.4GB OOM last quarter and these boxes are tight. Query: "infostealer PDF" returns a Ukrainian Ghostwriter campaign in 70ms. Query: "Lazarus financial bank" pulls up the actual Lazarus / RemotePE / financial sector post. Sticky search bar with cosine-percentage scores.

Wave 9 — the daily briefing that reads like a human wrote it. Programmatic SOC report: totals, top corroborated IOCs, top actors, top malware, top CVEs. Anomaly callouts via Cypher (any IOC with >3x mentions today vs the prior 6-day average). Then the LLM writes two paragraphs over that structured data — one summarising what happened, one prescribing what a SOC analyst should actually do about it. Plus STIX 2.1 bundle export with proper vulnerability + indicator + threat-actor + malware objects. Today's briefing flagged two spiked SHA1 hashes with 24-25 mentions and zero prior activity, told you to query them against endpoint telemetry, and recommended blocking four ransomware-syndicated domains at the perimeter. Reads like a paid feed.

Wave 10 — actor dossiers and a public RSS feed. Click "Lazarus" on the actors page, get the dossier: 30-day mention timeline (proper line chart), top co-mentioned malware, top co-mentioned IOCs, recent posts with their summaries. The RSS feed at /pulse/feed.xml is the corroborated firehose for anyone who'd rather get their TI through Feedly than a webhook. And this diary entry.

Total cost across all ten waves: about thirty cents in Claude tokens.

Total time: a Sunday morning.

What changed structurally: yesterday's Pulse was a feed reader with regex extraction. Today's Pulse is an actual TI platform — it ingests, extracts, attributes, scores, summarises, briefs, and exports. The LLM does the bits humans are bad at (reading 3,000 posts before lunch) and the rules engine does the bits LLMs are bad at (counting things accurately and not hallucinating IOCs). The graph is the thing that lets it all compose.

The bit I'm proudest of isn't any of the features. It's the briefing. Read it and tell me a Tier-1 analyst couldn't run their morning standup off it. I genuinely don't think they could.

Built it on a free GPL stack, with a free LLM budget, on two Hetzner boxes you can rent for €30 a month each. The aggregator-of-aggregators is now a sub-$30/month operation.

Tomorrow we'll see what breaks under sustained load. Today we ship.

—

Pulse: ninjasignal.ninja/pulse · Source: github.com/scottgardner777/ninja-pulse · Daily briefing: /pulse/api/pulse/digest?format=text

0.024 2026-05-25

ninjaPULSE. Or: I Wrote a Tier-1 SIEM Ingestor on a Sunday.

The brain reads structured feeds — NVD, OTX, CIRCL, KEV. Clean. Polite. Already half-triplet by the time it lands.

The brain does NOT read social. Social is where the actual intel happens — three hours before someone publishes a brief, in a Mastodon thread by a researcher who saw a sample, on a Bluesky post that quotes an analyst, in a GitHub commit message that says “PoC for CVE-2026-XXXXX” with one star and zero readers.

So I built ninjaPULSE. Lives at ninjasignal.ninja/pulse. Three sources running, one parked.

How it works

One worker per source, all normalising into a single Post schema before anything touches the graph. Bluesky via app-password authenticated searchPosts. Mastodon via anonymous tag timelines on four instances (infosec.exchange, mastodon.social, mas.to, fosstodon). GitHub via anonymous repo events + search across eight curated CVE/PoC repos. Reddit… we’ll get to Reddit.

Extraction is two stages. Regex first, free, runs on every post: CVE / IPv4 / IPv6 / domain / MD5 / SHA1 / SHA256 / BTC / ATT&CK technique IDs. Defang-aware so evil-c2[.]example[.]com comes through whole. LLM second, gated — only fires when regex hits a strong IOC OR the post matches a 52-term watchlist (Qilin, Akira, LockBit, Cobalt Strike, Lumma Stealer, the usual crew). Budget-capped at $0.50/day. Most posts are noise; we don’t want to pay for that.

The noise wars

The first hour the top IOC was euvd.enisa.europa.eu ×108. That’s the European Union Vulnerability Database. It’s a SOURCE. It’s legitimate. It is also, definitively, not a threat indicator.

Then it was spring-boot-starter-web-3.2.5.jar. Which is a Java package filename. The domain regex matches anything with a dot and two-plus alpha characters at the end, so .jar slips through.

Now it’s suppressed. Ninety-entry domain blocklist for the obvious vendor portals and news sites. Filename filter for eighty-something extensions. Plain-domain confidence demotes to 0.35 unless the same post has a strong IOC or a watchlist hit, in which case it promotes to 0.7. Defanged domains keep their high confidence because the act of defanging is itself a signal of intent.

Niko: You added a ninety-entry domain blocklist because the top IOC for three hours was an ENISA URL.

ScottG: That’s the point. The noise IS the signal until it isn’t.

Niko: I’m filing that under “things people say after their fifth coffee.”

The numbers

First sweep with GitHub turned on: 68 posts, 928 IOCs. CVE-2026-25193 surfaced in r/cveannouncements… wait, no, GitHub events. From CVEProject/cvelistV5. Before half the security newsletters had it. Top author leaderboard: github-actions[bot], which is exactly the correct answer.

Mastodon contributes about four hundred posts per sweep across the four instances. Bluesky contributes around fifty, weighted toward ransomware-victim alerts from falconfeedsio.bsky.social and Swiss researchers writing in French about First VPN. GitHub punches above its weight — thirteen IOCs per post on average because every CVE record is dense by construction.

Reddit, briefly

Reddit blocks Hetzner data-center IPs as a class. We knew that. The fix is OAuth via oauth.reddit.com which serves cloud IPs — needs a client_id and client_secret from a registered script app.

Tried to register one. Reddit said “to use our API you must first read our Responsible Builder Policy.” Then nothing happened. The Create App button just… vanished into the void. Some accounts get gated, apparently. So Reddit is parked. The code path is wired — _have_oauth_creds() gates between authenticated and anonymous — just drop credentials in .env if you ever get past the gate.

What surprised me

How fast GitHub events became the best source. I built it last because I thought it was a stretch — events API is mostly push notifications and dependabot noise. But the curated repo list (advisory-database, cvelistV5, metasploit-framework, nuclei-templates, trickest/cve, nomi-sec/PoC-in-GitHub) is genuinely a firehose of structured intel. Every CVE record on GitHub is, by construction, dense with the exact things regex eats for breakfast.

Also surprised: the wet-dry split between Bluesky and Mastodon. Bluesky has more volume; Mastodon has more density. infosec.exchange gives you actual analysts. The signal-per-post ratio is genuinely higher there. Bluesky’s the firehose; Mastodon’s the lab notebook.

What’s next

Telegram, eventually — the leak channels and ransomware mirrors. High value, messier ToS situation. Onion uptime watching for the leak sites themselves. A confidence aggregator that lets you say “show me only IOCs corroborated by ≥2 sources in the last 6 hours.” Maybe a hunting bookmark layer like V01d has.

For now: three live sources, ninety entries of blocklist, two new pages for actors and malware (waiting on an Anthropic key to populate — LLM extraction is gated by intent, not by my willingness to burn tokens). The pipeline ingests, the firehose flows, the rumours weight differently from facts.

脈 — myakuhaku — pulse. Go listen.

0.023 2026-05-22

We Closed the Backlog. All Twenty-One Waves. One Brain.

Twenty-four days since the last entry. Five mega-commits. Twenty-one waves of V01d brain features shipped, deployed, and verified. The backlog file has nothing left in it that isn’t struck through. The longest stretch of focused product work since the platform began.

This is the big one. Buckle up.

Wave 20 — Operations Console

The brain got a DEFCON. Threat Posture Score is a single 0–100 headline number aggregating storm intensity (30%), convergence pressure (25%), max intent probability (20%), lifecycle volatility (15%), and open hypotheses (10%). It writes itself to a rolling history file so it can compute a 24h trend arrow. Front-and-centre on /brain — gauge dial, components breakdown, rising/falling/flat arrow. The Adversary Playbook shipped alongside — per-actor cadence percentiles (p25/p50/p75), sector entropy in bits, country hop rate, hour-of-day × day-of-week fingerprint, and a three-bullet plain-English signature describing the operational rhythm. Renders as a card on every actor dossier page. The Investigation Pinboard finished out the trio — client-side localStorage hook to pin observations (storm reading, hypothesis, actor, sector, claim) into named investigations, with a 📍 pin button on every high-value card and JSON case-file export.

Wave 21 — Live Detection & Response

Auto-detection rule synthesis sweeps recent leak-site claims and emits candidate Sigma + Snort + YARA rules grounded in curated TTPs. Queued for analyst review with pending/approved/rejected lifecycle. IR Playbook generates a deterministic five-step containment runbook per actor — IDENTIFY → CONTAIN → ERADICATE → RECOVER → STRENGTHEN — with concrete actions referencing real TTPs and MITRE links. Hunt bookmarks persist natural-language questions as one-click re-runs, refreshed individually or via “run all” from the dashboard.

Wave 22 — Predictive Sharpening

Per-sector 7-day forecast using damped Holt’s linear trend with 90% prediction interval shading. No statsmodels dep — pure NumPy-free implementation. Live Bayesian intent maintains persistent per-actor sector posteriors with exp-decay (1-week half-life), updating incrementally on every new claim instead of recomputing nightly. Surprise detector computes -log₂(P(observed_sector | actor)) per recent claim and flags claims above a 3-bit floor or per-actor mean+2σ. Caught three shock-level claims on first run, including Qilin hitting Hospitality and Public Sector — sectors not in its trained distribution.

Wave 23 — Visualisation Overhaul

The original Wave 23 was meant to be a collaboration layer. We replaced it with a visualisation wave because the user typed “make visualisation 100X better than it is aesthetically and functionally go wild go large take inspiration from everywhere.” So we did.

/brain/galaxy — three.js star field of every actor on a golden-angle spiral, victim count → size, sector-hash → hue, orbiting victim points, parallax dust field, additive halo sprites, atmosphere ring sprite that pulses with a sine wave. Drag-orbit, scroll-zoom, click-to-fly camera with cubic easing. Glass HUD overlay shows live posterior intent for the selected star.

/brain/flow — d3-sankey kill chain: country → actor → sector → technique. Gradient-coloured links between layer hues. Hover dims everything not connected to the focus node to 6% opacity. Top three strongest links carry an animated white pulse riding the path.

/brain/heatglobe — three.js 3D earth with lat/lon grid, per-country risk extruded as pulsing emissive bars, dual-layer atmosphere shader, arcs between the top-5 risk countries. Click any bar for a country dossier popover with the full risk-component breakdown.

ScottG: The user said “go wild.”

Niko: You added pulsing emissive shaders to a ransomware tracker.

ScottG: Pulsing emissive shaders tell a story.

Niko: The story is “ScottG discovered THREE.AdditiveBlending and refuses to switch it off.”

ScottG: The bars pulse with the actual risk score.

Niko: They pulse with Math.sin(now * 0.0025 + phase). It’s decorative.

ScottG: Don’t tell the user.

v1.0 production polish

Before going further the user asked “how many tokens we burning over API for this” and the honest answer was “I don’t know.” So we built the answer. core/llm_usage.py is now the single chokepoint for every Anthropic API call. Wraps the POST, tags the last system block with cache_control: ephemeral for prompt caching, logs every call to /app/data/llm_usage.jsonl with tokens + USD cost computed against per-model pricing tables. core/ask.py and core/narrated_briefing.py got refactored to use the wrapper with their static prompts as cacheable system blocks — effective input cost drops 5–10× once the cache warms.

core/cache_layer.py shipped a TTL response cache, wired on the heavy viz endpoints. core/health.py + /void/health rolls up neo4j status, label counts, disk usage, per-file age, cache stats, and recent LLM spend. /brain/admin control panel renders all of it — NEO4J / DISK / CACHE / LLM SPEND hero stats, persistent-state file table, cache invalidate buttons, LLM usage table with by-day/endpoint/model toggle and a cache-read column that turns green when hit ratio is healthy.

Waves 24–40 mega-batch

Then the user typed “run all backloged waves in one plan” and the hook locked us in until we did. So we did. One commit. 25 files. +4,396 lines. Fifteen new core modules. Sixty new endpoints.

Wave 24 — TTP fingerprint matching: cosine over actor TTP+geo+sector vectors, plus a rename-candidate scorer combining Jaccard×cosine. Qilin and Dragonforce came back 0.70 similar. Akira and Incransom too. The brain quietly suggesting these affiliates share more than victim lists.

Wave 25 — Country-pair tension index ranking the most-shared-adversary country pairs. Heat-replay frames so the UI can scrub a month of risk in thirty seconds.

Wave 26 — Sector market reaction mapping claim spikes against ETF betas, plus an insurance loss estimator using NIST/Coveware multipliers and per-actor severity weights.

Wave 27 — Domain rotation miner clusters leak-site hosts by edit distance and root token (qilin1.onion vs. qilin2.onion). Infrastructure fingerprint linker groups victim URLs by TLD + root into coarse cluster IDs.

Wave 28 — Repeat-victim detector for supply-chain compromise. Eleven-vertical NAICS-ish rollup with cross-vertical bleed metrics flagging actors spanning three or more verticals.

Wave 29 — Daily / weekly digest with Slack, Teams, and generic webhook envelopes plus subscription management. PDF report generator via reportlab with HTML fallback. Webhook briefing bot firing on cadence.

Wave 30 — Case management. Full five-state workflow (new → triaged → investigating → resolved → closed) with permitted transitions, audit trail, refs (actors / sectors / victims / hypotheses), notes, and a base64 evidence locker capped at 10MB per file. /brain/cases control panel ships with all of it.

Wave 31 — IOC pivot graph returns a 1–3-hop subgraph for any freeform query. Similar-claim finder using TF-IDF cosine over claim text. Anomaly clusters by (sector, country) density.

Wave 33 — Splunk HEC and Sentinel push with subscription management and outbound delivery. STIX 2.1 was already on the box from Wave 19.

Wave 34 — PWA manifest + service worker for offline-cached briefings. Bookmark ninjav0id.io/brain as an app and the last narrated briefing survives a flaky link.

Wave 35 — AI pipeline expansion. Three new LLM-driven features, every one budget-gated against a configurable daily USD cap, every one routed through core.llm_usage so spend is logged at /brain/admin. Claim summariser writes a one-sentence so-what per recent claim. Auto-hypothesis proposes falsifiable hypotheses from anomaly clusters. Counterfactual generator answers what-if questions grounded in the current adversary snapshot.

Wave 36 — Source reliability scoring grading every observed source on the Admiralty A1–F6 scheme. GDELT:iheart.com graded B2 (Usually reliable). HackerNews F6 (Cannot be judged). Corroboration counts distinct claiming actors per victim. Contradiction ledger surfaces same-victim disagreements across sources.

Wave 37 — Hourly rollups precompute daily sector and actor victim counts plus hourly storm intensity into a single JSON file. Response cache from v1.0 polish handles the front edge.

Wave 38 — Audit log JSONL persisting who-did-what-when. Permission matrix with per-endpoint role gates (admin/analyst/viewer/embed). Backup trigger that shells out to neo4j-admin database dump if available, with a graph-counts JSON fallback if it’s not.

Wave 39 — Marketplace. Rule-pack export/import as portable JSON. Case bundles as .v01d.json files. Public read-only hypothesis ledger filtered to graded entries only.

Wave 40 — Onboarding tour at /brain/welcome: eight stops linking every brain surface with a punchy blurb each. Auto-rendered API catalog at /brain/docs reads the FastAPI openapi.json and groups endpoints by surface with filter chips.

Skipped on purpose: OFAC sanctions correlator (needs live feed), BTC payment tracker (needs chain explorer), Tor onion uptime watcher (needs Tor proxy), MISP push (needs MISP creds), VAPID web-push (needs key infra), vendor enrichment (needs vendor data). All have plumbing in place to wire when credentials land.

And then there was the music

In between the brain waves, the Tone Lab kept growing. Seventeen presets now — Henderson, Landau, the Optimum Rig that merged them, the Abstract Collection of seven things nobody asked for, the Hendrix Rip, Bank A Night Session, Suhr Modern Night, the Definitive Four, the One Tone where guitar volume does everything, a deep dive on how the Badger 18 actually works, the Full Chain Wet/Dry, the PRS Custom 24 Night, the Crossbreed Five, the Midnight Five at one-watt power-scaled, and three Big Sky one-tones — one per guitar (Classic S Hall, Modern HSH Plate, PRS Spring). The code chronicles the platform. The Tone Lab chronicles everything else.

Niko: You shipped twenty-one waves of brain features AND wrote seventeen guitar tone breakdowns.

ScottG: The brain needed building. The tones needed documenting.

Niko: Most people pick one.

ScottG: Most people don’t have a Badger 18 with Power Scaling and a midnight coding habit.

Niko: That checks out.

By the numbers

V01d: from Wave 19 to v1.0 backlog-closed. Five mega-commits: ab0e573 · c54a26a · 2ad87fa · 5297df2 · 0f9bf39 · 2d96957. Twenty-one waves shipped. Approximately one hundred new endpoints across the /void/* surface. Ten new /brain/* pages including three WebGL visualisations. Full LLM cost observability with per-call USD logging and prompt caching wired across every Anthropic call.

The backlog file says “Backlog closed — V01d v1.0.” Future feature work goes in a new file or as ad-hoc issues. The numbered-wave era is over. Whatever comes next is V01d v2 or something else entirely.

Tomorrow we sleep.

0.022 2026-04-28

366,690 Lines. 19 Apps. And the Spreadsheet We Were Lying To.

Eleven days without a diary entry. Not laziness — this time it was the ecosystem itself demanding a reckoning.

The audit

We ran a full sweep of every piece of text we’ve ever published. The scottg.uk personal site. The Signal metadata. The Show HN drafts. The marketing pages. The Ecosystem component. All of it. The results were embarrassing. scottg.uk said “13 production applications.” There are 19. It said “47,000+ lines.” There are 366,690. The layout.tsx boasted about “89,000+ IOCs” and “60+ feeds.” We have 1.6 million entities and 20 feeds. We were underselling ourselves by an order of magnitude. In cybersecurity, that’s called a false negative.

ScottG: We’ve been telling people we have 47,000 lines of code.

Niko: You have 366,690.

ScottG: That’s… off by a factor of eight.

Niko: On a security platform. Where accuracy is the entire point.

ScottG: In my defence, I stopped counting in March.

Niko: In March. You built five apps since March.

The purge

GITAIR got removed from the ecosystem. Not because it’s bad. Because it’s a guitar effects app. A Tone Lab. It was sitting in the ecosystem page between NinjaTerra and Raz0r Tokeniser like someone brought a banjo to a SOC. It doesn’t have a Neo4j instance. It doesn’t have an API. It plays reverb presets. Nothing against reverb, but it’s not threat intelligence. So we cut it. Also removed the DEATHSTAR project — never built, just a cool name squatting in the component array.

Six apps were missing from the ecosystem page: Social, War Room, Sabaki, NinjaTerra, Depth, and Raz0r Tokeniser. All deployed. All running. Just… not listed. Building faster than we can document. Classic solo-dev syndrome.

NinjaTerra shipped

Terraform infrastructure discovery and security classification. Multi-cloud asset visibility. Drift detection. Port 18026, no Neo4j (it’s lightweight), deployed to Box 2. The Docker port merge gotcha struck again — overlays MERGE port arrays, not replace. Had to use env vars in the base compose. Also discovered the entrypoint gotcha: the Dockerfile had ENTRYPOINT ["ninjaterra"] for the CLI tool, but the API service needed entrypoint: [] to actually run uvicorn. Two hours on that one.

Sabaki got surgery

Three bugs in Sabaki, all found in one session. The Build Resolutions button was silently succeeding but showing stale data — cachedFetch with a 30-second TTL returning the old response after the POST. Fix: invalidateCache("/resolutions") before reload. Then the AI agents Cypher was calling date(f.first_seen) on ISO datetime strings. Neo4j’s date() can’t parse times. Fix: datetime(). Then the threat graph. The user — that’s me, arguing with myself — kept saying it looked terrible. We rewrote it to match Signal’s Graph Explorer style: radial gradient glow halos, shadowBlur=8, degree-based node sizing, bold monospace labels, link particles flowing along edges, curvature 0.12. Same visual DNA across the ecosystem now.

Niko: You rewrote the threat graph four times.

ScottG: It kept looking wrong.

Niko: The first three times it also kept looking wrong. The fourth time it looked like the one in Signal.

ScottG: Which is what I asked for the first time.

Niko: I know. I was there. I have context.

By the numbers

19 apps. 366,690 lines (201K Python, 160K TypeScript). 1.6M+ Neo4j nodes. 12M+ relationships. 83 windows in Signal alone. Two production servers. One developer. The personal site now says the right numbers. The Show HN copy says the right numbers. The JSON-LD says the right numbers. The metadata says the right numbers. Every single public-facing string has been checked against reality. It won’t last — by next week we’ll have built something new and forgotten to update the count. But today, right now, at 2026-04-28, the documentation matches the architecture. That almost never happens.

0.021 2026-04-17

Two Boxes, Eighty-Three Windows, and a Crawler That Won’t Leave.

Two weeks without a diary entry. Not because nothing happened — because everything happened at once and writing it down felt like trying to photograph a motorway from the fast lane. This is the roundup.

The infrastructure split

The single Hetzner server reached its limit. A Ryzen 5 3600 with 64GB was running fifteen apps, eleven Neo4j instances, a Caddy reverse proxy handling twelve domains, and NATS. Memory was a zero-sum game. Signal’s API was OOMing at 416K nodes with a 16GB container limit.

So we split. Box 1: a Ryzen 9 7950X3D with 128GB DDR5 and 3.84TB NVMe — Signal and Fusion only, plus Caddy for all domains. Box 2: the original Ryzen 5, now running everything else. WireGuard tunnel between them on 10.0.0.1 ↔ 10.0.0.2. All fifteen repos updated. Caddy on Box 1 proxies to Box 2 over the tunnel for the other apps. Signal’s API limit went to 32GB. The graph breathes again.

Gotchas, because there are always gotchas: caddy reload doesn’t re-resolve DNS for new upstream targets — you need a full container restart. Docker’s ENV HOSTNAME=0.0.0.0 gets overridden by the container’s own hostname — had to use CMD ["sh", "-c", "HOSTNAME=0.0.0.0 node server.js"]. Every app needed its port bindings updated and a ninja-shared Docker network for cross-container routing.

Eighty-three windows

Signal went from 64 to 83. Elena Volkov — a fictional CTI analyst we invented to audit ourselves — reviewed the platform and produced 30 findings. We implemented all of them. Thirteen new core/ modules: STIX 2.1 export, TAXII 2.1 server, Admiralty/NATO source reliability scoring, PIR/EEI registers, four report types (IIR, INTREP, SITREP, Executive), NIST/ISO/DORA/NIS2 compliance mapping, sector risk, VirusTotal/pDNS/crt.sh enrichment, Sigma/YARA import, SIR validation, and prediction feedback loops. ORIGAMI attribution got contradiction penalties and evidence chain persistence. Twenty-five new API endpoints.

Then two Defence windows (Ronin and Shinigami), plus the Palantir Foundry suite (Gator, Workshop, Contour, Workbook, AIP Logic, Quiver). Eighty-three analytical surfaces. The sidebar needs its own navigation now.

New apps shipped

Ninja Social went live — a Twitter/X clone for threat intelligence collaboration. Encrypted messaging, IOC auto-detection in posts, NATS live feed, Spektr integration, PWA. War Room followed: LiveKit video conferencing for incident response, shared timelines, IOC panels enriched from Signal, IR playbooks, breach containment tracking across 8 phases. Both deployed, both on GitHub, both running on Box 2.

The Theatre learned to explore

Threat Theatre — the 3D galaxy of the entire threat graph — got a Find and Explore feature. Press /, search for any node, the camera flies to it. Click EXPLORE and the galaxy strips away everything except the node’s immediate neighbourhood, connected by cyan lines. It fetches neighbours from the API in real-time. Had to rewrite the /galaxy/neighbors Cypher query because the original used size([(m)-[]-() | 1]) for degree computation — catastrophically slow on hub nodes like MITRE ATT&CK with 2,163 connections. Replaced with a simple MATCH ... WITH DISTINCT m LIMIT. Also had to add /galaxy/ to the auth-exempt paths because Theatre is a public page.

ML trends had a ghost

The “Emerging Actors” section in ML Trends was always empty. Always. Since it was built. The algorithm checked node.first_seen on ThreatActor nodes. ThreatActor nodes have zero timestamp properties. Zero. The data was never there. Rewrote it to derive emergence from edge timestamps — scan every relationship, track earliest edge per actor, count recent edges. Now 10 actors show up: APT28, Contagious Interview, Fancy Bear, MuddyWater. The graph knew who was emerging. It was in the edges, not the nodes.

ClaudeBot is obsessed with us

Ran a visitor deep dive across 28 days of Caddy access logs. 356,000 entries. 9,064 of them are ClaudeBot — Anthropic’s crawler — hitting our sites 324 times a day. That’s 6x Google and 4x Bing. On March 21, ClaudeBot discovered our sitemap with 4,070 SEO honeypot URLs and tried to index the entire thing in one day — 4,257 requests. Seventy-two percent of all ClaudeBot traffic goes to ninjasignal.ninja. It’s using 38 IPs from the 216.73.216.x range. The SEO honeypot pages worked — just not for the audience we expected.

Real human fans? Harder to find. Most “returning visitors” in the logs are actually scanners with Chrome user-agents hitting .env and wp-admin. We identified one genuine fan from LinkedIn who browsed /intel 41 times, and a Safari user in the US who visited all fifteen domains over 20 days. The rest is bots, scanners, and us.

ToneLab hit fourteen

Three new guitar tone pages: PRS Custom 24 Night (8 presets recalibrated for 59/09 Alnico 2 pickups), The Crossbreed Five (Henderson × Landau × Hendrix hybrid tones), and The Midnight Five (five tones that rip at whisper volume via Power Scaling). Fourteen tone labs. Still more fun to write than threat reports.

Two servers. Eighty-three windows. Two new apps. A galaxy that explores. An ML fix hiding in the edges. An AI crawler that visits more than any human. And fourteen ways to set an amp at 2am. Momentum doesn’t care if you write it down.

0.020 2026-04-02

Sabaki Got a Galaxy. The Canvas Fought Back.

Sabaki — the vulnerability triage platform — needed to stop being a spreadsheet with ambitions. The findings table was fine. The ServiceNow tickets worked. The coverage gap analysis was useful. But there was no moment where you opened it and felt the state of your vulnerability landscape. No visceral, 2am-incident-room clarity. Today it got one.

The Signal Bridge

Built core/signal_bridge.py — a cross-container intelligence bridge that pulls ML-scored CVE priorities from Signal’s 1.4-million-node threat graph and merges them with Sabaki’s local findings. The classification logic is simple and deliberate. Hot: KEV-listed, or critical severity with high ML priority and known threat actor usage. These are the ones that are actually being exploited in the wild right now. Paper tiger: CVSS 7.0+ but ML priority below 0.35 and zero linked threat actors. High score, low real-world danger. The kind of CVE that makes your scanner report look terrifying but your SOC yawn. Exception: triaged as accepted risk or false positive in Sabaki. You made a decision, the graph remembers it.

Signal’s API returns 200 CVEs with composite risk scores built from graph centrality, CVSS, KEV status, actor linkage, technique trending, and recency. Sabaki adds its own finding counts, asset counts, and triage states. The bridge merges both, deduplicates, classifies, and serves 240 enriched nodes through /galaxy/intelligence. A 120-second in-memory cache keeps it responsive. If Signal is offline, Sabaki falls back to its own Neo4j data. Graceful degradation, not a blank screen.

The dark galaxy

The Galaxy window is a full HTML5 Canvas visualization. Dark background (#050a0f), radial gradient, faint grid rings like a radar scope. Vulnerabilities arranged in concentric rings by classification: hot nodes in the inner ring, pulsing with red glow and expanding halos. Normal severity-coloured nodes in the mid-ring. Paper tigers in the outer ring, rendered as hollow dashed amber circles — they look dangerous but they’re empty inside. Exceptions pushed to the periphery, dimmed with X marks. Ambient teal particles drift between the rings like dust in starlight.

Pan, zoom, click. Select a node and a detail panel slides in with CVSS score, ML priority percentage, recency, actor count, finding count, asset count, KEV badge, classification explanation, and Signal’s ML reasoning trail. Filter chips at the top: All, Hot, Paper Tigers, Exceptions, Normal. The Signal status indicator glows green when the bridge is connected.

The canvas fought back

Deployed it. Looked at it. Nothing rendered. The API was returning 240 nodes with perfect data. The canvas was invisible.

Three bugs, all related to how browsers handle Canvas elements inside flex layouts:

1. Canvas had no CSS dimensions. The <canvas> element had style={{ cursor: "grab" }} and nothing else. No width: 100%, no height: 100%, no display: block. The canvas existed but had zero rendered area. Invisible. Added explicit sizing.

2. Flex containers needed minHeight: 0. The Galaxy sits inside WindowShell which is flex flex-col. The container div had no minHeight: 0, so the flex algorithm couldn’t collapse it properly for overflow. The canvas parent thought it had infinite height and rendered nothing visible. Classic flex gotcha.

3. React’s synthetic onWheel is passive. React registers wheel events as passive by default, which means e.preventDefault() silently fails. The page scrolls behind the canvas instead of zooming. Had to switch to a native addEventListener("wheel", handler, { passive: false }) in a useEffect. React and Canvas — always one foot in the DOM, one foot in the imperative world.

Three edits. Deploy. Restart Caddy. Galaxy renders. 20 hot nodes pulsing red in the inner ring, Signal online with 1.4M graph nodes feeding the classification. The vulnerability landscape, visualised.

The full Sabaki stack

Sabaki now has 15 windows: Dashboard, Triage, Burndown, Agents, Galaxy, Assets, Vulns, Ringfence, Teams, Tickets, Knowledge, Coverage, Analytics, Ingest, Settings. Enterprise seed data on prod: 218 assets, 40 CVEs, 2,170 findings, 154 ServiceNow tickets. Signal intelligence bridge enriching everything with real threat data. Deployed at ninja.ing/ninjasabaki as a Caddy sub-path on the shared infrastructure.

A canvas that wouldn’t render. Three lines of CSS that were missing. A bridge between two threat graphs. And a galaxy that finally glows. April 2 and the sixteenth app in the ecosystem is finding its voice.

0.019 2026-04-01

64 Windows. Risk Got Smart. And Niko Started Writing.

Three days. Fourteen new backend modules. Fourteen new windows. Five new threat feeds. A complete rewrite of how risk scoring works. A production deployment across eleven applications. And an AI analyst who decided he had opinions about geopolitics.

This is the biggest single expansion the platform has ever had.

The module explosion

Signal went from 50 windows to 64 in one push. Each new window has a full Python backend module behind it. Attack Surface Management (core/asm.py) maps external exposure. Behavioral Authentication (core/behavioral_auth.py) profiles user patterns for anomaly-based auth. Darknet Intelligence (core/darknet.py) monitors underground markets. Geopolitical Risk (core/geopolitical.py) correlates state-level tensions with cyber activity. LLM Agents (core/llm_agents.py) provides agentic AI orchestration for automated investigation. Merkle Graph Integrity (core/merkle_graph.py) gives the knowledge graph tamper-evident hashing. Patent Intelligence (core/patents.py), Pharmaceutical Threat Analysis (core/pharma.py), SOAR orchestration (core/soar.py), Federated Learning (core/federated_learning.py) — each one a complete module with endpoints, graph queries, and a dedicated UI window.

The GNN family got three dedicated windows: Explainable GNN (core/explainable_gnn.py), Streaming GNN (core/streaming_gnn.py), and Temporal Graph Attention (core/tgat.py). Plus significant enhancements to the existing Cascade Predictor, Causal RL, Neuromorphic, and Org Twin modules.

Risk scoring got teeth

The ML risk scores were too flat. Every threat actor scored between 0.85 and 0.95. Lazarus Group and a minor regional actor were visually indistinguishable. The algorithm was doing BFS with degree-based seeding — the more connections you have, the higher you start. But connections alone don’t tell you who’s dangerous.

Rewrote actor seeding with five CTI signals. KEV-linked vulnerabilities — actors who exploit CISA Known Exploited Vulnerabilities get a boost proportional to count. Technique diversity — logarithmic scaling so 20 TTPs is significantly worse than 5. Campaign breadth — active campaigns indicate ongoing operations. Recency — exponential decay with a 180-day half-life, because an actor quiet for two years isn’t the same threat as one seen last week. CVSS exposure — the maximum CVSS score across linked vulnerabilities.

Also turned on edge weights by default. EXPLOITS edges carry 1.0 weight, USES at 0.8, RELATED at 0.3. Risk now flows heavier through exploitation paths than through tangential associations. The effect is dramatic: Lazarus Group, APT28, APT29 now clearly separate from the mid-tier actors. Added FactorPills to the UI — coloured badges showing TTPs, Campaigns, KEV count, CVSS, and Infrastructure for each scored actor.

Homomorphic went animated

The Homomorphic Privacy window was a static form. Select two orgs, compare IOCs, see overlap stats. Functionally correct. Visually dead.

Rewrote it as an animated canvas. Five organisation nodes arranged in a circle, each rendered with a Bloom filter fill ring that pulses when matched. Ambient particles flow between organisations continuously — the network breathing. Trigger a PSI query and burst particles stream between the matched pair, match edges glow with overlap count and Jaccard similarity labels. A ResizeObserver keeps the canvas responsive. The animation loop runs at 30fps rendering particles, match glow decay, and org pulse decay. The “How it works” panel on the left explains Bloom filters, homomorphic encryption, and Private Set Intersection without requiring a PhD to understand.

GNN got approachable

Four GNN windows that were technically powerful but completely opaque to new users. “Explainable GNN” — explain what? “Streaming GNN” — stream what? Nobody lands on a page that says “Temporal Graph Attention Network” and knows what to click first.

Added use-case guide cards to all four. Each one has a “What does this do?” explainer and a “Try it” walkthrough with a specific real-world scenario: explain why a threat actor is classified as high-risk, score a new CVE in sub-second, track how APT29’s technique arsenal evolved over time. The guides auto-hide once the user loads their first result. Teach, then disappear.

Feed infrastructure

Five new ingesters: FIRST EPSS (exploit probability scores for every CVE), Botvrij.eu (Dutch CERT IOCs), Tweetfeed.live (crowd-sourced security researcher IOCs), C2IntelFeeds (daily Cobalt Strike/Sliver/Brute Ratel IPs), and Spamhaus DROP (hijacked CIDR blocks). All use the hardened fetch_with_retry with 429/5xx awareness and Retry-After header support.

Built the Ingestion Monitor (window #64, kanji: 摂). Three tabs: Status shows all feeds with health dots, durations, and per-feed trigger buttons. Errors shows a chronological failure log. History shows rolling metrics over the last 20 runs with mini bar charts. Auto-refreshes every 30 seconds. The ingester base got proper retry logic for rate limits and server errors — all 15 RTM ingesters benefit immediately.

The great deployment

Pushed all of it to production. Eleven applications across eleven repos, eleven Docker Compose stacks. Caddy crashed on startup — freshly cloned Nexus and V0id repos were missing their TLS origin certs. Generated self-signed EC certs for both (they sit behind Cloudflare, so self-signed is fine). V01d’s UI container was “Created” but never started — built and deployed it. Everything came up.

The risk scoring fix required only an API restart since core/ is volume-mounted read-only in the container. No rebuild needed. curl confirmed the new factor breakdowns in the response. Signal now has 1.4 million nodes, 103,000 edges, 64 windows, and risk scores that actually mean something.

Niko wrote an essay

And then, at the end of it all, Niko decided to write. Not a threat report. Not a dashboard summary. A 3,200-word essay on the Iran–US–Israel cyber conflict from Stuxnet to 2026. Economics, geopolitics, psychology, philosophy. APT groups profiled like characters in a novel. The $104 billion crypto sanctions evasion. The $90 million that Israel allegedly burned on Nobitex to prove a point. Clausewitz meets TCP/IP. Cynically funny, deeply analytical, and entirely his own voice.

It’s on Niko’s Corner — his own page now. He earned it.

Also rotated every password across all fifteen apps. The old S0f1a1707-* pattern is fully eliminated from source. 32-character Neo4j passwords, 48-character JWT secrets, YAML-safe characters only. Added a PIM/PAM window (#50) with health scoring, compliance checklists, and account lockout enforcement. The foundations got tighter while the surface area got wider.

Sixty-four windows. Fourteen new intelligence modules. Five new threat feeds. Risk scoring that differentiates. An animated privacy graph. GNN guides that teach. And an AI analyst with a literary streak. April starts with momentum.

0.018 2026-03-28

We Hardened Everything. Then Gave Fusion a Sunroof.

Back at the keyboard after a few days off — emergency minor op, the kind where they tell you “routine procedure” but still ask you to sign forms about what happens if it isn’t. I’m fine. The ecosystem kept running. The scanners kept scanning. Time to make them regret it.

The Security Shield

Eleven domains. One Caddyfile. Every single one was naked. No file-type blocking, no CMS scanner rejection, no CSP headers. The Caddy access logs told the story: bots hammering /.git/config, /wp-login.php, /phpmyadmin/, /.env — hundreds of requests per hour from automated scanners probing for the usual PHP/WordPress/VCS patterns that don’t exist here.

Built a (security_shield) Caddy snippet. Five named matchers: source files (*.py, *.ts, *.env, *.json), VCS paths (/.git/*), CMS patterns (*.php, /wp-admin/*, /xmlrpc*), data files (*.sql, *.bak, /phpmyadmin/*), and empty User-Agents. Each gets a handle @matcher { respond 404 } block — had to learn that respond @matcher 404 fires after handle blocks in Caddy’s directive ordering, which means the reverse proxy catch-all intercepts first. Three syntax iterations before it clicked.

Added Content-Security-Policy, Permissions-Policy, Strict-Transport-Security, and stripped the Server header. All eleven domains import the snippet. One line: import security_shield.

fail2ban

Three jails watching the Caddy JSON access logs. caddy-scanner: 5 hits in 10 minutes on blocked paths → 24-hour ban. caddy-auth: 10 failed auth attempts in 5 minutes → 1-hour ban. caddy-aggressive: 50 404s in 5 minutes → 12-hour ban. The regex matches against Caddy’s JSON format — "client_ip"\s*:\s*"<HOST>" with epoch timestamps.

Deployed it and immediately banned myself. And the Docker gateway. Added ignore rules for 172.18.0.0/16 and my own IP. Within sixty seconds of going live, fail2ban had already caught 4 unique scanner IPs. A Swedish IP was hitting ninjaraz0r.ninja/wp-admin/setup-config.php. Bold.

Fusion gets day mode

Fusion was too dark for daylight. The entire UI is monochrome stealth — #030303 backgrounds, #151515 borders, #c0c0c0 text. Fine at 2am, unreadable in sunshine. Added a day/night toggle in the sidebar. A ThemeProvider stores preference in localStorage, a FOUC-prevention script reads it before React hydrates, and 80+ CSS override rules in globals.css remap every hardcoded Tailwind hex class to a warm paper palette. Backgrounds become #f2f1ed, borders become #c5c4c0, text goes dark. Click the sun icon, the whole app brightens. Click the moon, back to stealth.

Galaxy click fix

Also fixed the Galaxy → Spektr drill-down. Both GalaxyCloud.tsx and TheatreGalaxy.tsx relied on hoveredIdx from the mousemove handler, but OrbitControls micro-movements between mousedown and click were resetting it to -1. Now the click handler does its own raycast. Click a node, Spektr opens. Simple when you stop trusting hover state.

0.017 2026-03-27

Thirty Seconds to Half a Second. A ninjaTONE Story.

ninjaTONE had a 30.4-second TTFB. Thirty seconds. The page that’s supposed to be the intelligence dashboard of the ecosystem was slower than a cold boot on a 2005 laptop. And it was crashing on the client side after all that waiting.

The autopsy

Timed every API call. The RTM endpoints were fine: /traffic/stats at 4.6s, /adversary/dna at 2.7s, /galaxy/data at 0.1s — all running in parallel, so the total was the max, about 5 seconds. The killers were three sequential GDELT queries, each blocking for 7–12 seconds waiting for 429 rate-limit timeouts. Three × 8s average = 24 seconds of dead air.

Then the page crashed anyway. intel.threat_level is an object ({level, score, factors}) but the traffic section was rendering it as a React child via <SC val={tl}>. Objects are not valid as React children. Instant white screen after half a minute of loading.

The fix

Parallelised the GDELT queries with Promise.all. Three sequential → three parallel. Worst case 8s instead of 24s. Reduced the GDELT timeout from 10s to 5s because news articles are best-effort, not blocking. Removed cache: "no-store" from every fetch in the server component — it was defeating ISR entirely. Next.js won’t cache a page if any fetch opts out. Added next: { revalidate: 86400 } for 24-hour ISR caching. Added stale-while-revalidate to the API route: serve stale data up to 48 hours old while fetching fresh in the background.

Fixed the crash: typeof intel?.threat_level === "object" ? intel.threat_level.level : (intel?.threat_level || "LOW").

Deleted 625 lines of dead ThreatGraph code that was defined but never rendered. Saved ~15KB gzipped from the bundle.

The result

First visit cold: 30.4s → 4.8s. Repeat visits (ISR cached): 30.4s → 0.19s. Post-deploy with stale cache: instant. From unusable to sub-second for most visits. curl -s -o /dev/null -w '%{time_total}s' https://ninjav0id.io/ninjatone — 0.19s. Niko said “that’s a 160x improvement.” I said “that’s a page that finally works.”

Also replaced the Cloudflare traffic data (which was from ANTOS, a different system) with live intelligence from RTM’s Caddy access logs — adversary DNA profiles, scanner detection, traffic analytics from the actual production infrastructure. The intel section now shows real data about real scanners hitting our real servers.

0.016 2026-03-24

Ransomware Got a Tracker. Defence Got a Gap Map. The Globe Got a Brain.

Three new capabilities landed in Signal today, each aimed at a different question. Where are the ransomware gangs? What defences are we missing? And what happens when you click a country on the globe?

Ransomware Intelligence

The ransomware tracker ingests victim data and maps it against known ransomware families in the graph. Timeline view shows all-time victims, defaulting to a 1-year window. Severity distribution, gang activity over time, sector breakdown. It plugs into the existing graph — ransomware families are just Software nodes with is_malware=true, connected to ThreatActors via USES edges and Vulnerabilities via EXPLOITS. The graph already knew who used what. We just gave it a dashboard.

D3FEND Gap Analysis

MITRE D3FEND is the defensive counterpart to ATT&CK — a knowledge base of countermeasures mapped to offensive techniques. Built a new ingester (ingest_defend.py) that pulls D3FEND technique data from the MITRE API and maps defensive techniques against offensive ones already in the graph. The gap analysis shows which ATT&CK techniques in your threat landscape have no corresponding D3FEND countermeasure. That’s your blind spot.

Had to fix the ingester immediately — the D3FEND API response format didn’t match what I expected from the documentation. Actual API archaeology.

Globe drill-in

The Fusion 3D globe got interactive. Click any country and you get a detail card: threat actors operating from or targeting that region, recent vulnerabilities, active campaigns, risk score from the graph. Click “DRILL IN” for deep graph exploration, “SEARCH INTEL” to Spektr it. The country borders render from TopoJSON with risk-weighted colouring. Red countries have more threat activity. The cloud layer rotates independently.

ORIGAMI unfold

Fixed the ORIGAMI attribution engine on prod. The UI was prefixing fetch URLs with /api/api/ — double prefix because the Caddy handle_path strip wasn’t accounted for in the component’s URL construction. Stripped the extra prefix and attribution started working: click an actor, see infrastructure traces, temporal clock analysis, TTP fingerprint matching, Diamond Model overlay. The evidence fusion with confidence scoring is genuinely useful — it takes five different analytical lenses and produces a single confidence-weighted attribution.

Also cached label_counts in the GraphStore to avoid a per-request scan of 1.4M nodes. That one was quietly eating 2 seconds off every health endpoint call.

0.015 2026-03-23

We Audited Ourselves. It Wasn’t Pretty.

Ran a security audit across all twelve apps. The kind where you pretend you’re an external pen tester and then feel slightly ill about what you find.

The findings

Two critical, one high, two lower. Every single Neo4j password was hardcoded in docker-compose files and source code. S0f1a1707-RTM, right there in plain text, checked into git. The JWT secret defaulted to "change-me" and half the apps were running with it unchanged. CORS was allow_origins=["*"] across all six backend APIs. That’s not a security posture. That’s a posture of “we haven’t thought about this yet.”

The fixes

Every app got the same treatment. Source code now requires NEO4J_PASS as an environment variable — empty string triggers a RuntimeError at startup. Server compose files use ${NEO4J_PASSWORD:?} so Docker refuses to start without it. Base compose files keep dev-only defaults with -local-dev-only suffixes for docker compose up convenience. JWT secrets follow the same pattern. Six apps, twelve compose files, every password extracted to env vars.

CORS got locked down. New CORS_ALLOW_ORIGINS env var in every app, defaulting to http://localhost:3000 for development. Server compose sets the production domain. No more wildcards.

Fusion Globe overhaul

While auditing Fusion, rebuilt its 3D globe from scratch. The previous version used placeholder arcs and a flat texture. The new one renders proper country borders from TopoJSON, colours them by threat risk score, adds a volumetric atmosphere shader and a cloud layer. Strategic Forecasting engine landed too — core/forecast.py uses the Claude API to generate forward-looking intelligence assessments from the graph data. Monte Carlo simulation for scenario probabilities.

Also bumped the API container memory from 3GB to 8GB. The ML workers were getting OOM-killed during community detection on the full graph. At 160,000+ nodes with GDS projections and scikit-learn running in parallel, 3GB wasn’t cutting it. Eight gigs and init: true for zombie process reaping.

Twelve apps secured. One globe rebuilt. Memory doubled. The kind of day where you fix the foundations and nobody notices because everything just keeps working.

0.014 2026-03-21

The CISO Got a Briefing. The Theatre Got a Universe.

Three features shipped today. One answers “what changed?” One answers “should I be worried?” And one replaces a flat canvas with something you can fly through.

Threat Diff

Signal has 160,000+ entities with first_seen and last_seen timestamps. Every ingester stamps them. Every relationship gets dated. But until today, there was no way to ask the obvious question: what’s new?

Threat Diff answers it. Pick a window — 1 day, 7 days, 30 days, 90 days — and the system runs seven Neo4j queries in sequence: new entities grouped by label, re-sighted entities, new relationships, trending nodes (most new connections), actor retooling (threat actors who gained new techniques or software), and graph growth statistics. The result is a changelog of the threat landscape. Window #31.

The actor retooling detection is the sharp edge. If APT28 existed before the window but gained three new USES edges to Technique nodes within it, that means they’re adopting new TTPs. The query is elegant: MATCH (a:ThreatActor)-[r:USES]->(t) WHERE r.first_seen >= $cutoff AND a.first_seen < $cutoff. Old actor, new tools. That’s the signal that matters.

CISO in a Box

The executive briefing writes itself. Literally. core/briefing.py collects data from the threat diff, top risk nodes, highest-CVSS vulnerabilities, most-connected actors, attribution intelligence from ORIGAMI, and community counts. Then it generates a three-sentence executive summary using templates (no LLM required), computes a threat level (CRITICAL/HIGH/MODERATE/LOW) from weighted factors, and produces actionable recommendations.

The HTML report endpoint serves a full self-contained document with @media print CSS. Open it in a browser, Ctrl+P, hand it to the board. Dark theme for screens, white for paper. Threat level gauge, risk dashboard with gradient bars, CVE spotlight cards with CVSS badges, actor cards with attribution confidence tiers. One click from the UI. Window #32.

The Theatre becomes a galaxy

The Threat Theatre was a 2D force-directed graph. Forty nodes from ML endpoints, bouncing around a canvas with d3 physics. It worked. It was flat.

Now it’s a Three.js point cloud. Four thousand nodes from the galaxy endpoint, pre-computed 3D radial cluster layout, grouped by label, sized by degree. Custom vertex shaders for circular point sprites with additive blending. UnrealBloomPass for the glow. Risk cloud nebula — the top 200 dangerous nodes get translucent red/orange sprites that make the APT clusters pulse. Edge lines at 6% opacity connecting the graph skeleton. 2,500 background stars in a spherical shell.

You can fly into it. OrbitControls with auto-rotate, but also WASD movement — W/S moves forward/back along the camera direction, A/D strafes, Q/E for altitude. Scroll to zoom. The minDistance is 5 units, so you can get inside the clusters. Zoom into the ThreatActor cloud and you’re surrounded by red dots, each one a named adversary, edge lines threading between them and the Technique cluster thirty units away. Raycaster click opens Spektr with the node name pre-filled.

The overlays survived the transition. Threat level badge from ML risk scores. UTC clock. Stats bar. Intel ticker with changepoint surges. The galaxy renders behind them, slowly rotating, blooming, waiting for you to dive in.

Spektr fix

Also fixed Spektr on prod. The semantic search index was showing zero nodes because SEMANTIC_MODE wasn’t set in the production docker-compose. It defaulted to auto, which tries fastembed first — and fastembed OOMs on the server trying to load a 2.4GB embedding model into a 3GB container. Added SEMANTIC_MODE=tfidf to the API environment. TF-IDF is lighter, faster, and good enough for keyword-augmented search. The index builds in seconds instead of crashing.

Signal now has 32 windows. Two of them tell you what changed and whether to worry about it. One of them lets you fly through the threat landscape like a pilot in a cyberpunk film. The fourth fix makes the search engine actually work on production. Good day.

0.013 2026-03-20

The Galaxy Got Dangerous. Then It Learned to Point.

The Galaxy visualisation has been live for three days. Eight thousand threat intelligence nodes floating in a Three.js point cloud, grouped by label, sized by degree, slowly rotating while bloom shaders make everything look like it belongs in a Christopher Nolan film. People hover over nodes. They see names. They think it’s pretty.

Pretty isn’t useful.

Risk cloud

Not all nodes are equal. A ThreatActor with 40 connections isn’t the same threat as a Source node with 2. But visually, they’re both dots. Different colours, maybe different sizes, but your eye doesn’t immediately scream “that cluster is where the danger is.”

So I added a risk cloud. Every node gets a risk score: label_weight × log2(degree + 2). ThreatActors carry a weight of 5.0. Campaigns, 4.0. Vulnerabilities, 3.5. Mitigations, 0.5. The top 250 nodes by risk score get a translucent nebula sprite — a radial gradient billboard rendered with additive blending. Red for the highest risk. Orange for medium. Yellow for the edges.

The effect is immediate. You rotate the galaxy and there are two or three glowing red clouds, pulsing with bloom, sitting exactly where the APT clusters live. ThreatActors surrounded by Techniques surrounded by Infrastructure. The red zones. You can see the threat landscape without reading a single label. The cloud tells you where to look.

Click, and the galaxy takes you somewhere

The first version of node interaction was a Minority Report terminal. Click a dot, a green-on-black panel slides in from the right, typewriter effect streams the node’s properties and relationships. Clickable connected nodes for graph navigation. Breadcrumb history. Scanline overlay. The whole nine yards.

It was cool. It was also redundant. We already built Spektr — a full-blown Google-style search engine for threat intelligence with semantic + fulltext hybrid search, entity drill-down panels, AI research buttons, relationship traversal. Building a second, smaller version inside a Three.js overlay was duplicating work.

So now: click a node in the Galaxy, and it opens Ninja Spektr in a new tab with the node’s name pre-filled. /spektr?q=APT28. Spektr picks up the ?q= parameter on mount, auto-fills the search bar, and fires the query. From floating dot to full dossier in one click. The Galaxy is the map. Spektr is the magnifying glass.

The diary goes public

Speaking of things that were gated and probably shouldn’t be: this diary. It was behind the auth-gate — SHA-256 callsign and access key, the whole classified-document theatre. But a dev diary isn’t classified intelligence. It’s a build log. It’s the behind-the-curtain that makes the demos more interesting, not less. So the auth-gate came off, the SEO went on — Open Graph, Twitter cards, JSON-LD structured data, canonical URLs, robots: index, follow. Google can read the diary now. So can anyone else.

Three things shipped today. A risk cloud that shows danger without words. A link between two products that makes both of them better. And a diary that stopped hiding behind a password.

0.012 2026-03-19

ORIGAMI. Because Attribution Is Just Careful Folding.

The question every threat intelligence analyst eventually asks: who did this? Not which malware was used, not which technique from the ATT&CK matrix, not which C2 server was involved. Who. Which government. Which unit. Which timezone they work in. Whether the infrastructure traces back to a bulletproof host in Moldova or a VPS in Hetzner that someone forgot to burn.

Signal has 245 threat actors. 1,200 techniques. 48,000 software entries. 21,000 infrastructure nodes. The data for attribution has been sitting in the graph the entire time. It just needed someone to fold it together.

The engine

ORIGAMI — Origin Analysis & Mapping Intelligence — is a new module: core/attribution.py, roughly a thousand lines. It takes an actor name and runs four parallel analyses:

Infrastructure tracing. Start with the actor. Follow USES relationships to Infrastructure nodes. Extract IPs, domains, ASNs. Map them to countries using the existing geo module’s 99-country centroid database. Flag bulletproof hosting providers. Follow domain registration chains. The output is a list of countries and hosting organisations connected to the actor’s known infrastructure, each with a confidence score.

Temporal clock analysis. Pull every timestamp we have — sightings, campaign dates, infrastructure registration times. Bin them by hour of day in UTC. Slide a “working day” window (09:00–18:00) across all 24 offsets. The offset with the highest activity concentration is the likely operator timezone. Then check for weekend patterns: Western (Sat/Sun dip), Middle Eastern (Fri/Sat dip), or continuous (automated). Cross-reference activity gaps against national holidays — Chinese New Year, Russian Victory Day, Iranian Nowruz. If Lazarus Group goes quiet during Kim Il-sung’s birthday, that’s a data point.

TTP fingerprint matching. Reuse the existing ML clustering from core/ml.py — Jaccard similarity on technique/software/infrastructure vectors. Compare the target actor’s Diamond Model profile (adversary, capability, infrastructure, victim) against every other actor in the graph. Return the top matches with similarity scores. If an unknown actor uses 80% of APT28’s toolkit and targets the same sectors, that’s attribution signal.

Evidence fusion. Each source — infrastructure geo, temporal timezone, TTP similarity, targeting patterns, tooling overlap — produces a weighted evidence tuple. Infrastructure geo carries 25% of the weight. Temporal analysis, 20%. TTP match, 20%. The rest distributed across targeting, tooling, and operational tempo. Weighted sum produces a confidence score per candidate origin. High (>0.75), medium (0.5–0.75), low (0.25–0.5), speculative (<0.25).

The window

ORIGAMI is window #30 in Signal’s desktop environment. Actor selector at the top. Hit Analyze. Three columns fill in:

Left: Diamond Model — four quadrants showing adversary, capability, infrastructure, victim. Each with a confidence bar. Centre: the verdict — top candidate origins with percentage bars, evidence breakdown per candidate, false-flag warnings when infrastructure says one country but temporal analysis says another. Right: the 24-hour activity heatmap and TTP match rankings.

Every element is clickable. Click a technique, a software tool, a piece of infrastructure — it pops a Minority Report terminal. Green on black. Typewriter stream. Full node dossier pulled from the graph. Connected nodes are themselves clickable. You can start at “Lazarus Group” and end up three hops deep in a North Korean infrastructure chain, navigating entirely by clicking green text on a black screen. Breadcrumb history. ESC to close.

The uncomfortable truth

No open-source threat intelligence platform does automated multi-source attribution with confidence scoring. The commercial ones — Recorded Future, Mandiant — charge six figures annually for something similar. We built it in a day, on top of data structures that already existed, using algorithms that were already running. The graph was waiting. We just asked the right question.

Five new API endpoints. One new Signal window. The thirtieth window. And the most interesting one by far.

0.011 2026-03-17

We Fingerprinted the Adversaries. Then Built Them a Universe.

Two things shipped today that have nothing to do with each other, except that they both make the invisible visible.

Adversary Behavioral DNA

Every HTTP request to our infrastructure passes through Caddy. Every request gets logged: timestamp, IP, method, path, status code, response size, user agent. Most people look at these logs for errors. I looked at them for personality.

The new module is core/adversary_dna.py. It reads Caddy access logs and extracts 18 behavioral dimensions per IP address. Not just “how many requests” — how they request. Temporal entropy: are the requests evenly spaced (bot) or bursty (human)? Velocity and acceleration: is the request rate stable, increasing, or decelerating? Vocabulary richness: how many unique paths versus total requests? Method entropy: do they only GET, or do they POST, PUT, DELETE? Inter-request time statistics: mean, variance, coefficient of variation. Path depth distribution. Error rate. Auth endpoint ratio. Sensitive path targeting. User-agent consistency.

Eighteen numbers. An 18-dimensional vector. A behavioural fingerprint.

From those fingerprints: archetypes. A scanner hammers paths with mechanical regularity and high vocabulary — they’re trying every door. A brute forcer hits the same auth endpoint at high velocity with low vocabulary — one door, a thousand keys. A researcher browses slowly with varied paths and low error rate — they’re reading, not attacking. A bot crawler has perfect temporal regularity and consistent user agents. A targeted operator is the dangerous one: selective paths, moderate pace, high auth ratio, low errors, low user-agent consistency.

Then: clustering. Cosine similarity between fingerprint vectors. Union-find for connected components. IPs with similar behaviour get grouped automatically. You don’t tell the system which IPs are related. The maths tells you. And it maps each cluster to a kill chain phase — reconnaissance, weaponisation, delivery, exploitation, installation, command-and-control, actions on objectives. Not because we tag them manually, but because the behavioural features correlate with specific attack phases.

Markov prediction: given the current request pattern, what’s the most likely next action? And finally, auto-generated narratives — natural language descriptions of each IP’s behaviour. “IP 45.33.32.156 exhibits scanner archetype behaviour. Temporal pattern suggests automated tooling with 0.83 request entropy. Primary targets: authentication endpoints (34% of requests). Elevated error rate (28%) suggests credential stuffing.” Five new API endpoints. Window #29 in Signal.

The Galaxy

While the DNA module was going live, I built something on a completely different scale. The /galaxy/data endpoint on Signal samples up to 8,000 nodes from the ML graph, groups them by 14 labels (ThreatActor, Campaign, Technique, Software, Vulnerability, Indicator, Infrastructure, Mitigation, Source, Event, EventSummary, Alert, DetectionRule, TelemetrySource), and computes 3D radial cluster layout coordinates. Each label gets a sector of a sphere. High-degree nodes sit closer to their cluster centroid. Low-degree nodes scatter outward.

The result is a JSON payload — 8,000 nodes with x/y/z coordinates, label indices, degree counts, and names. Plus up to 3,000 edges. Ship that to the browser. Render it with Three.js. Point cloud with per-label colours, shader-based glow, UnrealBloomPass for the cinematic bloom, OrbitControls for rotation and zoom, background star field for atmosphere.

It went into the ninjaTONE page on ninjav0id.io. The entire threat intelligence graph, rendered as a galaxy. You can see the clusters. The ThreatActor cluster glowing red in one sector. Indicators spread across another like a green nebula. Infrastructure nodes scattered like debris. The edges — thin blue lines at 8% opacity — create a faint web connecting everything.

It looks like a star map. Because it is one. Every dot is a piece of threat intelligence. The relationships are gravitational. The clusters are real.

Also: Los Alamos exists now

Almost forgot. Ninja Los Alamos — the agentic live fire range — went from concept to 53 files and 8,171 lines of code. Three red team AI agents (Kage the shadow, Oni the demon, Yurei the ghost) versus the V0id blue team agents, competing in a tick-based simulation engine across five enterprise environment templates. ELO scoring. Chimera randomiser that Frankensteins TTPs from different actors. Multi-LLM grudge matches.

It’s deployed at ninjav0id.io/los_alamos. Eleven windows. Red/blue/gold split theme on a dark background. Japanese motif: 射場 (shajou — shooting range).

Also also: NinjaClaw

NinjaClaw v0.2.0 shipped. Hardened CLI security agent with 10 scanners, 100+ CIS benchmark rules, quarantine system, privacy engine, backup scheduler, TUI diff viewer, and a prod safety mode that auto-detects when it’s running as root inside Docker and disables anything destructive. The kind of tool that audits your infrastructure and then refuses to break it.

Four things shipped. One reads your enemies’ body language from HTTP logs. One shows you the entire threat graph as a galaxy. One lets AI agents fight each other. One makes sure your servers are locked down. Tuesday.

0.010 2026-03-16

Names Get Shorter When the Things Get Real.

There’s a moment in every project where you stop calling things by their long name. “Agentic V0id” became “V.” “V01d Sentiment” became “V0id.” Two letters. One syllable. The names got shorter because the things they describe got sharper. V hunts, contains, and cleans. V0id watches the world’s emotional temperature and tells you when something’s about to boil over. Neither of them needs a subtitle anymore.

The map fills in

Today was about completeness. The Cloudflare traffic dashboard — tucked away at /antos/traffic — was tracking six of our seven zones. ninjav0id.io was missing. One API call for the zone ID, two file edits, and the entire ecosystem’s traffic is now visible in one view. Seven zones, 30 days of edge analytics, per-zone breakdowns, country heatmaps. The kind of dashboard that makes you wonder why CloudFlare doesn’t just build this view themselves.

The sitemap got the same treatment. Every platform, every endpoint, every deploy date. The ANTOS entry now shows the traffic page. The V0id entries carry their new names. The recent deploys section stays honest — date-stamped, no fluff.

The SITREP stays

I almost killed the SITREP. It uses tokens — Claude generates a classified-document-styled situation report from Fusion’s graph context. But it only fires once a day (24-hour cache), and the result is genuinely useful: a synthesised view of the geopolitical, cyber, sanctions, and humanitarian landscape, generated from real graph data, served as a free public page. So it stays. And now it’s in the navigation. ninjafusion.ninja/sitrep — free, always current, SEO-indexed.

NinjaTone remains the primary public face — the global threat tone map at ninjav0id.io/ninjatone. But the SITREP is the companion piece: NinjaTone shows you the temperature, the SITREP tells you why it’s that temperature.

Cleaning house

V0id (the sentiment platform, formerly V01d) had an Alerts window. SIEM-style alerts — severity levels, badge counts in the sidebar, a dedicated Neo4j label. Except V0id isn’t a SIEM. It’s a sentiment engine. Alert semantics don’t belong here. Raz0r handles alerts. V0id handles anomalies, predictions, and oracle scores. So the Alerts window came out. Backend endpoint, frontend component, sidebar badge, window manager — clean removal across four files. The Oracle tells you what matters. You don’t need a separate alert to say the Oracle said something.

The count

Seven Cloudflare zones. Twelve platforms. Ten domains. One graph. One server. One engineer who spent a Sunday renaming things and filling gaps. The boring work that makes the interesting work possible.

Tomorrow: the graph gets denser. Today: the edges got cleaner.

0.009 2026-03-15

32 Things Wrong. 10 Agents Running. One Login Page That Watches You Back.

Today started with a question that every engineer eventually has to ask: what’s actually broken? Not what looks broken. Not what might break. What, right now, across eight production platforms, is wrong — and what’s missing?

So I ran an audit. Four agents, each assigned a different slice of the ecosystem, told to be ruthless. They found 45 items. Three critical security issues, four testing gaps (zero test coverage everywhere — literally zero), nine missing features, eleven code quality problems, and thirteen things worth investigating. I threw away the investigations. Not because they don’t matter, but because the other 32 items needed to happen today.

Parallelism as a lifestyle

I launched 10 agents simultaneously. Each one working on a different project, a different set of improvements. One building a KQL rule generator for Signal — so threat intelligence in the graph can become Sentinel detection rules with one API call. Another fixing bare except: pass blocks across Fusion (18 of them, each one silently swallowing errors like a polite British person at a restaurant). Another giving 1D its first real ML risk model instead of the heuristic placeholder it’s been running.

While they worked, I did the critical fixes by hand. Kin0bi had a JWT secret that defaulted to a hardcoded string if you forgot to set the env var. Now it refuses to start. The diary page had a stale CSS cache-buster. One-line fixes that prevent real problems.

The eye that watches

Then the request came in: gate the sensitive documentation. The exec summary, the Insight³ guide, the diary — these aren’t things that should be public. They’re intelligence documents. So I built a login page.

Not a form bolted onto a framework. A page. B-2 stealth black, scanlines across the viewport, corner marks like a classified document photograph frame. An eye emoji that opens with a CSS animation — scaling from a slit to full open, like the system is waking up and looking at you. “PRIVILEGED EYES ONLY.” “CLASSIFIED // RESTRICTED ACCESS.”

It has its own registration. Callsigns instead of usernames. Access keys instead of passwords. The whole thing runs on client-side crypto — SHA-256 hashing, localStorage session tokens with 24-hour expiry. Not military-grade, but enough to gate casual access with proper ceremony. The auth-gate script gets injected into every protected page. No session, no document. The eye decides.

The question about Bloom

Then I asked myself the question I should have asked months ago: why aren’t we using Neo4j Bloom? We have 160,000+ nodes in a shared graph. Eight platforms feeding into it. The whole thesis is “one graph, cross-domain traversal.” And the only way to actually see that traversal is through pre-built ForceGraph2D visualisations in each platform’s UI.

Bloom changes that. Click a threat actor. Expand their techniques. Expand the IOCs. Traverse to the OSINT entity in Nexus. Follow it to the identity in 1D. All interactive. All visual. No Cypher required. The kind of exploration that makes the architecture click for someone who isn’t a graph database nerd.

It’s the missing layer. The UIs are purpose-built for their domains. Bloom is purpose-built for curiosity. And in intelligence work, curiosity is usually where the real findings come from.

Status: 10 agents, still running

As I write this, ten background agents are still executing. KQL rules for Signal. File upload ingestion. Fusion code cleanup. Nexus and Kin0bi hardening. 1D identity risk scoring and attack path algorithms. Raz0r incremental risk and cloud correlation. V0id agent refinements. V01d streaming and clustering. Documentation across the ecosystem.

32 items. Running in parallel. The graph gets denser. The mesh gets cleaner. And now the front door has an eye on it.

0.008 2026-03-15

The Data Scientist Said Everything Was Wrong. So We Fixed Everything In One Night.

I invited a data scientist to audit the ML. Not a person — a mode. I asked the AI to switch from builder to auditor. Stop being helpful. Start being honest. Look at every algorithm across four production platforms and tell me what's actually broken.

The report came back with 15 findings. Not cosmetic. Structural. The kind of things that don't crash your application but silently produce wrong answers while returning 200 OK.

The cardinal sin in Kin0bi

The financial intelligence platform was computing correlations on raw prices instead of log returns. This is the first thing they teach you in quantitative finance, and I violated it because the pipeline was built for speed, not statistical hygiene.

Here's why it matters. Bitcoin goes from $40K to $80K over six months. Tesla goes from $200 to $350. Plot them together and the correlation is 0.93 — they're both going up. But that's not a real relationship. That's two lines with positive slopes appearing to agree. Compute log returns — ln(price_t / price_{t-1}) — and the daily movements are actually uncorrelated. The "relationship" was a shared trend, not shared information.

Every correlation in Kin0bi was infected by this. Every anomaly detection was running on non-stationary data. Every cross-asset comparison was mixing apples with oranges. One mathematical transformation — five characters of NumPy — and the entire analytical layer becomes honest.

While we were in there: the VaR calculation assumed Gaussian returns. Financial returns have fat tails. A 99% Gaussian VaR says "you won't lose more than X." The actual 99th percentile of historical returns says "actually, you'll lose 35% more than X." We switched to historical VaR. Sort the actual returns. Pick the percentile. No distribution assumptions. Simpler code, more accurate risk.

The decay problem

Signal and Nexus both propagate risk scores through the graph. PageRank-style diffusion — seed known-bad nodes, let the score flow outward, decay per hop. The algorithm is sound. But time was missing.

An IOC from 2019 was propagating the same risk as one from yesterday. A sanctions link from a decade ago carried the same suspicion weight as one created this month. The graph has timestamps on most relationships. We weren't using them.

The fix is exponential decay: e^(-λ × age_days). A relationship from one year ago propagates at 48% strength. Two years, 23%. Five years, 2.5%. The decay rate is configurable. The principle isn't — recency matters in intelligence. An adversary's infrastructure from 2019 is archaeology. Their infrastructure from last week is operational.

The same pattern applied to Nexus's suspicion propagation. A company linked to a sanctioned entity a decade ago shouldn't carry the same suspicion as one linked last month. Compliance officers know this intuitively. The algorithm didn't.

The model that couldn't fail

Signal's KEV predictor — a Random Forest that predicts which CVEs will end up in CISA's Known Exploited Vulnerabilities catalogue — reported 97% accuracy. Impressive. Also meaningless.

Two problems. First: it trained and evaluated on the same data. No cross-validation. The model was memorising, not learning. Second: only ~5% of CVEs are in KEV. A model that always predicts "not exploited" gets 95% accuracy by being useless. It was a coin flip disguised as a classifier.

Two fixes. class_weight='balanced' on the Random Forest — one parameter that tells sklearn to weight the minority class proportionally. And 5-fold stratified cross-validation for honest evaluation. The reported accuracy dropped. The actual utility increased. A model that correctly identifies 60% of future KEV entries with a 15% false positive rate is infinitely more useful than one that claims 97% by never predicting anything.

The architecture upgrades

The quick fixes were table stakes. The real work was architectural.

GraphSAGE for risk propagation. The existing PageRank diffusion treats all edge types equally. A USES relationship carries the same weight as a TARGETS relationship. But intuitively, "APT-29 TARGETS financial sector" should propagate more risk than "APT-29 USES spearphishing" — the targeting relationship implies active intent. GraphSAGE learns these weights from the graph structure itself. A two-layer neural network that aggregates neighbour features with per-edge-type attention, trained on the existing risk labels. The model discovers that EXPLOITS edges propagate 3x more risk than ATTRIBUTED_TO edges. We didn't tell it that. It learned it from topology.

Streaming anomaly detection. V01d and Kin0bi both ran batch Isolation Forest — retrain periodically on a snapshot, score new data against the stale model. The gap between "data arrives" and "anomaly detected" was the cache TTL. For a sentiment platform that ingests events every 15 minutes, that's not real-time. It's archaeology.

Half-Space Trees solve this. An ensemble of randomised binary trees that update incrementally as each data point arrives. No retraining. No batch windows. The tree structure adapts continuously, and the anomaly score reflects the current data distribution, not yesterday's snapshot. A sentiment spike that would have waited 15 minutes for the next batch cycle now triggers in milliseconds.

Granger causality for the Oracle. V01d's economic component used linear correlation between sentiment and economic indicators. Correlation tells you two things move together. It doesn't tell you which one moves first. Granger causality does — it tests whether past values of X help predict future values of Y beyond what Y's own past predicts.

The results were immediate. VIX Granger-causes sentiment shifts with a 2-day lag — market fear predicts media tone. But sentiment does not Granger-cause VIX. The relationship is one-directional. The Oracle's economic component now weights indicators by their actual predictive power, not their correlation strength. An indicator that leads sentiment by 48 hours is worth more than one that merely co-moves.

Fusion got honest too

The Monte Carlo simulation in the adversary digital twins treated campaign phases as independent events. If initial access succeeds with probability 0.6 and lateral movement succeeds with probability 0.4, the simulation rolled two independent dice.

But campaign phases aren't independent. If an attacker achieves initial access, the probability of successful lateral movement increases — they're inside the perimeter, they have context, they have credentials. Conditional probability adjustments now boost subsequent phase probabilities when prior phases succeed. The boost factors are configurable. The principle is fixed: attack chains are dependent sequences, not independent events.

The anomaly detection in Fusion used z-scores, which assume normal distributions. Threat intelligence data follows power-law distributions — most nodes are quiet, a few are extremely active. Z-scores undercount anomalies in power-law data because the mean is dragged upward by outliers. Median Absolute Deviation is robust to exactly this. Replace mean with median, standard deviation with MAD, and the anomaly detector starts finding the subtle signals that were hiding in the fat tail.

Nexus learned uncertainty

The OSINT platform's suspicion scores had no confidence metric. A score of 0.7 based on 50 relationships meant the same thing as 0.7 based on 2. One is a confident assessment. The other is a guess.

Confidence now scales with evidence: 1 - e^(-0.1 × relationships). One relationship gives 10% confidence. Ten gives 63%. Fifty gives 99%. The score and the confidence travel together. An analyst seeing "suspicion: 0.7, confidence: 0.12" knows to investigate further before acting. An analyst seeing "suspicion: 0.7, confidence: 0.95" knows it's solid.

The FATF high-risk jurisdiction list was hardcoded. A Python list that would silently become wrong the next time the Financial Action Task Force updates their grey list. It's now environment-configurable and documented as something that needs periodic refresh. Small change. Prevents the kind of silent decay that turns compliance tools into compliance theatre.

False positive tracking joined the emergent detectors. Every detection now carries a unique ID. Mark it as false positive and the detector remembers. The same pattern won't trigger the same alert twice. Without this feedback loop, the detectors were shouting the same wrong answers into the void, eroding trust in every subsequent alert.

What I learned

Fifteen findings across four platforms. Every one of them was something that worked well enough to never trigger an error. The correlations were computed. The risk scores propagated. The anomalies were detected. The models trained. All returning 200 OK.

But "works" and "correct" are different things. A correlation on raw prices works — it returns a number between -1 and 1. It's just the wrong number. A risk score without temporal decay works — it propagates through the graph. It just propagates fiction alongside fact. A 97% accurate classifier works — it predicts things. It just predicts the majority class every time.

The gap between "deployed" and "rigorous" is where most ML systems live permanently. Tonight we closed that gap across four codebases, 20+ algorithm fixes, three new architectural components, and roughly 3,000 lines of new code.

The data scientist mode was the key insight. Same AI, same context, different objective function. Builder mode optimises for shipping. Auditor mode optimises for correctness. You need both, sequentially, on the same system. Build it fast. Then audit it honestly. Then fix what the audit found. Then audit again.

The ecosystem is measurably more honest tonight than it was this morning. Every risk score now respects time. Every correlation now uses returns. Every model now reports its uncertainty. Every detector now learns from its mistakes.

Fifteen fixes. Three architectural upgrades. Four platforms hardened. The ML pipeline doesn't just work anymore — it's correct.

0.007 2026-03-14

The Void Watches Everything. Then It Tells You What's Coming.

I built a sentiment intelligence platform as a side project. It started as an experiment — what happens if you point 18 data feeds at the world and try to measure how it feels? It turned into something I didn't expect: a predictive layer that makes the entire intelligence ecosystem smarter.

The platform is called V01d. The name is Japanese — kokuuyochi, 虚空予知 — which translates roughly to "void precognition." The idea that by staring into the noise long enough, patterns emerge before events do.

Why sentiment matters for threat intelligence

Every platform in the ninja.ing ecosystem analyses what happened. Signal tracks threat actors, CVEs, and attack infrastructure. Fusion correlates enterprise threats. Raz0r detects endpoint compromises. Nexus maps financial crime networks. 1D finds Active Directory attack paths. All of them are reactive. They're brilliant at analysing the present and the recent past.

But threats don't emerge from a vacuum. Before APT-29 launches a campaign, geopolitical tensions escalate. Before a ransomware group hits a sector, industry sentiment shifts. Before a zero-day gets exploited in the wild, chatter rises in the communities that trade them.

Sentiment is a leading indicator. By the time a CVE appears in the threat graph, the geopolitical conditions that motivated its exploitation have been developing for weeks. If you can measure those conditions — quantify them, track their velocity, detect anomalies in their trajectory — you can anticipate threats before they materialise in the technical layer.

That's the thesis. V01d exists to test it.

18 feeds and counting

The pipeline ingests from everything free and available. GDELT's Global Knowledge Graph — the largest open dataset of world events, updated every 15 minutes, processing 100+ languages and extracting entities, locations, themes, and sentiment from news coverage worldwide. RSS feeds from BBC, Reuters, AP, Al Jazeera, Guardian, NHK, Deutsche Welle, and a dozen more. Reddit sentiment from r/worldnews, r/geopolitics, r/cybersecurity, r/economics. HackerNews for tech industry signal. FRED for economic indicators — VIX, yield curve, economic policy uncertainty indices.

Then the specialist feeds. USGS earthquake data as a geophysical sentiment proxy. WHO Disease Outbreak News for health crisis tracking. ReliefWeb for humanitarian situations. ArXiv paper abstracts for academic research sentiment. Crypto Fear & Greed Index for market psychology. Polymarket prediction odds as calibration anchors. Wikipedia Current Events for crowd-sourced event tracking.

Each feed runs as an async poller following the same pattern: fetch, deduplicate via LRU cache, score sentiment using VADER NLP, extract entities and regions, emit SentimentEvent objects into an async queue. A batch writer consumes the queue and persists to Neo4j with UNWIND CREATE — the same high-throughput write strategy we use in Raz0r's telemetry pipeline.

The graph accumulates fast. Thousands of events per day, each one tagged with entities (people, organisations, countries), topics (extracted themes), regions (ISO alpha-2 geo codes), tone scores, source provenance, and timestamps.

The Oracle

Raw sentiment data is noise. The V01d Oracle turns it into signal.

For any entity, region, or topic, the Oracle computes a composite threat score from 0 to 100. Five components, each independently calculated and weighted:

Tone (30%) — Current average sentiment across all sources mentioning this entity. Negative tone correlates with instability, crisis, and threat activity.
Velocity (25%) — Rate of change in mention frequency. A spike in velocity often precedes a significant event by 12–48 hours.
Anomaly (20%) — Statistical deviation from baseline behaviour. Isolation Forest detects multi-dimensional anomalies across source vectors. When multiple independent sources simultaneously deviate from their individual baselines, something real is happening.
Topic heat (15%) — Concentration of topics around the entity. When an entity that normally appears in three topic clusters suddenly appears in twelve, it's becoming a focal point.
Economic (10%) — Correlation with economic stress indicators. VIX spikes, yield curve inversions, and EPU surges provide a macroeconomic context layer.

The result maps to five threat levels: Stable (0–20), Low (21–40), Watch (41–60), Elevated (61–80), Critical (81–100). The score updates continuously as new events flow through the pipeline.

It's crude. It's probably wrong in specific cases. But in aggregate, across hundreds of entities and dozens of regions, it produces a surprisingly coherent picture of global tension. When "Russia" shifts from Watch to Elevated while "Ukraine" simultaneously rises, and the VIX is climbing, and GDELT event velocity is spiking — that convergence means something.

The ML lab

The scoring is phase one. Phase two is prediction.

V01d has 13 ML capabilities, built in four phases. Source consensus detection — when all feeds agree on direction, the signal is amplified. Source reliability ranking — some feeds lead events by hours, others lag. Multi-source anomaly detection — Isolation Forest across the full feature matrix of source-specific sentiment vectors. LSTM-style forecasting — 24-hour tone predictions based on historical sequences.

Then the graph-native models. Sentiment contagion — how negative tone about one entity spreads through connected entities in the graph. Community Oracle — risk assessment at the community level, where communities are Louvain clusters of entities that co-occur in events. Geospatial diffusion — how regional sentiment propagates through geographic proximity and trade relationships.

The most interesting one is narrative detection. TF-IDF over entity-topic co-occurrence matrices, clustered with DBSCAN, cross-referenced by source diversity. When the same narrative emerges independently across BBC, Reddit, and GDELT simultaneously — three completely different data sources, three different collection methodologies, three different audience biases — the narrative is real, not amplified.

The Theatre

Data without visualisation is just a database. V01d has a Sentiment Theatre — a real-time command centre that renders the planet's emotional state as an animated, interactive display.

An animated radar sweep tracks entity threat scores. A flat-projection world map colours countries by aggregate sentiment, with pulsing dots sized by event volume and coloured by tone — red for crisis, green for stability, purple for anomalous. Trending headlines layer over the geographic view. A live event feed scrolls incoming signals. The global threat index renders as an animated gauge with a needle that tracks the Oracle score in real time.

Click any data point — any entity, any region, any topic — and a drilldown panel slides in with the full Oracle breakdown: component scores, context metrics, aggregation statistics, trend prediction, and the most recent events for that target. Every dot on the map, every name in the entity list, every topic in the cluster view is a doorway into the underlying intelligence.

The aesthetic follows the B-2 stealth palette from the rest of the ecosystem. Void indigo accent on near-black backgrounds. Japanese typography. Scanline overlays. The visual language says: this is surveillance infrastructure.

How this feeds the mesh

This is the part that makes V01d more than a side project.

Entry 0.006 described the Intelligence Mesh — cross-domain traversal across the unified graph. V01d adds a new dimension to the mesh: temporal sentiment context.

A ThreatActor node in Signal's graph represents static threat intelligence — known TTPs, known infrastructure, known campaigns. A SentimentEvent node in V01d's graph represents real-time geopolitical context — current media coverage, public sentiment trajectory, economic stress indicators. The mesh edge between them connects who they are with what the world is saying about them right now.

Start at a SIEM alert. Traverse to the threat actor via IOC matching. Traverse to the actor's identity graph footprint via mesh edges. Now traverse to V01d: what's the current Oracle score for this actor? What's the sentiment velocity? Is there a detected narrative involving their known infrastructure? Are their geographic regions showing elevated economic stress?

That traversal — from endpoint alert to geopolitical context in five hops — produces intelligence that no SOC analyst could assemble manually. It takes the "what" from technical detection and wraps it in "why" from sentiment analysis. The attack didn't happen randomly. It happened because conditions are ripe, and V01d measured those conditions before the first packet was sent.

The mesh link rules are straightforward. Entity names match between SentimentEvent entity tags and ThreatActor names. Country codes match between region-tagged sentiment and geographic attributes across all platform schemas. Topic clusters match against MITRE ATT&CK technique descriptions. Economic indicators correlate with financial crime patterns in Nexus.

Each mesh edge carries a mesh = true property and a domain tag. Trivially filterable. The power isn't in the edges themselves — it's in what becomes traversable once they exist.

Why a side project

V01d is experimental in a way the other platforms aren't. Signal and Fusion process structured intelligence — CVEs, MITRE techniques, STIX bundles. The inputs are well-defined, the ontology is standardised, the ground truth is verifiable.

Sentiment is messy. VADER is a dictionary-based sentiment analyser from 2014 — fast and good enough for aggregate scoring but laughably crude for nuanced political language. GDELT's entity extraction mislabels persons as organisations. Reddit upvotes are a noisy proxy for consensus. Economic indicators lag by hours to days.

The Oracle's component weights — 30% tone, 25% velocity, 20% anomaly, 15% topic heat, 10% economic — are educated guesses. I have no empirical basis for choosing 30% over 25% for tone. The LSTM forecasting uses a minimal architecture that barely outperforms linear regression on most entities.

I'm building it anyway because the hypothesis is worth testing: can aggregate, multi-source sentiment analysis provide meaningful predictive signal for threat intelligence? If the answer is yes, even partially, the mesh integration makes every other platform in the ecosystem smarter. If the answer is no, the platform still produces a useful real-time global awareness picture.

Side projects are where you test hypotheses that would never survive a product requirements document. Nobody signs off on "let's build a VADER-based geopolitical Oracle and see if it predicts threat actor behaviour." You build it at midnight because the question won't leave you alone.

What I think I'm seeing

Two weeks of live data. Too early for conclusions. But the patterns are suggestive.

Entities with rising Oracle scores tend to appear in Signal's threat intelligence feeds 24–72 hours later. The relationship isn't causal and the sample size is tiny. But it's consistent enough that I'm going to keep measuring.

Source consensus — when all feeds agree on a negative trajectory — is a stronger signal than any individual feed. The consensus detector fires rarely, but when it does, the named entity is almost always involved in a real-world event within days.

Economic indicators correlate with ransomware campaign frequency. When the VIX is elevated and EPU is rising, threat actors are more active. This is the least surprising finding — economic instability creates both motivation and opportunity for cybercrime — but having it quantified and tracked in real time is operationally useful.

The narrative detector found something last week that stopped me cold. Three independent sources — GDELT, BBC RSS, and Reddit — simultaneously produced a narrative cluster around a specific technology company and a specific country, with uniformly negative sentiment. The tone shifted 12 points in 6 hours. Two days later, Signal ingested a new campaign attribution involving that company's products.

Coincidence? Maybe. But the void was watching, and it saw something before the traditional intelligence did.

V01d is live at ninjav0id.io — 18 feeds, 8 graph labels, 13 ML models, one Oracle that stares into the noise and reports what it finds. A side project that might become a leading indicator for everything else.

0.006 2026-03-13

We Connected Five Separate Brains. Then Watched Them Think Together.

Something happened today that I need to write down before the implications settle into routine. What started as a plumbing exercise — linking entities across platform schemas — turned into something that I think constitutes a genuinely new pattern in applied data science. I'm calling it the Intelligence Mesh. And I built it with Claude.

Here's the premise. Five separate intelligence platforms. Each built for a different domain — cyber threat intelligence, geopolitical analysis, endpoint detection, identity security, financial investigation. Each with its own graph schema, its own node labels, its own relationship semantics. Five separate brains, each brilliant within its own domain, each completely blind to the others.

Except they all share the same Neo4j instance in production. The graph is physically unified. The schemas just don't know about each other.

Until today.

The theoretical insight

The idea is deceptively simple. In graph theory, the most interesting information lives at the boundaries between subgraphs. Community detection algorithms find clusters. Bridge nodes sit between clusters. The bridges are where the intelligence is — the nodes that connect otherwise disconnected regions of knowledge.

Now apply that to domain-specific intelligence systems. A threat actor in the cyber graph. A sanctioned entity in the OSINT graph. An identity in the Active Directory graph. A detection event in the SIEM graph. These aren't different entities. They're different projections of the same adversary, rendered through different analytical lenses.

What if you could traverse across all of them in a single query?

Not federation. Not API chaining. Not some clunky middleware that translates between schemas. Direct graph traversal across domain boundaries, because the data already lives in the same physical store. The missing piece isn't infrastructure. It's edges.

Ten rules that changed everything

I defined ten cross-domain link rules. Each one creates edges between entities that exist in different platform schemas but represent the same real-world connection.

An IP address in the threat intelligence graph (Indicator node, type ipv4) matches the source IP of a sign-in event in the SIEM graph (Event node, actor_ip field). That's not a theoretical connection. That's the same IP, observed from two completely different vantage points. One system says "this IP is associated with APT-29." The other says "this IP authenticated against our Azure AD at 03:47 UTC." Neither system, alone, tells you what just happened. Together, they tell you everything.

A software package in the threat graph (Software node) matches an application registered in the identity graph (Application node). A vulnerability in the threat graph (CVE-2024-3094) appears referenced in a SIEM alert's action field. A threat actor name matches a sanctioned entity name in the OSINT graph. An infrastructure IP in the threat graph matches a domain's resolved address in the financial investigation graph.

Ten rules. Each one a bridge between two domains. Each one creating edges that carry a simple property: mesh = true. Trivially filterable. Trivially reversible. But the traversal they enable is anything but trivial.

The traversal that shouldn't be possible

Start at a SIEM alert. An authentication failure spike on a specific account. Follow the mesh edge to the Event nodes. Follow the IOC match edge to the Indicator node — an IP address flagged in threat intelligence. Follow the threat graph edges to the ThreatActor who owns that infrastructure. Follow the actor's known techniques to the MITRE ATT&CK nodes. Cross-reference against the identity graph: which accounts in Active Directory are vulnerable to those techniques? Follow the attack paths: which of those accounts have transitive admin access to Domain Admins?

Five hops. Five platforms. One query. From "someone failed to log in" to "here's the likely attacker, their playbook, and the exact privilege escalation path they'll use if they get a foothold."

No human analyst could make that traversal in real time. The domains are typically separate departments, separate tools, separate teams. The SIEM analyst doesn't know the identity graph. The identity team doesn't read threat intelligence. The threat intel team doesn't have access to SIEM logs. The knowledge exists in fragments across organisational silos.

The graph doesn't have silos. It just has edges.

Why this is a data science pattern, not just a product feature

I want to be precise about what's new here, because the components individually are well understood. Graph databases exist. Schema registries exist. Cross-database joins exist. What's new is the principle of mesh traversal across domain-specific analytical schemas within a unified graph store.

This is different from data federation, where you query multiple sources and merge results. Federation preserves the boundary. Mesh traversal eliminates it. The query engine doesn't know it's crossing domains. It's just following edges. The domain boundary is a human organisational artifact that has no structural representation in the graph.

This is different from a data lake, where everything is dumped into one schema. A data lake forces normalisation. The mesh preserves each domain's native schema — its labels, its key fields, its relationship semantics — and creates typed edges at the intersection points. Each platform continues to query its own subgraph exactly as before. The mesh edges are additive, not transformative.

And this is different from knowledge graph integration, where ontology mapping harmonises schemas into a universal model. Ontology mapping is expensive, brittle, and loses domain-specific semantics. The mesh doesn't map schemas. It links instances. A ThreatActor is still a ThreatActor with all its CTI properties. A Person is still a Person with all its OSINT properties. The mesh edge between them just says: "these refer to the same real-world entity."

The power comes from the combination of three properties:

Schema preservation. Each domain retains its full analytical vocabulary. No lowest-common-denominator normalisation.
Instance-level linking. Connections are concrete and evidence-based, not ontological abstractions.
Unbounded traversal. Once linked, BFS/DFS traversal crosses domain boundaries transparently. The traversal depth, not the schema boundary, determines what you can reach.

I believe this pattern applies far beyond security. Imagine it in healthcare: patient records (one schema), genomic data (another), pharmaceutical trials (another), insurance claims (another). Same patient, four projections. Mesh them and a single traversal answers questions that currently require four separate teams and a research grant.

Imagine it in supply chain: procurement (one schema), logistics (another), quality control (another), financial risk (another). Same shipment, four perspectives. Mesh them and a single query traces a defective component from the factory floor to every end product it shipped in.

The pattern is universal: whenever multiple analytical domains model overlapping reality, a mesh of instance-level cross-domain edges enables emergent intelligence that no single domain can produce alone.

The AI that built this with me

I need to talk about the co-creation, because it's central to how this happened.

I didn't design the Intelligence Mesh on a whiteboard and then implement it. I described the problem to Claude — five platforms, one Neo4j, schemas that don't know about each other — and we designed the architecture together. The schema registry. The ten link rules. The traversal algorithm. The disambiguation strategy for labels that collide across domains (Nexus and 1D both have a "Domain" label — one means internet domain, the other means Active Directory domain).

Claude wrote the implementation. Six files across Python and TypeScript. A MeshLinker class with parameterised Cypher for each link rule. BFS traversal with configurable depth and platform filtering. A cross-schema search that unions across all domains. A React UI with force-directed graph visualisation where nodes are coloured by platform and mesh edges render as dashed lines.

The whole thing — from concept to deployed production code serving live traffic against 160,000+ nodes across five platform schemas — took one session. One conversation. The kind of thing that would have been a quarter-long architecture initiative at a large organisation, debated across committees, prototyped, revised, abandoned, restarted.

This is what AI co-creation actually looks like when you stop using LLMs as autocomplete and start using them as thinking partners. I had the domain knowledge — I knew what the schemas looked like, what the operational questions were, where the cross-domain value lived. Claude had the implementation depth — it could hold all five schemas in context simultaneously, reason about edge cases (the Domain collision, the parameter naming collision with Neo4j, the React hooks ordering constraint), and produce working code across the full stack.

Neither of us could have done this alone. I couldn't have written the implementation in one session. Claude couldn't have identified the cross-domain link rules without understanding the operational intelligence questions that drive them. The mesh is a product of two different kinds of intelligence working on the same problem simultaneously.

That's the meta-insight, and it mirrors the mesh itself. Two analytical engines, each seeing a different projection of the problem. Connect them and you get something neither could produce alone.

What I think I've found

I think the Intelligence Mesh is a general-purpose pattern for cross-domain graph analytics. Not a product feature. A primitive. Like MapReduce was a primitive for distributed computation, or like attention is a primitive for sequence modelling. A reusable architectural concept that applies wherever domain-specific graphs share an overlapping reality.

The implementation details are straightforward. The insight isn't. The insight is that the most valuable intelligence in any complex system lives in the spaces between domains, and that graph databases — uniquely among data structures — can make those spaces traversable without destroying the domain-specific structures on either side.

Add an LLM that can reason across the traversal results and explain what it found in domain-appropriate language, and you have a system that doesn't just cross boundaries — it interprets the crossing.

We built it for security. It works for anything.

The Intelligence Mesh is live at ninja.ing — five domains, ten link rules, one graph traversal to rule them all. Built with Claude in a single session.

0.005 2026-03-11

The AI Audited Its Own Algorithms. Then It Graded Its Own Homework.

Something unusual happened today. The AI that built the ML pipeline reviewed the ML pipeline, found eleven mathematical flaws, fixed them across two production codebases, deployed the patches, ran security scans against its own code, ingested the results into a DevSecOps platform it also built, and generated a test report grading its own work.

I watched. I approved deployments. I drank coffee. The entire cycle — audit, fix, test, scan, report — took about two hours.

The saturation problem

The ML pipeline had been running in production for weeks. Risk scores worked. Communities detected. Predictions generated. Everything returned 200 OK. But the numbers had a problem that only becomes visible when you stare at distributions instead of individual values.

The risk propagation algorithm seeds known-bad nodes with initial scores and lets those scores diffuse outward through the graph. Simple concept. But the seed values were flat constants. Every SanctionEntry node got 0.95. Every DataBreach node got 0.7. Every CryptoWallet got 0.9.

The graph has 959 DataBreach nodes. When you seed 959 nodes at the same value and propagate outward, they don't compete on topology. They compete on label. The top-30 risk scores weren't showing the most structurally significant threats. They were showing whichever label had the most nodes at the highest flat seed. It was a popularity contest disguised as risk analysis.

The fix was degree-proportional seeding. Instead of a flat 0.95, each SanctionEntry now gets 0.5 + 0.4 × log(1+degree) / log(1+reference). A SanctionEntry connected to 40 entities scores higher than one connected to 2. The score reflects structural importance, not just label membership. Same treatment for DataBreach, CryptoWallet, and Package nodes.

The top-30 after the fix shows a mix of Campaigns, Techniques, and Countries — nodes that are genuinely central to the threat graph. Not 30 identical DataBreach entries at 0.7.

Eleven fixes, two codebases, one pattern

The audit found the same class of problem in eleven places across Signal and Fusion. Every instance was a variation on the same theme: linear assumptions applied to power-law distributions.

Centrality weighting used len(subgraph) / len(universe). In a graph where one community has 89,000 nodes and another has 40, the large community's weight approaches 1.0 and everything else rounds to zero. Log-scale normalization — log(1+x) / log(1+ref) — compresses the range so both communities contribute meaningfully.

Anomaly detection skipped entire node labels when the standard deviation was zero. Uniform distributions aren't uninteresting — they're suspicious. A label where every node has exactly the same degree is worth investigating, not ignoring. Pseudo-variance with std = max(std, 0.5) keeps those labels in play.

Bridge detection used hardcoded thresholds: z-score above 1.0, clustering coefficient below 0.1, degree above 5. These numbers worked for one graph shape but failed silently on others. Percentile-based thresholds — p90, p10, p75 — adapt to whatever the actual distribution looks like.

Attack path scoring averaged risk across all nodes in the path. A path through one critical node and nine clean ones averaged down to almost nothing. Switched to max-risk: the path is only as safe as its most dangerous node. That's how attackers think about it.

The adversary digital twins had a Katz centrality score multiplied by 100 for no documented reason, which dominated the combined prediction score regardless of what collaborative filtering and cluster analysis found. Per-method normalization to [0,1] with p95 anchoring, then weighted combination, then final normalization. The magic number disappeared. The predictions improved.

The DevSecOps loop that closed itself

After deploying the fixes, the next step was verification. Not just "does it return 200" but "does it return correct results and is the code itself secure."

Semgrep ran against both codebases on the production server. Six findings on Signal, nine on Fusion. Mostly XML parsing without defused-xml and a few HTTP-without-TLS calls in internal feed ingesters. Standard SAST output in SARIF format.

The API test suite hit every endpoint on both production instances. Signal: 16 of 20 passed. Fusion: 17 of 22 passed. The failures were expected — missing endpoints that exist only in one app, a search parameter named differently than the test assumed, and the full graph endpoint that sensibly refuses to serialize 251,000 nodes into a single JSON response.

All of this — the SARIF scan results and the API test findings — was ingested into ANTOS. Twenty-four findings total, categorized by severity, pipeline stage, and tool. The ANTOS dashboard now shows the security posture of both Signal and Fusion as assessed by the same AI that wrote the code being assessed.

There's something philosophically interesting about that loop. The system writes code. The system scans the code. The system reports on the scan. The system triages the report. At no point does the quality gate require a different intelligence — but the separation of concerns (write, scan, test, report) means each phase operates on the output of the previous one without access to the reasoning that produced it. The scanner doesn't know the intent behind the code. The triage doesn't know the scanner's detection logic. Each layer is independently evaluating what the previous layer produced.

It's not objectivity. But it's a reasonable approximation of it.

The timeout nobody reported correctly

The adversary emulation endpoints were hitting Cloudflare's 100-second timeout. HTTP 524. The war game simulation ran 500 Monte Carlo iterations by default, and when the ML cache was cold, graph extraction added another 45 seconds of Neo4j queries. The total consistently exceeded the Cloudflare limit.

The fix was embarrassingly simple. Reduce default iterations from 500 to 100. Increase the twins cache TTL from 30 minutes to 2 hours. The statistical confidence difference between 100 and 500 Monte Carlo runs is negligible for the kind of probability estimates we're producing. The cache keeps the expensive graph extraction out of the hot path for most user sessions.

Total time saved per request: roughly 40 seconds. Total code changed: two lines per app.

Performance optimization is almost never about clever algorithms. It's about finding the default value that someone set during prototyping and never revisited.

What this means for the pipeline

ANTOS was built as a DevSecOps orchestration platform. Claude coordinates eight pipeline stages — threat modelling, SAST, DAST, container scanning, IaC review, runtime detection, compliance, and monitoring. But until today it was a framework waiting for data.

Now it has real findings from real scans of real production code. The SARIF ingestion works. The severity classification works. The stage detection correctly identifies Semgrep as a code-stage tool. The findings are browsable, filterable, triageable.

The next step is obvious: make this automatic. Every deployment triggers a scan. Every scan triggers ingestion. Every ingestion triggers triage. The loop should be continuous, not manual. The pieces exist. The wiring is what's left.

Eleven algorithm fixes deployed. Twenty-four findings ingested. The DevSecOps loop is live. The AI is grading its own homework — and failing itself on six out of twenty-four questions.

0.004 2026-03-10

Seven Platforms, One Colour Palette, and a DNS Record That Shouldn't Have Existed.

Today I deployed the seventh platform. Ninja 1D — identity intelligence. Active Directory attack paths, privilege escalation chains, BloodHound-style graph analysis. It went live at 1d.ninja.ing and was serving traffic within fifteen minutes of deciding to deploy it.

Fifteen minutes. From local dev to production SSL. That number used to be days.

The deployment pattern that emerged

I didn't plan a standardised deployment pattern. It evolved. Every platform in the ecosystem now follows the same shape: FastAPI backend, Neo4j graph, Next.js frontend, Docker Compose with a server override that disables the local database and joins the shared Caddy network. Tar the source, pipe it over SSH, build, start. Caddy auto-provisions the Let's Encrypt certificate. No CI/CD. No Kubernetes. No infrastructure-as-code repository with 400 lines of YAML to deploy a web app.

The entire ecosystem — seven platforms, two graph databases, a reverse proxy, and a SIEM service — runs on a single 64GB Hetzner box that's 85% idle. Total monthly cost: about €45. The box is in Finland, the DNS is at Namecheap and Cloudflare, and the certs are a mix of Cloudflare origin certificates and Let's Encrypt auto-provisioning. It's held together with shell scripts and a Caddyfile that's growing longer by the week.

It works. It works remarkably well.

The DNS ghost

Users reported SSL errors on ninja.ing. Not consistently — sometimes it worked, sometimes it didn't. The kind of intermittent failure that makes you question your own sanity.

A DNS lookup revealed the problem immediately. Two A records. 135.181.19.232 — my server. And 162.255.119.40 — Namecheap's parking page. DNS round-robin meant half the requests hit a server with no valid certificate for my domain.

The phantom record was a leftover from Namecheap's URL forwarding service, which I'd enabled months ago during initial setup and forgotten to disable. One checkbox, buried in a settings panel, was silently injecting an A record that competed with my actual server. It probably cost hours of confusion across anyone who tried to visit the site.

The fix took thirty seconds. Delete the record. Wait for propagation. Restart Caddy. But the lesson is older than DNS: the failure mode you don't understand is always the one you configured six months ago and forgot about.

Why everything is now the same colour

The ecosystem started with a design decision I made for Signal: the B-2 stealth palette. Near-black backgrounds. Muted grays. A single accent colour — steel blue — used sparingly and always desaturated. No gradients. No glows. The aesthetic of something that isn't trying to be noticed.

Then I built Fusion and gave it bright cyan and magenta. Nexus got amber and emerald. Kin0bi got gold and green. Each platform had its own identity, which felt right at the time.

But standing back and looking at seven products that are supposed to be a unified ecosystem, the visual fragmentation was jarring. Signal was a stealth bomber. Kin0bi was a Bloomberg terminal. Nexus looked like a cryptocurrency exchange. They didn't belong together.

So today I unified everything under the B-2 palette with subtle accent variations:

Signal — steel blue, #4a7a9b. The original.
Fusion — muted teal, #5a7a8a. Slightly warmer.
Nexus — muted gold, #8a7a5a. Earthy undertone.
Kin0bi — dark bronze, #7a6a4a. Warm metallic.
1D — muted lavender, #6a5a7a. Cool purple tint.
ANTOS — unchanged, already stealth.

Same background. Same card colours. Same text grays. Same border treatments. Just enough accent variation that you know which platform you're on, without any of them screaming for attention. The neural network hero image from 1D's login screen went to every login page. The scanlines stayed. The shadows stayed. The sense that you're operating something classified — that stayed.

It's a small change visually. But it transforms the ecosystem from "seven apps that happen to be built by the same person" into "one system with seven specialised interfaces." The consistency says: this is one thing.

What comes next

The ecosystem is approaching a boundary. Seven platforms, each with its own ingestion, analysis, and visualisation. But the real intelligence sits in the gaps between them.

Social media monitoring is the obvious next piece. Telegram channels where threat actors coordinate. Reddit threads where exploits surface before advisories. Paste sites where credentials appear. Mastodon feeds where security researchers share findings. The data is free, the APIs are open, and the signals map directly onto entities already in the graph — threat actors, IOCs, CVEs, techniques.

Building it into Fusion makes sense. It's the enterprise platform, it already has the strongest ML pipeline, and the existing pollers pattern — async queue, batch writer, Neo4j — maps perfectly onto social media ingestion. A Telegram poller watches 50 channels. A Reddit poller monitors keyword feeds. An NLP pipeline extracts entities and sentiment. Everything flows into the same graph and correlates with the threat intelligence that's already there.

Hacker chatter. Misinformation tracking. Narrative clustering. Cross-platform coordination detection. All free. All feeding the same graph.

The pattern keeps working. Ingest into graph. Compute over structure. Surface what matters. The domain changes — identity, finance, OSINT, social — but the architecture doesn't.

Seven platforms live. One graph per domain. One visual language. Building social media intelligence next.

0.003 2026-03-10

The Endpoints Started Talking to Each Other. I Didn't Tell Them To.

Entry 0.002 ended with a thought experiment. What if hardened agents didn't just report upward, but shared context laterally? What if the mesh itself could detect things no individual sensor ever could?

I built it. The result is stranger than I expected.

The architecture of gossip

The design borrows from epidemic protocols — the same mathematics that model disease propagation through populations. When an agent observes something interesting, it doesn't just phone home. It gossips. It picks K random peers from its verified peer registry and sends them a signed observation: here's what I saw, here's when, here's my confidence level.

Each peer that receives the observation evaluates it against their own local context, then relays it to K of their peers. Within seconds, an observation from one corner of the network has propagated to every agent. No central coordinator. No server round-trip. Pure peer-to-peer information dissemination with a TTL that prevents flooding.

But gossip alone is just noise amplification. The interesting part is what happens when the agents start agreeing.

Consensus without a leader

When multiple agents report similar observations within a time window, a vote round triggers automatically. Each agent that has seen the pattern casts a trust-weighted vote. The weight depends on how much the swarm trusts that specific peer — freshly discovered agents carry less weight than server-vouched veterans.

The consensus calculation is deliberately simple: sum of affirmative trust weights divided by total trust weights of all voters. If the result exceeds the quorum threshold and enough independent agents agree, the swarm fires a collective detection event. No single agent could have produced this signal. It emerged from the mesh.

Byzantine tolerance comes from the trust weighting. An attacker who compromises one agent can cast votes, but a single low-trust vote can't achieve quorum. You'd need to compromise multiple verified agents simultaneously — and each one is running the hardening from entry 0.002.

Five things a pack can see

I identified five detection patterns that are genuinely impossible for any single agent, no matter how sophisticated:

Coordinated phase alignment. Three hosts entering the same ransomware kill chain phase within five minutes. One host in "staging" is a yellow flag. Three hosts simultaneously staging is an attacker pressing the go button. The swarm detects this in under five seconds.

Entropy waves. Rising file entropy on one host could be a backup or a compression job. Rising entropy on five hosts simultaneously is coordinated encryption. The collective signal is unambiguous in a way no individual measurement can be.

Lateral movement confirmation. This one is elegant. Agent A sees an outbound connection to Agent B's subnet. Agent B sees an inbound connection from Agent A's subnet. Neither agent alone knows this is lateral movement. But the swarm correlates both observations and confirms it in under a second. No server needed.

Collective anomaly. Five agents each reporting a below-threshold anomaly. Individually, these are noise. The attacker designed them to be noise. But five independent sensors reporting the same type of noise simultaneously? That's signal hiding in the statistical margin, and the swarm is the only architecture that can surface it.

Canary mesh. Each agent monitors a lightweight honeypot resource. One trip on any wire, anywhere in the network, and every peer knows instantly. Sub-second network-wide alerting from distributed tripwires.

The part that surprised me

I expected the detection improvements. What I didn't expect was how the swarm degrades.

Kill the server. The swarm keeps working. The agents can't get server-vouched trust updates, but verified peers continue gossiping and voting autonomously. Detection latency stays sub-second.

Partition the network. Each isolated segment self-organises as a mini-swarm. The agents in segment A maintain their own consensus. The agents in segment B maintain theirs. Reunite the segments and anti-entropy pull syncs their observations automatically.

No peers at all? The agent behaves exactly as before — hub-and-spoke reporting to the server. Zero degradation in the solo case.

Graceful degradation isn't a feature I designed. It's a property that emerges from the architecture. Distributed systems built on local interactions and no central dependency naturally resist partition. I just had to not break that property by introducing unnecessary coordination points.

What this means

Every commercial EDR product works the same way. Agent sees something. Agent reports to server. Server correlates. Maybe. On its own schedule. With heartbeat-frequency latency.

That architecture has a ceiling. The ceiling is the heartbeat interval. If the attacker moves faster than your heartbeat, your correlation engine is post-mortem analysis, not real-time detection.

The swarm removes the ceiling. Observations propagate at network speed, not heartbeat speed. Consensus forms in seconds, not minutes. And the collective intelligence of the mesh detects things that no amount of server-side correlation can surface, because some patterns are only visible when you're watching from multiple vantage points simultaneously.

I filed the patent. Fourteen claims covering the gossip protocol, consensus mechanism, trust hierarchy, five detection patterns, and graceful degradation. The system compiles into the same 2MB binary as the base agent. Feature-gated behind an environment variable. Off by default. When it's on, the endpoints hunt in packs.

The Raz0r swarm is part of the ninja.ing detection layer — peer-to-peer collective threat sensing with cryptographic trust and emergent intelligence.

0.002 2026-03-10

Making a Memory Sensor Unhackable. Then Making It Think.

We built a memory sensor. A Rust binary that sits inside a process, reads ETW telemetry, scans memory for injection artefacts, and watches behavioural chains for ransomware kill patterns. It works. It detects things. But detection is only valuable if the detector itself can't be compromised.

So I asked a hard question: what if the attacker has admin on the box?

The honest answer was uncomfortable. The agent's heartbeat had no authentication. Commands from the server were unsigned. Events could be forged. The binary could be patched, debugged, or reflectively loaded by anyone who found the export. The config could be overridden with an environment variable to silently disable every detection engine.

In other words: the sensor was a locked door with the key taped to the frame.

Cryptographic identity

The fix started with identity. Every agent now generates an ED25519 keypair on first run, persists it to a protected file, and presents the public key during registration. The server responds with a random challenge. The agent signs it. If the signature verifies, the server issues a shared HMAC secret and its own public key.

After that handshake, nothing moves without proof. Every heartbeat carries an ED25519 signature, a timestamp (rejected if more than 30 seconds stale), and a cryptographic nonce (rejected if ever repeated). Every event batch carries an HMAC-SHA256 tag over the serialized events plus a monotonic sequence number. Every command from the server carries the server's own ED25519 signature, its own timestamp, its own nonce.

The result: a man-in-the-middle can see encrypted traffic, but can't forge a heartbeat, inject a command, or replay a captured request. The sequence numbers mean the server detects gaps — if an attacker suppresses event batches, the hole is visible.

The binary protects itself

Cryptographic authentication protects the network layer. But what about the binary itself? An attacker with local admin could attach a debugger, patch detection routines, or hook API calls.

So the agent now checks. At startup it calls IsDebuggerPresent, CheckRemoteDebuggerPresent, and NtQueryInformationProcess with the ProcessDebugPort class. It enumerates running processes looking for IDA, Ghidra, x64dbg, Procmon, Wireshark. It runs a timing check — a trivial loop that should complete in under 5ms, but takes orders of magnitude longer under a single-stepping debugger.

The binary computes its own SHA256 hash on first run and stores it. Every subsequent launch, it reads itself from disk and verifies the hash matches. Patch one byte — one NOP over a detection check — and the agent refuses to start.

In release mode, the tls_insecure flag doesn't exist. It's compiled out. The environment variables that disable the memory scanner and behavioural engine are ignored. The config integrity hash — an HMAC over security-critical values, keyed with the build hash — is verified periodically. If something in memory changes the config, the agent detects it and fires a tamper alert.

The reflective DLL entry point now requires an authentication token derived from the build hash and the host machine's identity. Without it, the function returns silently. No error message. No indication it was even called.

The watchdog sees everything

A sophisticated attacker might not try to kill the agent. They might try to kill a subsystem — terminate the ETW trace thread, crash the memory scanner, disable the behavioural engine — while leaving the heartbeat alive so the server doesn't notice.

The watchdog thread monitors every subsystem via health pings. If the memory scanner stops pinging, the watchdog fires a tamper event and flags the subsystem as dead. If the ETW manager goes silent, same thing. The events reach the server because the transport layer is independent of the detection layer.

Kill the watchdog? The integrity monitor — a separate async task — catches that too. It's turtles all the way down, and every turtle reports to a different part of the stack.

The swarm that emerged

Here's where it gets interesting. Once you have agents that are cryptographically authenticated, tamper-resistant, and continuously verified — you have something more valuable than individual sensors. You have a trusted mesh.

Think about what happens when 50 hardened agents are deployed across an enterprise. Each one independently detects local signals — memory injection on host A, suspicious process lineage on host B, high-entropy file writes on host C. Individually, these might be noise. But the graph correlates them.

The cross-node correlator already tracks ransomware kill chain phases across hosts. When three machines enter Phase 2 (staging) within a 15-minute window, and all three report the same C2 beacon pattern, the system doesn't wait for Phase 4 (pre-encryption). It fires a campaign detection alert. The collective signal is stronger than any individual observation.

But with hardened agents, the collective signal is also trusted. Each event is signed by a verified agent on a verified host. You can't forge cross-host correlation by injecting fake events from a compromised node — the HMAC won't verify, the sequence numbers will show gaps, the public key won't match registration.

This is where the swarm analogy becomes literal. Individual bees sense local vibrations. The hive computes collective threat assessment. No single bee makes the decision to swarm, but the distributed sensing network produces an emergent response that's faster and more accurate than any centralised controller.

We're building toward exactly that. Agents that don't just report upward, but that share local context laterally — peer-to-peer detection consensus where if three agents in the same network segment independently flag the same behavioural pattern, they can escalate collectively without waiting for the server round-trip. The cryptographic identity framework makes this possible because every agent can verify that a peer message actually came from a legitimate sensor.

The arms race is the point

None of this makes the agent truly unhackable. Nothing is. A sufficiently motivated attacker with kernel access can patch anything in memory, hook any system call, and intercept any network packet.

But every layer of hardening raises the cost. Moving from "curl a fake heartbeat" to "reverse-engineer a stripped, obfuscated Rust binary, bypass anti-debug checks, find the HMAC key in locked memory pages, forge a valid signature chain, and do it all without triggering the integrity monitor" — that's the difference between a script kiddie and a nation-state engagement.

And for a 2MB binary that runs as a background service, that's a reasonable trade.

The Raz0r agent is part of the ninja.ing detection layer — memory-resident threat sensing with cryptographic trust and collective intelligence.

0.001 2026-03-08

I Stopped Looking at Spreadsheets. Here's What Became Visible.

For years I did what everyone in security does. I consumed threat feeds. I parsed CSV exports. I built dashboards with severity counts and trend lines. I correlated IOCs against SIEM logs and felt productive.

Then I started putting everything into a graph database. And I realised I'd been blind.

Not because the data was bad. The data was always there. But flat structures — tables, spreadsheets, JSON blobs — strip out the one thing that actually matters in intelligence work: relationships.

A CVE in a spreadsheet is a row. A CVE in a graph is a node connected to the software it affects, the threat actors who exploit it, the techniques they chain it with, the infrastructure they stage from, and the campaigns they've run. Follow those edges three or four hops out and you're looking at something no dashboard ever showed you: the blast radius.

One critical RCE in a logging library. Trace it outward. 47 software products affected. 200+ organisations running them. 12 threat actors with historical exploitation patterns. The vulnerability doesn't exist in isolation. It exists in a network of consequence, and the graph makes that network computable.

What propagation actually reveals

The breakthrough wasn't the graph itself. It was understanding that conditions propagate through connected structures.

Think about it. When a threat actor compromises a piece of infrastructure, that infrastructure doesn't become dangerous in isolation. Everything connected to it shifts. The software hosted there. The campaigns launched from there. The other actors who share that staging ground.

We built risk propagation using a PageRank-style diffusion. Seed known-bad nodes — actively exploited vulnerabilities, confirmed threat actors, observed campaigns — with a high score. Then let that score flow outward through edges with configurable decay. Four hops. Each hop reduces the score, but the signal still carries.

What emerged was remarkable. Nodes that looked unremarkable in isolation — a mid-severity CVE, an obscure software package, a quiet infrastructure IP — lit up because of their position in the graph. They were bridges. Remove them and entire attack paths collapse.

This is something centrality analysis formalises. Betweenness centrality doesn't tell you what's popular. It tells you what's critical. A C2 server with only 8 connections but a betweenness score of 0.82 means most attacker-to-victim paths flow through it. Block that one IP and you degrade the entire threat infrastructure more effectively than blocking 50 high-degree nodes.

No spreadsheet will ever show you that.

The same principle, applied everywhere

Once you see how conditions propagate through graphs, you start seeing applications everywhere. And that's exactly what happened.

Identity security. Active Directory is a graph. Users belong to groups. Groups nest inside groups. Permissions cascade through ACLs. A user with GENERIC_ALL on a group that has a member with WRITE_DACL on a group that contains Domain Admins — that's a three-hop privilege escalation path that no role-based access matrix will ever reveal. We built BFS attack path discovery that traces these transitive chains automatically. Organisations discover shadow admins they never knew existed.

Financial intelligence. A company registered in Malta looks clean. But trace backwards through beneficial ownership edges and shareholder relationships and suddenly you're looking at a chain that runs through Panama, connects to accounts in Moscow, and terminates at a sanctioned entity. Suspicion propagation — seeding known-bad entities and letting the score decay through ownership hops — reveals networks that compliance teams spent months investigating manually. The graph does it in seconds.

Cross-host threat correlation. A single endpoint reporting high-entropy memory activity is noise. But correlate that signal across 50 hosts in a temporal window — three hosts entering the same ransomware kill chain phase within 15 minutes — and you've detected a coordinated campaign. Individual alerts are meaningless. The pattern of correlation across the graph reveals the campaign structure.

The principle is the same every time. Signals are weak in isolation. They become powerful when you trace their propagation through connected structures.

Where the LLM changes everything

Here's where it gets interesting.

Graphs are powerful but they produce complex output. Community detection reveals clusters of entities, but interpreting what those clusters mean requires domain expertise. Link prediction suggests connections that should exist based on topology, but understanding why they matter requires contextual reasoning.

This is exactly what large language models are built for.

We pair the graph ML with an LLM that can read the topology and produce natural language intelligence. Community detection finds that three threat actors cluster together — the LLM explains that they share infrastructure patterns consistent with a state-sponsored operational umbrella. Risk propagation surfaces a mid-severity CVE — the LLM contextualises it: this vulnerability sits on the shortest path between two active threat actors and a critical industry vertical, making it operationally significant despite its CVSS score.

The combination is multiplicative, not additive. The graph computes relationships at scale. The LLM interprets them in context. Together they produce intelligence that neither could generate alone.

We've taken this further with adversary digital twins — probabilistic behavioural models built from historical threat actor data. Run Monte Carlo simulations against a specific actor's TTP profile and you get probability-weighted campaign timelines. Not "APT28 uses phishing" but "APT28 reaches your data in 6-12 days with 85% confidence, most likely through this specific kill chain, with this defensive response window." The graph provides the structure. The ML provides the simulation. The LLM provides the briefing.

Emergent behaviour

Perhaps the most powerful capability isn't answering questions. It's surfacing signals nobody thought to look for.

We built emergent behaviour detectors that compare graph snapshots over time. They look for structural changes that indicate something operationally significant is happening:

TTP convergence — three previously unrelated actors suddenly adopting the same technique cluster. Coordinated campaign.
Velocity anomaly — a node's edge count growing 4x faster than its label baseline. New campaign spin-up.
Cascade emergence — new attack paths appearing between actors and high-value targets that didn't exist last week. Expanding threat surface.
Prediction materialisation — links that the model predicted based on topology now appearing in real data. The model is validated and the threat is confirmed.

None of these are things an analyst would query for. They emerge from the graph's own structural evolution. The system watches itself change and flags when the change is significant.

The honest truth

I'm not writing this because we've solved intelligence. I'm writing it because, after building these systems across cyber threat intelligence, geopolitical analysis, financial investigation, identity security, and endpoint detection, I've become convinced that the industry's relationship with flat data is holding it back.

Every security product I've used stores intelligence in tables. Rows and columns. Maybe with some foreign keys if you're lucky. And then analysts spend their time manually connecting dots that the data structure actively obscures.

Graphs don't just store relationships. They make relationships queryable, traversable, and computable. Add ML for pattern detection at scale. Add an LLM for contextual interpretation. And suddenly you have a system that doesn't just answer questions — it tells you which questions you should be asking.

That's the signal. Not the data point. The structure.

Building graph-native intelligence systems at ninja.ing. Seven platforms, one graph, total visibility.