I built a sentiment intelligence platform as a side project. It started as an experiment — what happens if you point 18 data feeds at the world and try to measure how it feels? It turned into something I didn't expect: a predictive layer that makes the entire intelligence ecosystem smarter.
The platform is called V01d. The name is Japanese — kokuuyochi, 虚空予知 — which translates roughly to "void precognition." The idea is that if you stare into the noise long enough, patterns emerge before events do.
Why sentiment matters for threat intelligence
Every platform in the ninja.ing ecosystem analyses what happened. Signal tracks threat actors, CVEs, and attack infrastructure. Fusion correlates enterprise threats. Raz0r detects endpoint compromises. Nexus maps financial crime networks. 1D finds Active Directory attack paths. All of them are reactive. They're brilliant at analysing the present and the recent past.
But threats don't emerge from a vacuum. Before APT-29 launches a campaign, geopolitical tensions escalate. Before a ransomware group hits a sector, industry sentiment shifts. Before a zero-day gets exploited in the wild, chatter rises in the communities that trade them.
Sentiment is a leading indicator. By the time a CVE appears in the threat graph, the geopolitical conditions that motivated its exploitation have been developing for weeks. If you can measure those conditions — quantify them, track their velocity, detect anomalies in their trajectory — you can anticipate threats before they materialise in the technical layer.
That's the thesis. V01d exists to test it.
18 feeds and counting
The pipeline ingests everything free and available. GDELT's Global Knowledge Graph — the largest open dataset of world events, updated every 15 minutes, processing 100+ languages and extracting entities, locations, themes, and sentiment from news coverage worldwide. RSS feeds from BBC, Reuters, AP, Al Jazeera, Guardian, NHK, Deutsche Welle, and a dozen more. Reddit sentiment from r/worldnews, r/geopolitics, r/cybersecurity, r/economics. HackerNews for tech industry signal. FRED for economic indicators — VIX, yield curve, economic policy uncertainty indices.
Then the specialist feeds. USGS earthquake data as a geophysical sentiment proxy. WHO Disease Outbreak News for health crisis tracking. ReliefWeb for humanitarian situations. ArXiv paper abstracts for academic research sentiment. Crypto Fear & Greed Index for market psychology. Polymarket prediction odds as calibration anchors. Wikipedia Current Events for crowd-sourced event tracking.
Each feed runs as an async poller following the same pattern: fetch, deduplicate via LRU cache, score sentiment using VADER NLP, extract entities and regions, emit SentimentEvent objects into an async queue. A batch writer consumes the queue and persists to Neo4j with UNWIND CREATE — the same high-throughput write strategy we use in Raz0r's telemetry pipeline.
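The poller pattern above can be sketched in a few dozen lines. This is a minimal, self-contained illustration, not the production code: `score_tone` is a tiny keyword lexicon standing in for VADER's compound score, and the batch writer hands its batch to a callback where the real pipeline would issue a Neo4j `UNWIND ... CREATE`. The `SentimentEvent` fields shown are assumptions based on the description in this post.

```python
import asyncio
from collections import OrderedDict
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SentimentEvent:
    source: str
    title: str
    tone: float
    entities: list = field(default_factory=list)
    ts: str = ""

class LRUDedup:
    """Fixed-size LRU cache keyed on item identity (e.g. a URL or title hash)."""
    def __init__(self, maxsize=10_000):
        self.maxsize = maxsize
        self._seen = OrderedDict()

    def seen(self, key):
        if key in self._seen:
            self._seen.move_to_end(key)
            return True
        self._seen[key] = None
        if len(self._seen) > self.maxsize:
            self._seen.popitem(last=False)  # evict least recently used
        return False

def score_tone(text):
    """Stand-in for VADER's compound score: tiny keyword lexicon, range [-1, 1]."""
    lowered = text.lower()
    neg = sum(w in lowered for w in ("crisis", "attack", "war"))
    pos = sum(w in lowered for w in ("stable", "growth", "peace"))
    total = neg + pos
    return 0.0 if total == 0 else (pos - neg) / total

async def poller(source, fetch, queue, dedup):
    """Fetch -> dedupe -> score -> emit: the pattern every feed poller follows."""
    for item in await fetch():
        if dedup.seen(item):
            continue
        queue.put_nowait(SentimentEvent(
            source=source, title=item, tone=score_tone(item),
            ts=datetime.now(timezone.utc).isoformat()))

async def batch_writer(queue, flush):
    """Drain the queue and hand the batch to a persistence callback
    (in production: a single parameterised UNWIND ... CREATE statement)."""
    batch = []
    while not queue.empty():
        batch.append(queue.get_nowait())
    if batch:
        flush(batch)
```

The batch writer is the important half: one round trip per batch rather than one per event is what keeps the write path from becoming the bottleneck.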
The graph accumulates fast. Thousands of events per day, each one tagged with entities (people, organisations, countries), topics (extracted themes), regions (ISO alpha-2 geo codes), tone scores, source provenance, and timestamps.
The Oracle
Raw sentiment data is noise. The V01d Oracle turns it into signal.
For any entity, region, or topic, the Oracle computes a composite threat score from 0 to 100. Five components, each independently calculated and weighted:
Tone (30%) — Current average sentiment across all sources mentioning this entity. Negative tone correlates with instability, crisis, and threat activity.
Velocity (25%) — Rate of change in mention frequency. A spike in velocity often precedes a significant event by 12–48 hours.
Anomaly (20%) — Statistical deviation from baseline behaviour. Isolation Forest detects multi-dimensional anomalies across source vectors. When multiple independent sources simultaneously deviate from their individual baselines, something real is happening.
Topic heat (15%) — Concentration of topics around the entity. When an entity that normally appears in three topic clusters suddenly appears in twelve, it's becoming a focal point.
Economic (10%) — Correlation with economic stress indicators. VIX spikes, yield curve inversions, and EPU surges provide a macroeconomic context layer.
The result maps to five threat levels: Stable (0–20), Low (21–40), Watch (41–60), Elevated (61–80), Critical (81–100). The score updates continuously as new events flow through the pipeline.
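The composite is just a weighted sum over normalised components, bucketed into the five levels. A minimal sketch, assuming each component has already been scaled to 0–100 upstream (the function name and dict shapes are illustrative, not the platform's actual API):

```python
# Component weights as described above; they sum to 1.0.
WEIGHTS = {"tone": 0.30, "velocity": 0.25, "anomaly": 0.20,
           "topic_heat": 0.15, "economic": 0.10}

# (upper bound, label) pairs for the five threat levels.
LEVELS = [(20, "Stable"), (40, "Low"), (60, "Watch"),
          (80, "Elevated"), (100, "Critical")]

def oracle_score(components):
    """Weighted sum of five component scores, each already normalised to 0-100,
    mapped to its threat level."""
    score = sum(WEIGHTS[k] * components[k] for k in WEIGHTS)
    level = next(name for ceiling, name in LEVELS if score <= ceiling)
    return round(score, 1), level
```

For example, an entity scoring 80 tone, 70 velocity, 60 anomaly, 50 topic heat, and 40 economic lands at 65.0 — Elevated.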
It's crude. It's probably wrong in specific cases. But in aggregate, across hundreds of entities and dozens of regions, it produces a surprisingly coherent picture of global tension. When "Russia" shifts from Watch to Elevated while "Ukraine" simultaneously rises, and the VIX is climbing, and GDELT event velocity is spiking — that convergence means something.
The ML lab
The scoring is phase one. Phase two is prediction.
V01d has 13 ML capabilities, built in four phases. Source consensus detection — when all feeds agree on direction, the signal is amplified. Source reliability ranking — some feeds lead events by hours, others lag. Multi-source anomaly detection — Isolation Forest across the full feature matrix of source-specific sentiment vectors. LSTM-style forecasting — 24-hour tone predictions based on historical sequences.
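The consensus detector is the simplest of these to show. A hedged sketch of the idea — all feeds must agree on sign, neutral readings break consensus, and the amplification grows with the number of independently agreeing feeds (the threshold and amplification schedule here are illustrative choices, not the platform's tuned values):

```python
def consensus(feed_tones, threshold=0.05):
    """Return (direction, amplification) for a dict of {feed: tone}.
    direction is +1/-1 when ALL feeds agree on sign, else 0.
    Tones within +/-threshold of zero count as neutral and break consensus."""
    signs = {1 if t > threshold else -1 if t < -threshold else 0
             for t in feed_tones.values()}
    if len(signs) != 1 or 0 in signs:
        return 0, 1.0
    direction = signs.pop()
    # Amplification: +0.1 per agreeing feed beyond the first, capped at 2x.
    amp = min(2.0, 1.0 + 0.1 * (len(feed_tones) - 1))
    return direction, amp
```

The rarity of the signal falls out of the maths: the more feeds you require to agree, the less often they all do — which is exactly why it means something when they do.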
Then the graph-native models. Sentiment contagion — how negative tone about one entity spreads through connected entities in the graph. Community Oracle — risk assessment at the community level, where communities are Louvain clusters of entities that co-occur in events. Geospatial diffusion — how regional sentiment propagates through geographic proximity and trade relationships.
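Sentiment contagion reduces to iterative diffusion over the entity graph. A minimal sketch, assuming an unweighted adjacency dict (the real model runs over Neo4j edges with weights; `damping` here is a hypothetical parameter controlling how strongly a node is pulled toward its neighbours):

```python
def propagate(tone, edges, damping=0.5, steps=3):
    """Sentiment contagion sketch: at each step a node's tone moves toward
    the mean tone of its graph neighbours, scaled by `damping`.
    `edges` is an adjacency dict: {node: [neighbours]}."""
    for _ in range(steps):
        updated = {}
        for node, nbrs in edges.items():
            if not nbrs:
                updated[node] = tone[node]  # isolated nodes are unaffected
                continue
            neighbour_mean = sum(tone[n] for n in nbrs) / len(nbrs)
            updated[node] = (1 - damping) * tone[node] + damping * neighbour_mean
        tone = updated
    return tone
```

After one step, an entity connected to a strongly negative neighbour is already dragged toward it — which is the whole point: tone about one entity leaks into its neighbourhood.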
The most interesting one is narrative detection. TF-IDF over entity-topic co-occurrence matrices, clustered with DBSCAN, cross-referenced by source diversity. When the same narrative emerges independently across BBC, Reddit, and GDELT simultaneously — three completely different data sources, three different collection methodologies, three different audience biases — the narrative is real, not amplified.
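The TF-IDF and DBSCAN stages need their libraries, but the source-diversity cross-reference — the step that separates a real narrative from an amplified one — is trivial to sketch. Assume each cluster is a list of events tagged with a `source_family` field (a hypothetical tag grouping feeds into families like wire RSS, social, and GDELT):

```python
def independent_narratives(clusters, min_sources=3):
    """Keep only narrative clusters corroborated by at least `min_sources`
    distinct source families. `clusters` maps a narrative label to its
    member events; each event is a dict with a 'source_family' tag."""
    real = []
    for label, events in clusters.items():
        families = {e["source_family"] for e in events}
        if len(families) >= min_sources:
            real.append(label)
    return real
```

A narrative that only ever appears in one source family, however loudly, never passes the filter — repetition within a single collection methodology is amplification, not corroboration.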
The Theatre
Data without visualisation is just a database. V01d has a Sentiment Theatre — a real-time command centre that renders the planet's emotional state as an animated, interactive display.
An animated radar sweep tracks entity threat scores. A flat-projection world map colours countries by aggregate sentiment, with pulsing dots sized by event volume and coloured by tone — red for crisis, green for stability, purple for anomalous. Trending headlines layer over the geographic view. A live event feed scrolls incoming signals. The global threat index renders as an animated gauge with a needle that tracks the Oracle score in real time.
Click any data point — any entity, any region, any topic — and a drilldown panel slides in with the full Oracle breakdown: component scores, context metrics, aggregation statistics, trend prediction, and the most recent events for that target. Every dot on the map, every name in the entity list, every topic in the cluster view is a doorway into the underlying intelligence.
The aesthetic follows the B-2 stealth palette from the rest of the ecosystem. Void indigo accent on near-black backgrounds. Japanese typography. Scanline overlays. The visual language says: this is surveillance infrastructure.
How this feeds the mesh
This is the part that makes V01d more than a side project.
Entry 0.006 described the Intelligence Mesh — cross-domain traversal across the unified graph. V01d adds a new dimension to the mesh: temporal sentiment context.
A ThreatActor node in Signal's graph represents static threat intelligence — known TTPs, known infrastructure, known campaigns. A SentimentEvent node in V01d's graph represents real-time geopolitical context — current media coverage, public sentiment trajectory, economic stress indicators. The mesh edge between them connects who they are with what the world is saying about them right now.
Start at a SIEM alert. Traverse to the threat actor via IOC matching. Traverse to the actor's identity graph footprint via mesh edges. Now traverse to V01d: what's the current Oracle score for this actor? What's the sentiment velocity? Is there a detected narrative involving their known infrastructure? Are their geographic regions showing elevated economic stress?
That traversal — from endpoint alert to geopolitical context in five hops — produces intelligence that no SOC analyst could assemble manually. It takes the "what" from technical detection and wraps it in "why" from sentiment analysis. The attack didn't happen randomly. It happened because conditions are ripe, and V01d measured those conditions before the first packet was sent.
The mesh link rules are straightforward. Entity names match between SentimentEvent entity tags and ThreatActor names. Country codes match between region-tagged sentiment and geographic attributes across all platform schemas. Topic clusters match against MITRE ATT&CK technique descriptions. Economic indicators correlate with financial crime patterns in Nexus.
Each mesh edge carries a mesh = true property and a domain tag. Trivially filterable. The power isn't in the edges themselves — it's in what becomes traversable once they exist.
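The link rules are simple enough to sketch. This is an illustrative candidate generator, not the platform's matcher: record shapes and field names are assumptions, and in production each returned triple would become a graph edge carrying mesh = true and a domain tag.

```python
def mesh_links(sentiment_events, threat_actors):
    """Propose mesh edges by exact entity-name and country-code matching,
    in the spirit of the link rules above. Returns (event_id, actor_id, rule)
    triples; the name rule takes precedence over the country rule."""
    links = []
    for ev in sentiment_events:
        ev_entities = {e.lower() for e in ev["entities"]}
        for actor in threat_actors:
            if actor["name"].lower() in ev_entities:
                links.append((ev["id"], actor["id"], "entity-name"))
            elif ev.get("region") and ev["region"] in actor.get("countries", ()):
                links.append((ev["id"], actor["id"], "country-code"))
    return links
```

Exact matching is deliberately conservative: a false mesh edge pollutes every traversal that crosses it, so precision matters more than recall here.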
Why a side project
V01d is experimental in a way the other platforms aren't. Signal and Fusion process structured intelligence — CVEs, MITRE techniques, STIX bundles. The inputs are well-defined, the ontology is standardised, the ground truth is verifiable.
Sentiment is messy. VADER is a dictionary-based sentiment analyser from 2014 — fast and good enough for aggregate scoring but laughably crude for nuanced political language. GDELT's entity extraction mislabels persons as organisations. Reddit upvotes are a noisy proxy for consensus. Economic indicators lag by hours to days.
The Oracle's component weights — 30% tone, 25% velocity, 20% anomaly, 15% topic heat, 10% economic — are educated guesses. I have no empirical basis for choosing 30% over 25% for tone. The LSTM forecasting uses a minimal architecture that barely outperforms linear regression on most entities.
I'm building it anyway because the hypothesis is worth testing: can aggregate, multi-source sentiment analysis provide meaningful predictive signal for threat intelligence? If the answer is yes, even partially, the mesh integration makes every other platform in the ecosystem smarter. If the answer is no, the platform still produces a useful real-time global awareness picture.
Side projects are where you test hypotheses that would never survive a product requirements document. Nobody signs off on "let's build a VADER-based geopolitical Oracle and see if it predicts threat actor behaviour." You build it at midnight because the question won't leave you alone.
What I think I'm seeing
Two weeks of live data. Too early for conclusions. But the patterns are suggestive.
Entities with rising Oracle scores tend to appear in Signal's threat intelligence feeds 24–72 hours later. The relationship isn't causal and the sample size is tiny. But it's consistent enough that I'm going to keep measuring.
Source consensus — when all feeds agree on a negative trajectory — is a stronger signal than any individual feed. The consensus detector fires rarely, but when it does, the named entity is almost always involved in a real-world event within days.
Economic indicators correlate with ransomware campaign frequency. When the VIX is elevated and EPU is rising, threat actors are more active. This is the least surprising finding — economic instability creates both motivation and opportunity for cybercrime — but having it quantified and tracked in real time is operationally useful.
The narrative detector found something last week that stopped me cold. Three independent sources — GDELT, BBC RSS, and Reddit — simultaneously produced a narrative cluster around a specific technology company and a specific country, with uniformly negative sentiment. The tone shifted 12 points in 6 hours. Two days later, Signal ingested a new campaign attribution involving that company's products.
Coincidence? Maybe. But the void was watching, and it saw something before the traditional intelligence did.
V01d is live at ninjav0id.io — 18 feeds, 8 graph labels, 13 ML models, one Oracle that stares into the noise and reports what it finds. A side project that might become a leading indicator for everything else.