How NataPulse Detects Emerging Market Narratives — Without Inventing Them
Inside the deterministic engine that turns S&P 500 anomalies into scored, traceable narratives — and refuses to publish a story it cannot prove.
Every market-intelligence product claims to spot “emerging narratives.” The hard part is not spotting them; it is not inventing them. NataPulse’s Emerging Narratives engine takes the opposite bet from most AI-driven tools: at launch, no language model writes the story, no language model scores it, and a narrative exists only if real market events say it does. On its first production run, the engine read 130 real S&P 500 market events, built 4 candidate narratives, and published exactly 3: the ones that cleared an explicit, published gate. Everything below is research evidence, not trade instructions: NataPulse describes what the market is doing, never what you should do about it.
Real anomalies, curated themes — and why the seed is not the truth
The raw material is unglamorous by design: end-of-day price and volume data for all 503 S&P 500 constituents, ingested daily. Statistical anomalies in that data — unusual moves, unusual volume — become market events. The narrative builder runs once a day, an hour after the market data lands, and asks a narrow question: do today’s real events map onto any of a curated registry of candidate themes?
There are 12 such candidates: AI Infrastructure, Semiconductors, Defense Spending, Nuclear Renaissance, GLP-1/Obesity Drugs, Cybersecurity, Energy Stress, Regional Banks Stress, Data Center Capex, Robotics, Space Economy, and Bitcoin Treasury Companies. Each seed carries member tickers and weighted keywords, matched at word boundaries so “data center capex” never double-counts as “data center.”
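To make the matching rule concrete, here is a minimal sketch of word-boundary keyword matching. The function name, the weights, and the longest-phrase-first strategy are assumptions for illustration, not NataPulse's actual code: it tries longer phrases first and marks their span as consumed, which is one way to keep "data center capex" from also counting as "data center."

```python
import re


def match_keywords(text: str, weighted_keywords: dict[str, float]) -> dict[str, float]:
    """Return {keyword: weight} for each keyword found at word boundaries.

    Assumption: longer phrases win. Once a phrase matches, its character
    span is consumed so a shorter keyword inside it is not counted again.
    """
    lowered = text.lower()
    hits: dict[str, float] = {}
    consumed: list[tuple[int, int]] = []  # (start, end) spans already claimed

    for keyword in sorted(weighted_keywords, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(keyword.lower()) + r"\b")
        for m in pattern.finditer(lowered):
            overlaps = any(m.start() < end and start < m.end() for start, end in consumed)
            if overlaps:
                continue
            consumed.append((m.start(), m.end()))
            hits[keyword] = weighted_keywords[keyword]
    return hits


# Hypothetical weights, for illustration only.
seed_keywords = {"data center": 1.0, "data center capex": 2.0}
print(match_keywords("Hyperscalers raise data center capex guidance", seed_keywords))
```

The word-boundary anchors also stop short keywords from matching inside unrelated words, so a keyword like "ai" does not fire on "chain."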
The governing principle is blunt: seed ≠ truth. A seed is a hypothesis, not a headline. If no real event lands on a theme, no narrative is created — the registry can sit there for weeks producing nothing, and that is correct behavior. A sector match alone is treated as supporting evidence only; it can never conjure a narrative by itself.
The co-movement gate: one stock never makes a story
The most common failure mode of narrative detection is anecdote inflation — one stock has a strange day, and suddenly there is a “story.” NataPulse blocks this with a structural rule: a narrative only emerges when at least two distinct tickers in the theme have events in the window, or when an event carries explicit keyword evidence tying it to the theme. A lone single-ticker anomaly with no textual corroboration cannot spawn a narrative, full stop. Single-source narratives that do publish carry an explicit “unconfirmed by other source types” disclaimer.
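The rule above reduces to a small predicate. The sketch below assumes a simplified, hypothetical `Event` shape with just a ticker and a flag for keyword evidence; the real event model is certainly richer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ticker: str
    keyword_match: bool  # True if the event text matched a theme keyword


def passes_comovement_gate(events: list[Event]) -> bool:
    """A theme may emerge only with 2+ distinct tickers or explicit keyword evidence."""
    distinct_tickers = {e.ticker for e in events}
    has_keyword_evidence = any(e.keyword_match for e in events)
    return len(distinct_tickers) >= 2 or has_keyword_evidence


# One ticker, even with several anomalies, is not enough on its own.
print(passes_comovement_gate([Event("NVDA", False), Event("NVDA", False)]))  # False
print(passes_comovement_gate([Event("NVDA", False), Event("AMD", False)]))   # True
```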
Five scores you can recompute by hand
Every narrative gets five explainable scores, computed by transparent weighted formulas over real event aggregates — not by a model:
- Importance (1–10): event count, average event importance, breadth of tickers, diversity of source types, freshness.
- Confidence (0–1): average event confidence plus corroboration, with an explicit penalty when a single ticker lacks keyword support.
- Momentum (−10 to +10): recent-half versus earlier-half event activity.
- Breadth and Freshness round out the picture.
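As an illustration of what "recompute by hand" could look like, here is a sketch of the momentum score. The article only says momentum compares recent-half with earlier-half activity on a −10 to +10 scale; the normalization used below (difference in event counts divided by the total, times 10) is an assumption, not the published formula.

```python
from datetime import datetime


def momentum(event_times: list[datetime], window_start: datetime, window_end: datetime) -> float:
    """Compare event counts in the recent half of the window with the earlier half.

    Assumed normalization: 10 * (recent - earlier) / (recent + earlier),
    which stays within [-10, +10] and is 0 when the window is empty.
    """
    midpoint = window_start + (window_end - window_start) / 2
    in_window = [t for t in event_times if window_start <= t <= window_end]
    recent = sum(1 for t in in_window if t >= midpoint)
    earlier = len(in_window) - recent
    total = recent + earlier
    if total == 0:
        return 0.0
    return 10.0 * (recent - earlier) / total


start, end = datetime(2024, 1, 1), datetime(2024, 1, 15)
times = [datetime(2024, 1, 2), datetime(2024, 1, 9), datetime(2024, 1, 10), datetime(2024, 1, 12)]
print(momentum(times, start, end))  # 3 recent vs 1 earlier -> 5.0
```

Whatever the real weights are, the property that matters is the same: the function has no hidden state, so identical events always yield an identical score.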
Because the formulas are deterministic, the same inputs always produce the same scores. There is nothing to prompt, nothing to hallucinate, and — a fact worth stating plainly — the v1 engine costs $0 in AI spend. Semantic clustering and AI-written summaries exist behind feature flags, shipped off; descriptions are deterministic template copy built from the actual counts.
Publication is a hard gate, applied server-side every time the feed is read: importance ≥ 5.5 AND confidence ≥ 0.55 AND (at least 2 tickers OR at least 2 events). A narrative below the gate is not demoted or footnoted; it is invisible. Even a direct link to it returns a 404. Weak stories do not get a smaller font; they get nothing.
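Expressed as code, the gate is a single boolean check. This sketch uses the thresholds stated above and reads the final clause as "2+ tickers or 2+ events" grouped together; the function and parameter names are hypothetical.

```python
MIN_IMPORTANCE = 5.5
MIN_CONFIDENCE = 0.55


def is_publishable(importance: float, confidence: float, ticker_count: int, event_count: int) -> bool:
    """Return True only if a narrative clears every part of the publication gate."""
    has_enough_evidence = ticker_count >= 2 or event_count >= 2
    return (
        importance >= MIN_IMPORTANCE
        and confidence >= MIN_CONFIDENCE
        and has_enough_evidence
    )


print(is_publishable(6.2, 0.70, ticker_count=3, event_count=5))  # True
print(is_publishable(6.2, 0.50, ticker_count=3, event_count=5))  # False: confidence too low
```

Evaluating this check at read time, rather than storing a "published" flag, means a narrative whose scores later slip below the thresholds disappears from the feed without any separate cleanup step.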
A lifecycle that admits when a story is over
Narratives are not evergreen, and the product says so. Each one moves through an honest lifecycle: emerging (first seen within 7 days, accelerating), active, fading (no events for 21+ days, or negative momentum), and archived (45+ days quiet). A fading narrative literally tells the reader to “treat it as a story that is losing steam, not a fresh catalyst” — the opposite of the engagement-maximizing instinct to keep every story alive.
Status changes stream in real time, including fades and archivals — subscribers see stories leaving the feed, not just entering it.
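The lifecycle rules above can be sketched as a small classifier. The day thresholds come from the text; the order in which the rules are checked, and reading "accelerating" as positive momentum, are assumptions made for this example.

```python
def lifecycle_status(days_since_first_seen: int, days_since_last_event: int, momentum: float) -> str:
    """Classify a narrative as emerging, active, fading, or archived.

    Assumed precedence: archived, then fading, then emerging, else active.
    """
    if days_since_last_event >= 45:
        return "archived"
    if days_since_last_event >= 21 or momentum < 0:
        return "fading"
    if days_since_first_seen <= 7 and momentum > 0:
        return "emerging"
    return "active"


print(lifecycle_status(days_since_first_seen=3, days_since_last_event=1, momentum=2.0))    # emerging
print(lifecycle_status(days_since_first_seen=30, days_since_last_event=25, momentum=1.0))  # fading
```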
Traceable to the last event
Every published narrative decomposes completely: the exact events behind it (each tagged with its evidence type — ticker match or keyword match — and a relevance weight), the tickers involved with per-asset event counts, and the cited sources with URLs and timestamps. Daily score snapshots power the trend charts, so a narrative’s history is inspectable, not asserted. If you disagree with a narrative, you can check its work — which is the point.
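One way to picture that decomposition is as a list of evidence records from which the per-ticker counts are derived rather than stored. The field names below are hypothetical and chosen only to mirror the description above; they are not NataPulse's schema, and the URLs are placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class EvidenceLink:
    event_id: str
    ticker: str
    evidence_type: Literal["ticker_match", "keyword_match"]
    relevance: float   # relevance weight of this event to the narrative
    source_url: str
    observed_at: str   # ISO 8601 timestamp


def per_ticker_counts(links: list[EvidenceLink]) -> dict[str, int]:
    """Derive per-asset event counts directly from the evidence list."""
    return dict(Counter(link.ticker for link in links))


links = [
    EvidenceLink("e1", "NVDA", "ticker_match", 1.0, "https://example.com/e1", "2024-01-09T21:00:00Z"),
    EvidenceLink("e2", "NVDA", "keyword_match", 0.8, "https://example.com/e2", "2024-01-10T21:00:00Z"),
    EvidenceLink("e3", "AMD", "ticker_match", 1.0, "https://example.com/e3", "2024-01-10T21:00:00Z"),
]
print(per_ticker_counts(links))
```

Deriving the counts from the evidence list keeps the summary and the underlying events from drifting apart, which is what makes the work checkable.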
NataPulse produces research evidence: scored, explained, and traced. What to do with it remains, deliberately, your call.
Sources
- NataPulse Docs — Emerging Narratives docs.natapulse.com
- NataPulse Docs — Publication Gates docs.natapulse.com
- NataPulse Docs — Scores docs.natapulse.com
- NataPulse Docs — Market Data Pipeline docs.natapulse.com
- NataPulse for Agents natapulse.com
- Polygon.io — Market Data API polygon.io