Transparent Metrics
How scoring works
Natural 20 is a signal fusion machine. We don't try to predict truth; we try to measure momentum. The highest-ranked items are the stories showing up across multiple independent channels: news coverage, Hacker News, Reddit, and Google Trends.
Two layers: story score vs cluster “bigness”
There are two different scoring systems in Natural 20:
- Story display score (single item ranking) — used when sorting raw stories within tabs and APIs.
- Cluster bigness (0–100) — used for the main feed, where we group related stories into one “event” and score the event by cross-channel breadth.
Story display score
Each raw story comes in with a base story.score from its source:
- Hacker News: points + comments influence score.
- Reddit: upvotes + comments influence score.
- RSS: usually starts low — amplified by clustering/coverage.
We then apply a display re-score (see lib/feed.ts):
- HN boost: if points are high, the score is bumped to at least points * 0.3 + 50.
- Reddit boost: if upvotes are high, the score is bumped to at least upvotes * 0.1 + 40.
- Recency boost: stories < 6 hours old get up to +50 via (6 - ageHours) * 8.
This is intentionally simple. The more important ranking is the event-level cluster score.
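For concreteness, here is a minimal sketch of that re-score. The type and field names (RawStory, points, upvotes, publishedAt) are assumptions for illustration, not the actual lib/feed.ts definitions, and the "points are high" gate from the prose is modeled as a simple floor:

```ts
// Field names are assumptions for illustration, not the actual lib/feed.ts types.
interface RawStory {
  source: "hackernews" | "reddit" | "rss";
  score: number;       // base score from the source
  points?: number;     // Hacker News points
  upvotes?: number;    // Reddit upvotes
  publishedAt: number; // Unix epoch milliseconds
}

function rescoreForDisplay(story: RawStory, now: number = Date.now()): number {
  let score = story.score;

  // HN boost: "bumped to at least points * 0.3 + 50" modeled as a floor.
  // The "if points are high" threshold is implied but unspecified, so omitted here.
  if (story.source === "hackernews" && story.points !== undefined) {
    score = Math.max(score, story.points * 0.3 + 50);
  }

  // Reddit boost: floor of upvotes * 0.1 + 40.
  if (story.source === "reddit" && story.upvotes !== undefined) {
    score = Math.max(score, story.upvotes * 0.1 + 40);
  }

  // Recency boost: stories under 6 hours old get (6 - ageHours) * 8 (~48 max at age 0).
  const ageHours = (now - story.publishedAt) / 3_600_000;
  if (ageHours >= 0 && ageHours < 6) {
    score += (6 - ageHours) * 8;
  }

  return score;
}
```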
Clusters: how stories become “events”
The main feed is cluster-first: multiple links about the same thing become one event. Clustering is based on title similarity and entity/topic normalization.
Why? Because coverage breadth is the strongest “real world” signal. A single viral post is interesting — but when the same story appears across multiple publishers and communities, it's almost always more consequential.
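To make the clustering step concrete, here is a minimal sketch using Jaccard similarity over normalized title tokens. The real pipeline also does entity/topic normalization; the stopword list and the 0.5 threshold here are illustrative assumptions:

```ts
// Normalize a title into a set of comparable tokens (stopword list is an assumption).
function normalizeTitle(title: string): Set<string> {
  const stopwords = new Set(["the", "a", "an", "of", "to", "in", "and", "for", "on"]);
  return new Set(
    title
      .toLowerCase()
      .replace(/[^a-z0-9\s]/g, " ")
      .split(/\s+/)
      .filter((t) => t.length > 2 && !stopwords.has(t))
  );
}

// Jaccard similarity: |intersection| / |union| of the two token sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  const intersection = [...a].filter((t) => b.has(t)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : intersection / union;
}

// Greedy clustering: attach each title to the first cluster whose representative
// tokens are similar enough, else start a new cluster. Threshold is an assumption.
function clusterTitles(titles: string[], threshold = 0.5): string[][] {
  const clusters: { rep: Set<string>; members: string[] }[] = [];
  for (const title of titles) {
    const tokens = normalizeTitle(title);
    const match = clusters.find((c) => jaccard(c.rep, tokens) >= threshold);
    if (match) match.members.push(title);
    else clusters.push({ rep: tokens, members: [title] });
  }
  return clusters.map((c) => c.members);
}
```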
Cluster signals (normalized 0–100)
Every cluster computes a set of signals, each normalized to 0–100, then combined (a sketch of these normalizations follows the list):
- Coverage — Number of unique sources. 2 sources ≈ 50, 3 ≈ 70, 5+ ≈ 95.
- Hacker News — Points + 0.5×comments, log-scaled + age-decayed. Age decay: >24h → 0.8, >48h → 0.6, >7d → 0.3.
- Reddit — Upvotes + 0.5×comments, log-scaled + age-decayed + multi-subreddit bonus.
- Google Trends — Matched keyword interest (0–100) with spike bonus: spiking → +15, mild spike → +5.
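Here is what those normalizations might look like in code. The coverage curve matches the anchor points above, but the log divisor, the single-source value, and the 4-source value are illustrative guesses, not the actual lib/cluster.ts constants:

```ts
// Coverage: unique-source count mapped to 0–100 (2 ≈ 50, 3 ≈ 70, 5+ ≈ 95).
function normalizeCoverage(uniqueSources: number): number {
  if (uniqueSources >= 5) return 95;
  if (uniqueSources >= 4) return 85; // interpolated; not stated in the docs
  if (uniqueSources >= 3) return 70;
  if (uniqueSources >= 2) return 50;
  return uniqueSources === 1 ? 25 : 0; // single-source value is an assumption
}

// Hacker News: points + 0.5×comments, log-scaled, then age-decayed.
function normalizeHackerNews(points: number, comments: number, ageHours: number): number {
  const engagement = points + 0.5 * comments;
  // Log scale saturating near 100; the divisor of 4 is an assumption.
  let score = Math.min(100, (Math.log10(1 + engagement) / 4) * 100);
  // Age decay as described: >24h → ×0.8, >48h → ×0.6, >7d → ×0.3.
  if (ageHours > 168) score *= 0.3;
  else if (ageHours > 48) score *= 0.6;
  else if (ageHours > 24) score *= 0.8;
  return score;
}

// Google Trends: interest is already 0–100; spikes add a bonus, clamped to 100.
function normalizeTrends(interest: number, spiking: boolean, mildSpike: boolean): number {
  let score = interest;
  if (spiking) score += 15;
  else if (mildSpike) score += 5;
  return Math.min(100, score);
}
```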
Signal weights
Each normalized signal is weighted (see lib/cluster.ts):
- coverage: 3.0
- hackernews: 2.0
- reddit: 2.0
- trends: 1.5
- twitter: 2.0 (reserved, currently unused)
Translation: coverage is king. A story that crosses multiple publishers will outrank a story that's only viral in one place.
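Expressed as a table, the weights look like this (the constant name SIGNAL_WEIGHTS is an assumption, not the actual export from lib/cluster.ts):

```ts
// Signal weights as described above; the name is illustrative.
const SIGNAL_WEIGHTS: Record<string, number> = {
  coverage: 3.0,
  hackernews: 2.0,
  reddit: 2.0,
  trends: 1.5,
  twitter: 2.0, // reserved, currently unused
};
```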
Bigness: the final 0–100 score
Bigness starts as a weighted average of the active (non-zero) signals. We then apply two adjustments: a capped multiplier and a flat bonus.
- Breadth ceiling: the number of active signals limits max bigness. One-signal clusters are capped.
- Multi-source bonus: coverage breadth adds a flat bonus.
```
Base: weightedSum / totalWeight

activeSignals == 1 → bigness = min(40, bigness * 0.4)
activeSignals == 2 → bigness = min(75, bigness * 0.85)
activeSignals >= 3 → bigness = min(100, bigness * 1.3)

coverageRaw >= 5 → +30
coverageRaw >= 3 → +20
coverageRaw >= 2 → +12
```
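Putting the pieces together, here is a sketch of the combination step. The function and field names are assumptions, and the final clamp to the 0–100 range is ours, not confirmed from lib/cluster.ts:

```ts
// Assumed weights from the table above (names illustrative).
const WEIGHTS: Record<string, number> = {
  coverage: 3.0, hackernews: 2.0, reddit: 2.0, trends: 1.5, twitter: 2.0,
};

// Sketch of the bigness combination; signals are assumed normalized to 0–100.
function computeBigness(signals: Record<string, number>, coverageRaw: number): number {
  // Base: weighted average over active (non-zero) signals only.
  const active = Object.entries(signals).filter(([, value]) => value > 0);
  const totalWeight = active.reduce((sum, [name]) => sum + (WEIGHTS[name] ?? 0), 0);
  const weightedSum = active.reduce(
    (sum, [name, value]) => sum + value * (WEIGHTS[name] ?? 0),
    0
  );
  let bigness = totalWeight > 0 ? weightedSum / totalWeight : 0;

  // Breadth ceiling: one- and two-signal clusters are damped and capped;
  // three or more signals amplify, capped at 100.
  if (active.length <= 1) bigness = Math.min(40, bigness * 0.4);
  else if (active.length === 2) bigness = Math.min(75, bigness * 0.85);
  else bigness = Math.min(100, bigness * 1.3);

  // Multi-source bonus: flat bonus for coverage breadth.
  if (coverageRaw >= 5) bigness += 30;
  else if (coverageRaw >= 3) bigness += 20;
  else if (coverageRaw >= 2) bigness += 12;

  // Final clamp to 0–100 (the clamp itself is an assumption).
  return Math.max(0, Math.min(100, bigness));
}
```

Worked example under these formulas: a cluster with coverage 70, Hacker News 60, and Reddit 50 has base (70×3 + 60×2 + 50×2) / 7 ≈ 61, which the three-signal multiplier lifts to ≈ 80, and a 3-source bonus of +20 pushes to ≈ 100.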
Recency boost (sorting clusters)
After bigness is computed, sorting applies a small freshness boost so brand-new stories surface quickly. This is a temporary boost applied at sort time.
```
age < 1h   → +25
age < 3h   → +20
age < 6h   → +15
age < 12h  → +10
age < 24h  → +5
age < 48h  → +2
age > 72h  → −5
age > 168h → −15
```
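A sketch of that sort-time adjustment (the function name is assumed; ages between the listed thresholds get no adjustment):

```ts
// Freshness adjustment applied only when ordering clusters; the stored
// bigness value is unchanged.
function recencyAdjustedBigness(bigness: number, ageHours: number): number {
  let boost = 0;
  if (ageHours < 1) boost = 25;
  else if (ageHours < 3) boost = 20;
  else if (ageHours < 6) boost = 15;
  else if (ageHours < 12) boost = 10;
  else if (ageHours < 24) boost = 5;
  else if (ageHours < 48) boost = 2;
  else if (ageHours > 168) boost = -15;
  else if (ageHours > 72) boost = -5; // 48–72h clusters get no adjustment
  return bigness + boost;
}
```

Clusters are then sorted by this adjusted value, so a brand-new two-source story can briefly outrank an older, bigger one.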
Why this approach works
This scoring system is biased toward stories that are:
- Cross-validated — multiple independent channels
- Fresh — recency boost + time decay
- Hard to fake — coverage breadth is expensive to manufacture
As we evolve the platform, this page will evolve with it. When you see the feed change, check here first.