Transparent Metrics

How scoring works

Natural 20 is a signal fusion machine. We don't try to predict truth — we try to measure momentum. The highest-ranked items are the stories showing up across multiple independent channels: news coverage, Hacker News, Reddit, and Google Trends.

Two layers: story score vs cluster “bigness”

There are two different scoring systems in Natural 20:

  • Story display score (single item ranking) — used when sorting raw stories within tabs and APIs.
  • Cluster bigness (0–100) — used for the main feed, where we group related stories into one “event” and score the event by cross-channel breadth.

Story display score

Each raw story comes in with a base story.score from its source:

  • Hacker News: points + comments influence score.
  • Reddit: upvotes + comments influence score.
  • RSS: usually starts low — amplified by clustering/coverage.

We then apply a display re-score (see lib/feed.ts):

  • HN boost: if points are high, score is bumped to at least points * 0.3 + 50.
  • Reddit boost: if upvotes are high, score is bumped to at least upvotes * 0.1 + 40.
  • Recency boost: stories < 6 hours old get (6 - ageHours) * 8, up to +48 for brand-new stories.
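As a rough illustration, the re-score above could be sketched like this (the "high points/upvotes" thresholds `HN_HIGH_POINTS` and `REDDIT_HIGH_UPVOTES` are assumptions for the sketch, not the values in lib/feed.ts):

```typescript
interface Story {
  source: "hackernews" | "reddit" | "rss";
  score: number;      // base score from the source
  points?: number;    // Hacker News points
  upvotes?: number;   // Reddit upvotes
  ageHours: number;
}

const HN_HIGH_POINTS = 100;      // assumed threshold, not the real value
const REDDIT_HIGH_UPVOTES = 500; // assumed threshold, not the real value

function displayScore(s: Story): number {
  let score = s.score;
  // "Bumped to at least X" translates to Math.max(current, X)
  if (s.source === "hackernews" && (s.points ?? 0) >= HN_HIGH_POINTS) {
    score = Math.max(score, s.points! * 0.3 + 50);
  }
  if (s.source === "reddit" && (s.upvotes ?? 0) >= REDDIT_HIGH_UPVOTES) {
    score = Math.max(score, s.upvotes! * 0.1 + 40);
  }
  // Linear recency boost, up to +48 for a story that just landed
  if (s.ageHours < 6) {
    score += (6 - s.ageHours) * 8;
  }
  return score;
}
```

The `Math.max` calls are what makes the boosts floors rather than additive bonuses: a story already scoring above the floor is left alone.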

This is intentionally simple. The more important ranking is the event-level cluster score.

Clusters: how stories become “events”

The main feed is cluster-first: multiple links about the same thing become one event. Clustering is based on title similarity and entity/topic normalization.

Why? Because coverage breadth is the strongest “real world” signal. A single viral post is interesting — but when the same story appears across multiple publishers and communities, it's almost always more consequential.
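For intuition, a minimal title-similarity check could be sketched as Jaccard overlap of normalized title tokens. This is an illustrative stand-in only; the real clustering also applies entity/topic normalization:

```typescript
// Tokenize a title: lowercase, strip punctuation, drop very short words
function normalizeTitle(title: string): Set<string> {
  return new Set(
    title
      .toLowerCase()
      .replace(/[^a-z0-9\s]/g, "")
      .split(/\s+/)
      .filter((w) => w.length > 2)
  );
}

// Jaccard similarity: |intersection| / |union| of the two token sets
function titleSimilarity(a: string, b: string): number {
  const ta = normalizeTitle(a);
  const tb = normalizeTitle(b);
  const inter = [...ta].filter((w) => tb.has(w)).length;
  const union = new Set([...ta, ...tb]).size;
  return union === 0 ? 0 : inter / union;
}
```

Two links would join the same event when their similarity clears some threshold, so "OpenAI releases new model" and "OpenAI new model released" cluster together despite different wording.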

Cluster signals (normalized 0–100)

Every cluster computes a set of signals, each normalized to 0–100, then combined:

  • Coverage — Number of unique sources. 2 sources ≈ 50, 3 ≈ 70, 5+ ≈ 95.
  • Hacker News — Points + 0.5×comments, log-scaled + age-decayed. Age decay: >24h → 0.8, >48h → 0.6, >7d → 0.3.
  • Reddit — Upvotes + 0.5×comments, log-scaled + age-decayed + multi-subreddit bonus.
  • Google Trends — Matched keyword interest (0–100) with spike bonus: spiking → +15, mild spike → +5.
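The Hacker News signal, for example, could be sketched as follows. The log-scale constant (30) is an assumption for the sketch; the decay steps follow the text:

```typescript
// Normalize a Hacker News presence to 0-100: points + half-weighted
// comments, log-scaled so megathreads don't dominate, then age-decayed.
function hnSignal(points: number, comments: number, ageHours: number): number {
  const raw = points + 0.5 * comments;
  // log10 scaling keeps 1,000-point threads near the top without
  // letting them drown everything else (scaling constant is assumed)
  let score = Math.min(100, Math.log10(1 + raw) * 30);
  // Step-wise age decay from the text: >7d -> 0.3, >48h -> 0.6, >24h -> 0.8
  if (ageHours > 7 * 24) score *= 0.3;
  else if (ageHours > 48) score *= 0.6;
  else if (ageHours > 24) score *= 0.8;
  return score;
}
```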

Signal weights

Each normalized signal is weighted (see lib/cluster.ts):

  • coverage: 3.0
  • hackernews: 2.0
  • reddit: 2.0
  • trends: 1.5
  • twitter: 2.0 (reserved, currently unused)

Translation: coverage is king. A story that crosses multiple publishers will outrank a story that's only viral in one place.

Bigness: the final 0–100 score

Bigness starts as a weighted average of the active (non-zero) signals. Then we apply two adjustments:

  • Breadth ceiling: the number of active signals limits max bigness. One-signal clusters are capped.
  • Multi-source bonus: coverage breadth adds a flat bonus.

Base: weightedSum / totalWeight
activeSignals == 1 → bigness = min(40, bigness*0.4)
activeSignals == 2 → bigness = min(75, bigness*0.85)
activeSignals >= 3 → bigness = min(100, bigness*1.3)
coverageRaw >= 5 → +30
coverageRaw >= 3 → +20
coverageRaw >= 2 → +12
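Putting the weights, breadth ceiling, and coverage bonus together, the computation could be sketched like this (the signal shape and the final rounding are assumptions; lib/cluster.ts is the source of truth):

```typescript
// Weights as documented above; twitter is reserved and currently unused
const WEIGHTS: Record<string, number> = {
  coverage: 3.0,
  hackernews: 2.0,
  reddit: 2.0,
  trends: 1.5,
  twitter: 2.0,
};

function bigness(signals: Record<string, number>, coverageRaw: number): number {
  // Weighted average over active (non-zero) signals only
  let weightedSum = 0;
  let totalWeight = 0;
  let active = 0;
  for (const [name, value] of Object.entries(signals)) {
    if (value <= 0) continue;
    weightedSum += value * (WEIGHTS[name] ?? 0);
    totalWeight += WEIGHTS[name] ?? 0;
    active++;
  }
  if (totalWeight === 0) return 0;
  let b = weightedSum / totalWeight;

  // Breadth ceiling: one-signal clusters can never look "big"
  if (active === 1) b = Math.min(40, b * 0.4);
  else if (active === 2) b = Math.min(75, b * 0.85);
  else b = Math.min(100, b * 1.3);

  // Multi-source bonus: flat reward for coverage breadth
  if (coverageRaw >= 5) b += 30;
  else if (coverageRaw >= 3) b += 20;
  else if (coverageRaw >= 2) b += 12;

  return Math.min(100, Math.round(b)); // rounding is an assumption
}
```

Note how a cluster that is viral on Hacker News alone tops out at 40, while a three-signal cluster with real coverage can saturate at 100.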

Recency boost (sorting clusters)

After bigness is computed, sorting applies a small freshness boost so brand-new stories surface quickly. This is a temporary boost applied at sort time.

age < 1h  → +25
age < 3h  → +20
age < 6h  → +15
age < 12h → +10
age < 24h → +5
age < 48h → +2
age > 72h → −5
age > 168h → −15
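The ladder above maps directly onto a small function (note that ages between 48h and 72h fall through with no adjustment):

```typescript
// Sort-time freshness adjustment: applied when ordering clusters,
// never written back into the stored bigness score.
function recencyBoost(ageHours: number): number {
  if (ageHours < 1) return 25;
  if (ageHours < 3) return 20;
  if (ageHours < 6) return 15;
  if (ageHours < 12) return 10;
  if (ageHours < 24) return 5;
  if (ageHours < 48) return 2;
  if (ageHours > 168) return -15;
  if (ageHours > 72) return -5;
  return 0; // 48h-72h: neutral
}
```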

Why this approach works

This scoring system is biased toward stories that are:

  • Cross-validated — multiple independent channels
  • Fresh — recency boost + time decay
  • Hard to fake — coverage breadth is expensive to manufacture

As we evolve the platform, this page will evolve with it. When you see the feed change, check here first.