New Bearings · Methodology

How this dataset is built

The value of New Bearings depends on whether the numbers and claims you see hold up under scrutiny. This page explains where the data comes from, how a role is curated, how projections are framed, and what we don't yet know.

What's in the dataset

New Bearings has two layers.

Where the data comes from

Every claim in a deep-tier or time-machine profile is tagged with a source. The sources we draw from most:

BLS — Bureau of Labor Statistics
OEWS (current employment + wages by SOC), Occupational Outlook Handbook (projections), QCEW (county-level employment).
Refresh: Annual for OEWS; biennial for OOH projections.
O*NET
Task and skill data per occupation. Powers the base-layer task lists.
Refresh: Roughly annual.
Eloundou et al., GPTs are GPTs (2023)
LLM labor-exposure scores per occupation. Seeds the base-layer AI-exposure metric.
Refresh: Static; recalibrated by hand when a substantially better academic measure ships.
IPUMS — Integrated Public Use Microdata Series
Historical US Census occupation counts. Powers the long-tail employment series in time machines.
Refresh: On-demand per role during curation.
FRASER — Federal Reserve Archive
Historical wage series, often back to the 19th century via scanned BLS bulletins.
Refresh: On-demand per role; many series need OCR.
Curated projection sources
BLS extrapolations, McKinsey scenario modelling, OECD employment trends, academic surveys. Each projection in a time machine cites its source and method.
Refresh: When the underlying publication updates.
Vendor product pages + first-party docs
AI tool descriptions, release dates, and what each tool actually automates. Tools are named by vendor (Harvey, Cursor, Vic.ai, etc.) — never as a generic "AI tools" placeholder.
Refresh: Continuous as new tools emerge.
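
One way to picture the refresh cadences above is as a small registry with a staleness check. This is a minimal sketch in Python, not the project's actual tooling: the `Source` class, the `SOURCES` list, the `is_stale` helper, and the day counts standing in for "annual" and "biennial" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Source:
    name: str
    # Approximate days between refreshes; None means static, on-demand,
    # or event-driven (no fixed schedule). Hypothetical field.
    refresh_days: Optional[int]


# Cadences follow the list above; the day counts are rough assumptions.
SOURCES = [
    Source("BLS OEWS", 365),
    Source("BLS OOH projections", 730),
    Source("O*NET", 365),
    Source("Eloundou et al. (2023)", None),
    Source("IPUMS", None),
    Source("FRASER", None),
]


def is_stale(source: Source, last_refreshed: date, today: date) -> bool:
    """True when a scheduled source has gone longer than its cadence without a refresh."""
    if source.refresh_days is None:
        return False  # unscheduled sources are never flagged by this check
    return (today - last_refreshed).days > source.refresh_days
```

For example, `is_stale(SOURCES[0], date(2023, 1, 1), date(2024, 6, 1))` flags OEWS as overdue, while the same dates for IPUMS return `False` because it is pulled on demand.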

How a role becomes a time machine

A role moves from the base layer to the curated layer through a deliberate research process. Typically one human researcher runs several parallel sub-agents for the legwork and makes every editorial call.

  1. Source the historical series. Pull employment from IPUMS Census tables (1850 onward where data exists). Pull wages from FRASER and contemporary BLS bulletins. Many older wage series require OCR on scanned PDFs.
  2. Identify tool eras. Each technology wave that reshaped the role (the typewriter, the telephone, dictation machines, word processors, ERP systems, LLMs) becomes a tool era with start year, end year (or open-ended), impact, and labor effect — every claim cited.
  3. Anchor historical notes. Specific year-anchored beats (e.g., a peak employment year, a major displacement event, a regulatory change) get short notes with citations.
  4. Surface projection disagreement. Multiple credible projection sources are included as a cone — the spread is the point, not a single number.
  5. Run verification. Static audits enforce that every datapoint has a citation, every era has at least one source, every projection has a method, and no claim is invented.
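
The static checks in step 5 could look roughly like the sketch below. It is an illustration in Python under stated assumptions, not the project's audit code: the `Datapoint`, `ToolEra`, and `Projection` classes, their field names, and the `audit` function are all hypothetical. It covers the three mechanical rules (citation per datapoint, source per era, method per projection) plus a date sanity check; whether a claim is invented still needs a human comparing it against the cited source.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Datapoint:
    year: int
    value: float
    citation: str = ""  # hypothetical field: source tag for this number


@dataclass
class ToolEra:
    name: str
    start_year: int
    end_year: Optional[int]  # None means the era is open-ended
    sources: List[str] = field(default_factory=list)


@dataclass
class Projection:
    source: str
    target_year: int
    method: str = ""


def audit(datapoints: List[Datapoint],
          eras: List[ToolEra],
          projections: List[Projection]) -> List[str]:
    """Return a list of problems; an empty list means the profile passes."""
    problems = []
    for d in datapoints:
        if not d.citation.strip():
            problems.append(f"datapoint {d.year}: missing citation")
    for e in eras:
        if not e.sources:
            problems.append(f"era '{e.name}': no source")
        if e.end_year is not None and e.end_year < e.start_year:
            problems.append(f"era '{e.name}': ends before it starts")
    for p in projections:
        if not p.method.strip():
            problems.append(f"projection from {p.source}: missing method")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a curator see every gap in a profile in one pass.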

The Gemini model writes narrative prose — it never invents tasks, tools, scores, employment numbers, or historical facts. If the data doesn't support a claim, the page says nothing.

The projection cone

Past "now," the time machine shows a fanned-out view of where credible sources think a role is heading. They will disagree, often dramatically. That's deliberate — and it's the most honest signal in the file.

Each projection in the cone shows its source, target year, method, and a magnitude bar. A BLS extrapolation may project +6% by 2033 while a McKinsey scenario projects −40% by 2030. Both are real positions held by real institutions. The spread is the range of futures this role might live in — not noise to average away.
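
A short Python sketch of that idea, using the two illustrative figures from the paragraph above. The `ConeProjection` class, the `cone_spread` function, and the method labels are assumptions for illustration, not the site's actual data model.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ConeProjection:
    source: str
    target_year: int
    method: str
    change_pct: float  # projected employment change relative to "now"


def cone_spread(
    projections: List[ConeProjection],
) -> Tuple[ConeProjection, ConeProjection, float]:
    """Return the lowest and highest projections and the gap between them.

    There is deliberately no mean or median here: the cone reports
    the range of positions, not a consensus number.
    """
    if not projections:
        raise ValueError("a cone needs at least one projection")
    low = min(projections, key=lambda p: p.change_pct)
    high = max(projections, key=lambda p: p.change_pct)
    return low, high, high.change_pct - low.change_pct


# Figures are the illustrative ones from the text; method labels are assumed.
cone = [
    ConeProjection("BLS extrapolation", 2033, "trend extrapolation", 6.0),
    ConeProjection("McKinsey scenario", 2030, "scenario modelling", -40.0),
]
low, high, spread = cone_spread(cone)
print(f"{low.source} ({low.target_year}): {low.change_pct:+.0f}%")
print(f"{high.source} ({high.target_year}): {high.change_pct:+.0f}%")
print(f"spread: {spread:.0f} percentage points")
```

With these two entries the spread is 46 percentage points. The target years differ (2030 and 2033), so the gap is a rough indication of disagreement rather than a like-for-like comparison, which is one more reason to display each projection separately.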

New Bearings does not produce its own projections. Producing one would require taking sides between sources, and the entire point of the cone is to refuse to do that.

Confidence tiers

Every role you can look up sits in one of three tiers:

What we don't yet know

A few honest limitations:

Found a mistake?

The value of New Bearings collapses if we get a citation wrong. If you spot an error, such as a wrong employment number, a misdated tool era, or a projection attributed to the wrong source, please flag it. The dataset lives in a private repo, but corrections are applied directly to the file.

Use the contact form on the For teams page with the role, the year, and the source you think is wrong, and select “Correction or data issue” as the use case. We read every submission and respond within a few days.