Jmail Data API

Pre-computed datasets from the Jeffrey Epstein email archive, covering the House Oversight Committee, Department of Justice, and Yahoo account releases.

Text extraction powered by Reducto. Video understanding by Kino AI. Built by the Jmail team.

v1 — Released February 25, 2026. View changelog

Available Datasets

Emails: 1.78M records
Documents: 1.41M records
Photos & People: 18K photos
iMessages: 4.5K messages
Community & Metadata: 414K stars

Every dataset is also available as gzip-compressed NDJSON: replace the .parquet extension in the URL with .ndjson.gz (for example, /v1/emails.parquet becomes /v1/emails.ndjson.gz).
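A minimal Python sketch of working with the NDJSON variant. The helper names (ndjson_url, parse_ndjson_gz) are illustrative and not part of the API, and the sample records are made up so the example runs offline. It assumes only that the decompressed file holds one JSON object per line.

```python
import gzip
import json


def ndjson_url(parquet_url: str) -> str:
    """Swap a trailing .parquet extension for .ndjson.gz."""
    suffix = ".parquet"
    if not parquet_url.endswith(suffix):
        raise ValueError(f"expected a URL ending in {suffix}: {parquet_url}")
    return parquet_url[: -len(suffix)] + ".ndjson.gz"


def parse_ndjson_gz(payload: bytes) -> list[dict]:
    """Decompress gzip bytes and parse one JSON object per non-empty line."""
    text = gzip.decompress(payload).decode("utf-8")
    # Split on "\n" only: NDJSON records are newline-delimited.
    return [json.loads(line) for line in text.split("\n") if line.strip()]


# Offline demonstration with a made-up two-record payload (not real data).
sample = gzip.compress(
    b'{"sender": "a@example.com"}\n{"sender": "b@example.com"}\n'
)
records = parse_ndjson_gz(sample)

print(ndjson_url("https://data.jmail.world/v1/emails.parquet"))
print(len(records), records[0]["sender"])
```

For the larger datasets, loading every record into one list may use a lot of memory; reading a downloaded file line by line with gzip.open is likely a better fit.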

Content Negotiation

Extensionless paths such as /v1/emails redirect to the .parquet file by default. Send the request header Accept: application/x-ndjson to get the NDJSON variant instead.
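The header-based selection described above might look like this with Python's standard library. The build_request helper is a hypothetical name, and the host is taken from the Quick Start URLs. The network call is left commented out so the sketch runs offline.

```python
import urllib.request

BASE_URL = "https://data.jmail.world"  # host used in the Quick Start examples


def build_request(path: str, ndjson: bool = False) -> urllib.request.Request:
    """Build a GET request for an extensionless dataset path.

    With ndjson=True the Accept header asks for the NDJSON variant;
    otherwise no Accept header is set and the default applies.
    """
    headers = {"Accept": "application/x-ndjson"} if ndjson else {}
    return urllib.request.Request(BASE_URL + path, headers=headers)


req = build_request("/v1/emails", ndjson=True)
print(req.full_url)
print(req.get_header("Accept"))

# To actually fetch the data (urlopen follows redirects by default):
# with urllib.request.urlopen(req) as resp:
#     print(resp.url)  # final URL after the redirect
```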

Quick Start with DuckDB

-- Top 20 senders by number of emails
SELECT sender, COUNT(*) AS n
FROM read_parquet('https://data.jmail.world/v1/emails.parquet')
GROUP BY sender
ORDER BY n DESC
LIMIT 20;

Python

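import duckdb

# In-memory database; reading over HTTPS uses DuckDB's httpfs extension
conn = duckdb.connect()

# Fetch the first 100 rows as a pandas DataFrame (.df() requires pandas)
df = conn.sql("""
  SELECT * FROM read_parquet('https://data.jmail.world/v1/emails.parquet')
  LIMIT 100
""").df()
print(df)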

Version Aliases

/latest/* redirects to /v1/*. When new schema versions are published, /latest will point to the newest version.
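One way to choose between a pinned version and the floating alias, sketched in Python. The dataset_url helper and its parameters are hypothetical, and the host is taken from the Quick Start URLs. Pinning to v1 keeps the schema stable for scripts, while latest follows whatever version is newest.

```python
BASE_URL = "https://data.jmail.world"  # host used in the Quick Start examples


def dataset_url(name: str, version: str = "v1", ext: str = ".parquet") -> str:
    """Build a dataset URL for a pinned version ("v1") or the "latest" alias."""
    return f"{BASE_URL}/{version}/{name}{ext}"


pinned = dataset_url("emails")                      # schema stays at v1
floating = dataset_url("emails", version="latest")  # follows the newest version

print(pinned)
print(floating)
```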

Export Status

View the live export status page to monitor data pipeline progress.