Pre-computed datasets from the Jeffrey Epstein email archive — House Oversight Committee, Department of Justice, and Yahoo account releases.
Text extraction powered by Reducto. Video understanding by Kino AI. Built by the Jmail team.
v1, released February 25, 2026. A changelog is available.
All datasets are also available in NDJSON format: replace .parquet with .ndjson.gz.
Extensionless paths like /v1/emails redirect to .parquet by default. Send Accept: application/x-ndjson to get the NDJSON variant instead.
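The two ways of reaching the NDJSON variant can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (ndjson_url, ndjson_request, parse_ndjson_gz, fetch_ndjson) are hypothetical, and it assumes the .ndjson.gz files are served as raw gzip bytes with one JSON object per line.

```python
import gzip
import json
import urllib.request

# URL taken from the query examples in this document.
EMAILS_PARQUET = "https://data.jmail.world/v1/emails.parquet"


def ndjson_url(parquet_url: str) -> str:
    """Swap a trailing .parquet for .ndjson.gz."""
    suffix = ".parquet"
    if not parquet_url.endswith(suffix):
        raise ValueError(f"expected a URL ending in {suffix}: {parquet_url}")
    return parquet_url[: -len(suffix)] + ".ndjson.gz"


def ndjson_request(extensionless_url: str) -> urllib.request.Request:
    """Build (but do not send) a request that asks for NDJSON via the Accept header."""
    return urllib.request.Request(
        extensionless_url, headers={"Accept": "application/x-ndjson"}
    )


def parse_ndjson_gz(raw: bytes) -> list:
    """Decompress gzip bytes and parse one JSON object per non-empty line."""
    text = gzip.decompress(raw).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def fetch_ndjson(url: str) -> list:
    """Hypothetical helper: download a .ndjson.gz file and parse it in memory.

    Assumes the server returns the gzip file as-is; large datasets may be
    better streamed than read in one call.
    """
    with urllib.request.urlopen(url) as resp:
        return parse_ndjson_gz(resp.read())
```

Calling fetch_ndjson(ndjson_url(EMAILS_PARQUET)) would download the whole emails dataset, so it is left uncalled here. Passing ndjson_request("https://data.jmail.world/v1/emails") to urllib.request.urlopen would exercise the Accept-header route instead.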
-- DuckDB: top 20 senders by number of emails, read directly over HTTPS
SELECT sender, COUNT(*) AS n
FROM read_parquet('https://data.jmail.world/v1/emails.parquet')
GROUP BY sender
ORDER BY n DESC
LIMIT 20;
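import duckdb

# In-memory DuckDB connection; nothing is written to disk
conn = duckdb.connect()

# Fetch the first 100 emails as a pandas DataFrame (.df() requires pandas)
df = conn.sql("""
    SELECT * FROM read_parquet('https://data.jmail.world/v1/emails.parquet')
    LIMIT 100
""").df()
print(df)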
/latest/* redirects to /v1/*. When new schema versions are published, /latest will point to the newest version.
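A small URL builder makes the choice between a pinned version and /latest explicit. This is a sketch under assumptions: dataset_url is a hypothetical helper, and it assumes /latest is served from the same host as the /v1 URLs used in the examples above.

```python
# Host taken from the query examples in this document.
BASE_URL = "https://data.jmail.world"


def dataset_url(name: str, version: str = "v1", fmt: str = "parquet") -> str:
    """Build a dataset URL for a pinned version (e.g. "v1") or for "latest"."""
    if fmt not in ("parquet", "ndjson.gz"):
        raise ValueError(f"unsupported format: {fmt}")
    return f"{BASE_URL}/{version}/{name}.{fmt}"


# Pinned: the schema stays fixed even after newer versions are published.
pinned = dataset_url("emails")
# Floating: follows whatever version /latest currently points to.
floating = dataset_url("emails", version="latest")
```

Pinning to v1 suits analyses that need to stay reproducible. Using /latest trades that stability for automatically picking up newer schema versions.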
View the live export status page to monitor data pipeline progress.