Private Alpha - Institutional Validation in ProgressRequest Access
Back to Home
Benchmark Report

Enron Corpus
Benchmark Results.

Validated against 3,034 emails from the Enron investigation (allen-p custodian). Ground truth filtered by custodian for accurate precision/recall measurement.

Micro-tier benchmark on a single custodian. Recall metrics reflect single-query performance - in practice, reviewers run multiple queries per topic, and cumulative recall across a review session is significantly higher. Results will vary with corpus size and composition.

55% Single-Query Recall

Hybrid search finds over half of all relevant documents in a single query. Cumulative recall across a review session is significantly higher.

5 Search Modes

Keyword, semantic, hybrid, RAG fusion, and full pipeline. Each optimized for different use cases.

18 Minutes Total

From raw email upload to fully searchable, triaged corpus with embeddings and AI relevance scores.

$0.03/GB Triage Cost

Embedding + LLM triage costs. Semantic and hybrid search modes are free after initial embedding.

Pipeline Performance

4 parallel workers - Hetzner AX41 bare-metal (64GB RAM)

StageDurationDocumentsDetails
Ingest2.9s3,0341,054 docs/sec throughput
Process10m 4s3,034Text extraction, metadata, MIME detection
Embed7m 54s3,034OpenAI text-embedding-3-small (1536 dims)
Triage~2m3,034AI relevance scoring with citations
Total~18 min3,034End-to-end with 4 parallel workers

Search Accuracy

Micro-tier benchmark - ground truth: custodian-filtered Enron dataset

Best Overall

Hybrid

Precision16%
Recall55%
F1 Score25%
Per-Query Cost$0

Semantic

Precision11%
Recall36%
F1 Score17%
Per-Query Cost$0
Most Precise

Full (RAG+Rerank)

Precision100%
Recall11%
F1 Score20%
Per-Query Cost~$0.02/query

Methodology

Dataset

  • 3,034 emails from allen-p custodian (Enron corpus)
  • Ground truth queries filtered by custodian presence
  • Multiple query categories: factual, temporal, relational
  • Reproducible via open-source CLI: cli bench run --tier micro

Infrastructure

  • Hetzner AX41 bare-metal: 64GB RAM, 2x512GB SSD RAID 1
  • PostgreSQL 15 + pgvector (HNSW index, ef_construction=256)
  • 4 parallel RQ workers (forking mode)
  • Total infrastructure cost: $34.70/month

See These Results Live

The demo matter contains the exact same Enron dataset. Search, triage, and export it yourself.

TRY_INTERACTIVE_DEMO