Benchmark Testing and Analysis Lead

Compensation
$150,000–$250,000/year

Job Description

We are hiring a technical researcher to own how we evaluate frontier models on the ARC-AGI benchmarks. You will run new models end-to-end, mine the data exhaust from every run, and translate what we learn into reports and public communication that shape the conversation on where model capability is heading. This is a remote, full-time role.

What You'll Do:

  • Own our model benchmarking and testing process, and run new frontier models against ARC-AGI-1, ARC-AGI-2, and ARC-AGI-3 as they ship
  • Build and own the ARC Prize Analysis Package - a repeatable report produced for every new frontier model, turning raw logs into insight on capability, failure modes, and gaps
  • Own the official and community leaderboards end-to-end - from scoring pipeline to public page
  • Serve as primary contact for new labs testing on ARC-AGI, and communicate findings externally via Twitter, newsletter, and policy and partner briefings

What We're Looking For:

  • Research background with hands-on model evaluation experience - you've run evals before and know how to read the results (model training experience not required)
  • Deep understanding of how modern models work and fail, and comfort building your own tooling and analysis to answer the questions you care about
  • Strong ownership instinct and clear technical communication

Example outputs this role would produce: a model score announcement and a model analysis blog post.

Job Details

Category: Business & Finance
Employment Type: Full Time
Location: Remote (US) (Remote Available)
Posted: Apr 23, 2026, 07:41 PM
Listed: Apr 23, 2026, 07:41 PM
Compensation: $150,000 - $250,000 per year

About ARC Prize Foundation

Part of the growing frontier tech ecosystem pushing the edges of what's possible.
