
Research Engineer
Job Description
About the role
This is a high-ownership role for someone who wants to work at the frontier of LLM evaluation, agentic systems, and startup building. You will help design and execute projects across coding data, computer-use environments, benchmark development, and research infrastructure.
You will work closely with the founders on both technical execution and company-building. The role is therefore not limited to pure research or pure engineering; it sits at the intersection of research, product, operations, and go-to-market (GTM).
What you’ll work on
- Build and improve evaluation tasks, datasets, and environments for frontier models
- Lead portions of our computer-use and coding-data projects end to end
- Create tasks for coding, terminal, and computer-use benchmarks
- Help develop samples and benchmark ideas in new domains
- Contribute to internal and open-source tooling for running agent evaluations
- Help source novel codebases and datasets, including areas like Fortran, COBOL, and embedded C
- Write technical blog posts or research-style writeups about benchmarks, datasets, and methods
- Support hiring, recruiting, and interviewing as the team grows
- Potentially contribute to papers, open-source benchmarks, and research collaborations as time permits
Example projects
- Building realistic computer-use environments with strong verifiers
- Designing coding and agent-eval datasets that surface capability gaps in frontier models
- Creating benchmarks we can publish publicly as part of Refresh’s research presence
- Automating the creation of tasks, environments, and graders for large-scale data generation
- Exploring adjacent tools and platforms that support trajectory collection, eval creation, and failure-mode discovery
What we’re looking for
- Strong engineering ability and high agency
- Interest in frontier LLMs, agent systems, evals, and research infrastructure
- Comfort working through ambiguity and taking ownership of poorly defined problems
- Strong writing and communication skills
- Ability to move between research, engineering, and operational work as needed
- Excitement about working closely with founders in an early-stage startup environment
Nice to have
- Experience with LLM evals, benchmarks, or training data
- Experience building agents, developer tools, or research infrastructure
- Familiarity with modern startup engineering stacks
- Interest in open-source research tooling and benchmark design
- Interest in publishing technical writing or papers
What you’ll learn
- How frontier model evals and data pipelines are built in practice
- Early-stage startup operations, scaling, and execution
- The modern startup stack across infrastructure, product, and deployment
- LLM training fundamentals, research workflows, and data operations
- How to build systems end to end
- How to lead projects and help grow a team
Job Details
- Category: Aerospace Engineering
- Employment Type: Full Time
- Location: San Francisco, CA, US / Remote (US) (Remote Available)
- Posted: Apr 7, 2026, 11:40 PM
- Listed: Apr 7, 2026, 11:40 PM
- Compensation: $100,000 - $150,000 per year
About Refresh
Part of the growing space & AI ecosystem pushing the frontiers of technology.