
Job Description
As a founding engineer, you'll help shape the foundation of how intelligent agents are built, verified, and deployed — not just at Relari, but across the next generation of AI-native software.
In this role, you will:
- Architect core systems for Nuvi — designing how natural language specifications are turned into structured, testable agent behavior.
- Build infrastructure that enables agents to integrate with real-world tools, APIs, and knowledge sources.
- Ensure agent reliability by developing systems for simulation, automated evaluation, and behavioral verification.
- Work directly with customers, helping them go from intent to working agents — and using that feedback to inform what we build next.
- Continuously explore new ways agents can reason, act, and improve — and bring those innovations into the product.
You’re a good fit if:
- You’ve built and shipped technically complex systems — whether infrastructure, AI tools, or end-user products.
- You’re a Software Engineer, Machine Learning Engineer, or Research Engineer at a fast-growing software company or top research lab.
- You thrive in a fast-moving environment and view unstructured environments as opportunities to identify the most impactful work and define the future success of the company.
- You operate like an owner: you identify high-leverage problems, drive them to resolution, and raise the bar for the team.
Bonus points if you have:
- Built or contributed to an AI agent framework, orchestration layer, or evaluation pipeline.
- Trained or fine-tuned foundation models, or worked on tools that enable others to do so.
- Published in top ML/NLP conferences (e.g. NeurIPS, ICML, ACL) or contributed to widely used open-source projects.
Job Details
- Category: Aerospace Engineering
- Employment Type: Full Time
- Location: San Francisco, CA
- Posted:
- Compensation: $110,000 - $160,000 per year
About Relari
Relari is the creator of continuous-eval, a modular evaluation framework that uses open-source technology to cover a wide range of applications such as text generation, code generation, retrieval, classification, agents, and other LLM use cases. The cloud-based platform generates custom synthetic data to simulate user behavior and stress-test GenAI applications.