
Job Description
We built Moss because Voice AI breaks when retrieval is slow. Most retrieval infrastructure was not built for real-time reasoning. It adds latency, breaks context mid-conversation, and becomes the bottleneck just as an AI product needs to feel instant.
Moss is the real-time semantic search layer for conversational AI. We deliver sub-10ms retrieval directly in the browser, at the edge, on-device, or in the cloud, without making teams stitch together a slow retrieval stack.
We are backed by YC and run in production for teams building serious AI products. Moss is deployed across 80+ countries, serves 1M+ voice minutes, supports enterprises serving 3,000+ of their own end customers, and has 350K+ package installs.
The systems are already working at real scale. Now we are expanding the reliability, precision, and performance foundation needed for the next billion minutes.
We are looking for a Senior or Staff Backend Engineer to own the stability, precision, and scale of our production infrastructure.
This is not a greenfield role. You will work on live, high-throughput systems from day one: pushing latency lower, improving correctness, strengthening reliability, and helping Moss scale confidently with our customers.
When milliseconds and correctness are both part of the product, your work defines the company.
What You’ll Do
- Build and evolve the systems behind every Moss retrieval request: the query path, APIs, caching, indexing, and data plane that make sub-10ms retrieval possible.
- Take Moss from millions of real-time retrieval minutes to billions without letting latency, correctness, or reliability degrade along the way.
- Make sure retrieval is not just fast, but consistently right. Most of the work is dealing with messy customer data and figuring out why results are off when things don’t behave the way you expect.
- Find the bottlenecks before they become customer problems: traffic spikes, noisy workloads, bad cache behavior, broken edge cases, deployments, and everything that only shows up once a system is actually being used.
- Build the operational muscle around Moss: observability, load testing, deployment safety, alerting, incident response, and capacity planning.
- Work across our cloud and edge footprint - AWS, Cloudflare, Vercel, Supabase, Azure, GCP - and make strong technical calls on what belongs where as the product scales.
- Keep Moss enterprise-ready as we grow: secure data handling, access controls, auditability, and the production rigor behind SOC 2 and HIPAA.
- Work directly with customers pushing Moss the hardest. You will help debug real deployments, understand where systems break down, and turn those lessons into product and infrastructure improvements.
- Help define what “production-grade retrieval” means for Moss. Not just uptime, but how quickly we can ship, debug, scale, and earn trust from teams building their products on top of us.
Core Stack
Our core retrieval engine is built in Rust, with SDKs for TypeScript, Python, Swift, Android, and other developer surfaces. The production stack spans cloud and edge infrastructure, including AWS, Cloudflare, Supabase, Vercel, Azure, and GCP.
You do not need experience with every part of this on day one. What matters is strong judgment around systems design, performance, and operating infrastructure that customers depend on.
Your First 90 Days
On day one, you will ship code to production.
In your first 30 days, you will get deep into the Moss query path and start taking ownership of the systems behind it - latency, correctness, caching, APIs, indexing, and the production infrastructure customers touch every day.
By 60 days, you will own meaningful backend surfaces end to end. You will be shipping improvements across multiple systems and be responsible for how they behave under real customer workloads.
By 90 days, you will be driving work without waiting for tightly scoped tickets: identifying what needs to be fixed or scaled, making the technical call, and shipping it. You will have materially improved how Moss performs, scales, or behaves in production.
What We’re Looking For
- 5+ years building and operating production backend systems where latency, correctness, and reliability matter.
- A track record of hardening and scaling live, high-traffic systems without sacrificing quality.
- Strong distributed systems fundamentals across APIs, databases, caching, queues, and performance optimization.
- Hands-on experience with monitoring, alerting, incident response, capacity planning, and production SLAs.
- Experience working in compliant or regulated environments, ideally SOC 2, HIPAA, or similar.
- Strong instincts around data security, privacy, and operational rigor.
- Comfort operating across modern cloud infrastructure and making pragmatic architecture decisions.
- High ownership. You know what needs to be hardened first and do not wait for someone to hand you a checklist.
- Clear communication. You can work directly with customers, engineers, product, and GTM when something important is on the line.
Nice to Have
- Experience building search, retrieval, vector, database, or other latency-sensitive infrastructure.
- Experience scaling systems by orders of magnitude or operating global, multi-region infrastructure.
- Experience taking a company through SOC 2 or HIPAA audits.
- Background in developer tools, infrastructure, API products, or AI platforms.
- Familiarity with edge compute, local-first systems, or on-device deployment.
- Familiarity with the modern AI stack and how developers build with LLMs.
Why Moss
This is a chance to own a core part of the product at a point where the technical decisions will shape how Moss scales for years.
You will work closely with the founders, build systems that customers rely on every day, and have real input into the architecture, product, and engineering standards behind Moss.
The work is technically hard: keeping retrieval fast, correct, and reliable as usage grows across different customers, workloads, and environments.
Competitive compensation and meaningful equity.
Job Details
- Category: Software
- Employment Type: Full Time
- Location: San Francisco, CA (Hybrid)
- Compensation: $60,000 - $300,000 per year