
Product Infrastructure Engineer, Data & Agent Systems
Job Description
About Truewind
Truewind is building AI agents that help accounting teams close books faster and more accurately. Our agents read documents, prepare workpapers, reconcile transactions, draft structured outputs, and operate across ERP and financial systems.
To make this reliable in production, we need a strong product infrastructure engineer who can build the data foundation and execution systems underneath the product.
This is a backend-leaning infrastructure role for someone who can work across production data models, correctness-sensitive workflows, and agent execution systems. It is data-first in the near term: the primary focus is migrating Truewind from legacy data models into cleaner, more durable domain models while keeping live customer workflows working. As that foundation gets stronger, the role also expands into the execution infrastructure that lets AI agents safely complete real work.
This is not a prompt engineering role, a pure analytics data role, or a pure DevOps/SRE role. It is a product infrastructure role for someone who has lived through messy production systems and can move between backend services, data correctness, workflow reliability, and product-facing infrastructure.
Why this role matters
Truewind is in the middle of a major platform transition.
We are migrating from legacy schemas into cleaner domain models. These systems need to run side by side while we move product modules, preserve customer behavior, validate correctness, and avoid breaking production workflows.
Financial data has very little margin for silent error. A missing transaction, duplicated record, stale sync, or incorrect mapping can cascade into a wrong close. You will work with data from ERPs, banks, spreadsheets, PDFs, file uploads, and customer-provided documents. This data arrives in inconsistent formats and needs to be normalized, validated, audited, and made useful before humans or agents act on it.
At the same time, our agents are becoming more capable. They need reliable execution infrastructure: long-running jobs, retries, workspaces, artifacts, review flows, logs, traces, and failure recovery.
In a larger company, this might be split across data platform, product infrastructure, and agent runtime teams. At our stage, we need someone who can work across these layers without losing sight of correctness or product impact.
Team and stack
You will work directly with the engineering and product team on infrastructure that is already in production with real customers. The work sits between backend engineering and data infrastructure, with a stack that includes TypeScript, PostgreSQL/Supabase, Drizzle, queue and workflow systems, cloud infrastructure, and Python or similar tools where they are the right fit for data and automation work.
What you'll work on
1. Data infrastructure and model migration
You will help move Truewind from legacy data models to cleaner, more durable domain models while the product stays live.
This includes:
- Building and maintaining data pipelines that ingest, normalize, transform, and serve correctness-sensitive financial data
- Migrating customer-facing product modules from legacy schemas to new domain models
- Maintaining compatibility while legacy and new systems run side by side
- Designing schemas, repositories, services, APIs, and workflows around complex data models
- Writing migrations, backfills, validation checks, and test coverage
- Building data quality checks to catch missing, duplicate, stale, inconsistent, or incorrectly mapped records
- Improving observability around syncs, transformations, model transitions, and downstream product behavior
- Preserving tenant isolation, auditability, and correctness across data flows
- Creating internal tools that help engineers debug data pipeline and migration failures faster
2. Agent execution systems
You will also help make our AI agents reliable enough for real production workflows.
This includes:
- Building orchestration for long-running agent workflows, including queues, retries, cancellations, checkpoints, resumability, and failure recovery
- Designing workspace and artifact handling for documents, workbooks, logs, generated outputs, and intermediate files
- Building tool-calling infrastructure for agents to interact with files, APIs, documents, browsers, CLIs, and internal systems
- Implementing human review flows where users can inspect, approve, reject, or modify agent outputs
- Adding traces, logs, workflow state, and root-cause debugging tools so agent work is auditable and debuggable
- Introducing safer execution environments when agent tasks need to manipulate files, call tools, or run isolated code
You may be a fit if you have
- 4+ years of experience in product infrastructure, backend engineering, data infrastructure, or distributed systems
- Strong experience with relational databases, schema design, migrations, and data integrity
- Experience building data pipelines, ingestion systems, transformation layers, or backend services around complex data models
- Experience with async jobs, queues, workflow engines, retries, idempotency, and failure recovery
- Strong coding ability in TypeScript, Python, Go, Rust, or similar
- Strong debugging instincts across data, backend, infrastructure, and workflow layers
- Good judgment around system boundaries, reliability, observability, and operational simplicity
- Comfort working in a startup where you may need to move between product features, infrastructure, data pipelines, and internal tooling
- Interest in building systems where AI agents do real work, not just generate text
Strong signals
- You have migrated a production system from one data model to another while keeping the product running
- You have built or maintained production data pipelines
- You have worked on systems where data correctness really matters
- You have designed validation gates, audit logs, approval flows, or data quality checks
- You have built workflow engines, internal platforms, automation infrastructure, or developer tools
- You have experience with multi-tenant SaaS systems
- You have worked with Postgres, Drizzle, Supabase, Temporal, Dagster, Airflow, Celery, BullMQ, Sidekiq, or similar systems
- You have worked with LLM agents, tool-calling systems, or human review workflows
- You enjoy turning messy real-world workflows into reliable, observable systems
Useful but not required
- Experience with ERP, fintech, billing, payments, reconciliation, accounting, or financial data systems
- Experience integrating with messy third-party systems such as ERPs, banks, payment processors, CRMs, file storage systems, or document APIs
- Experience with sandboxing or isolated execution technologies such as E2B, Daytona, AWS ECS, Docker, Kubernetes, Firecracker, gVisor, cloud IDEs, CI runners, or similar systems
- Experience building agent runtimes, tool execution platforms, secure execution environments, or notebook/code execution infrastructure
What makes this role different
This role sits at the intersection of data infrastructure, backend systems, product workflows, and agent execution. The center of gravity is not model behavior or demos. It is the production substrate that makes AI workflows reliable.
You will help build:
- data model migration paths
- backend services around durable domain models
- validation and observability for correctness-sensitive data
- workflow reliability for long-running jobs
- artifact, trace, and review systems for agent outputs
- safer execution environments when agent workflows need them
The best person for this role is likely a strong product infrastructure engineer: someone backend-capable, data-correctness-minded, and comfortable moving systems forward without breaking production.
Not a fit if
- You mainly want to write prompts
- You only want to work on model behavior
- You prefer demos over production reliability
- You are looking for a pure DevOps/SRE role disconnected from product and data modeling
- You are looking for a pure analytics or warehouse data engineering role
- You are uncomfortable working with complex data models
- You do not enjoy migrations, backfills, validation, and system cleanup
- You do not care about logs, traces, retries, idempotency, and observability
- You want a narrowly scoped role with only one type of problem
Why join now
Truewind is at the stage where the product is powerful enough that the platform underneath it matters more than ever.
We need engineers who can help turn AI workflows from impressive demos into reliable production systems. That means building the data foundation, migrating legacy modules carefully, and giving agents the execution layer they need to do real work safely.
Job Details
- Category: Operations
- Employment Type: Full Time
- Location: San Francisco, CA, US
- Posted: May 10, 2026, 01:40 AM
- Listed: May 10, 2026, 01:40 AM
- Compensation: $180,000 - $200,000 per year