Solutions Engineer, Healthcare

Protege, Remote, Full Time

Job Description

Company Overview

We are building Protege to solve the biggest unmet need in AI: getting access to the right training data. The process today is time-intensive, incredibly expensive, and often ends in failure. The Protege platform facilitates the secure, efficient, and privacy-centric exchange of AI training data.

Solving AI’s data problem is a generational opportunity. We’re backed by world-class investors and already powering partnerships with some of the most ambitious teams in AI. The company that succeeds will be one of the largest in AI — and in tech.

We’re a lean, fast-moving, high-trust team of builders who are obsessed with velocity and impact. Our culture is built for people who thrive on ambiguity, own outcomes, and want to shape the future of data and AI.

Role Overview

We're hiring a Solutions Engineer for our Healthcare vertical to operate and improve the infrastructure that moves large-scale healthcare datasets between partners, cloud platforms, and customers. This is a hands-on delivery engineering role focused on production execution, operational reliability, and safe data movement. It is not a pre-sales solutions role.

You will work across AWS, GCP, Azure, Snowflake, and Protege's internal platform to run cross-cloud transfers, lightweight transformations, packaging, and validation for healthcare data deliveries. You will manage real production workflows, debug failures, and keep data integrity, auditability, and repeatability high.

At its core, this role is about making complex healthcare data delivery reliable at scale: taking messy, multi-system workflows and turning them into safe, repeatable operations that support customers, partners, and internal teams.

What You'll Do

Own cross-cloud data movement and delivery

  • Execute and monitor large-scale data transfers across AWS S3, Google Cloud Storage, Azure Blob, Snowflake, and customer environments.

  • Use CLI tools such as rclone, s5cmd, and cloud-native utilities to move data safely and efficiently.

  • Manage credentials, permissions, manifests, and delivery packaging artifacts required for ingestion, subset delivery, and handoff workflows.

Build structured data assembly and lightweight transformation workflows

  • Use Python and SQL to join datasets, add derived columns, clean data, and validate CSV, Parquet, and database tables.

  • Support customer-specific assembly work that turns raw inputs into delivery-ready datasets.

  • Apply a high bar for data integrity, structure, and reproducibility before handoff.
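
For illustration only, the kind of assembly work described above might look like the following sketch. It uses Python's built-in sqlite3 module to join two small tables, add a derived column, and run a basic validation before handoff. The table names, columns, sample rows, and the assemble and validate helpers are hypothetical and are not taken from Protege's platform.

```python
import sqlite3


def assemble(conn: sqlite3.Connection) -> list[tuple]:
    """Join hypothetical patients and visits tables and add a derived column."""
    query = """
        SELECT p.patient_id,
               v.visit_date,
               v.charge_cents / 100.0 AS charge_dollars  -- derived column
        FROM visits AS v
        JOIN patients AS p ON p.patient_id = v.patient_id
        ORDER BY p.patient_id, v.visit_date
    """
    return conn.execute(query).fetchall()


def validate(rows: list[tuple], expected_count: int) -> None:
    """Fail loudly if the join dropped rows or produced null values."""
    if len(rows) != expected_count:
        raise ValueError(f"expected {expected_count} rows, got {len(rows)}")
    for row in rows:
        if any(value is None for value in row):
            raise ValueError(f"null value in row {row}")


# Synthetic sample data in an in-memory database (assumption for the sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patients (patient_id TEXT PRIMARY KEY);
    CREATE TABLE visits (patient_id TEXT, visit_date TEXT, charge_cents INTEGER);
    INSERT INTO patients VALUES ('p1'), ('p2');
    INSERT INTO visits VALUES ('p1', '2024-01-05', 12500), ('p2', '2024-02-10', 9900);
""")

rows = assemble(conn)
validate(rows, expected_count=2)
```

Checking the row count against the input is one simple way to catch an inner join that silently drops unmatched records; a real delivery would likely need richer checks.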

Operate internal pipelines with production discipline

  • Leverage Protege's Dagster-based platform to orchestrate data processing and delivery.

  • Maintain clean separation between pre-production and production workflows and validate configs before runs.

  • Build lightweight scripts and command-line workflows for filtering, manifest generation, validation, and recovery.
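
As a rough example of the lightweight scripting mentioned above, the sketch below generates a manifest of file sizes and SHA-256 checksums for a delivery directory, then verifies the directory against that manifest. The manifest layout, the file name, and the build_manifest and verify_manifest helpers are assumptions made for this example, not Protege's actual format or tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's relative path to its size and checksum (hypothetical layout)."""
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return {
        p.relative_to(root).as_posix(): {"bytes": p.stat().st_size, "sha256": sha256_of(p)}
        for p in files
    }


def verify_manifest(root: Path, manifest: dict) -> list[str]:
    """Return the relative paths that are missing or whose size or checksum differ."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file():
            problems.append(rel_path)
        elif target.stat().st_size != expected["bytes"] or sha256_of(target) != expected["sha256"]:
            problems.append(rel_path)
    return problems


# Demonstration on a temporary directory with one synthetic file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "part-0001.csv").write_text("id,value\n1,a\n")
    manifest = build_manifest(root)
    clean = verify_manifest(root, manifest)          # nothing changed yet
    (root / "part-0001.csv").write_text("id,value\n1,b\n")
    after_change = verify_manifest(root, manifest)   # same size, different content
```

Comparing checksums as well as sizes matters here: the modified file in the demonstration has the same byte count as the original, so a size-only check would miss the change.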

Create leverage for the team and platform

  • Document steps, outputs, and recovery paths for auditability and repeatability.

  • Identify recurring delivery patterns, failure modes, and manual toil.

  • Partner with Engineering to turn one-off operational work into repeatable platform capabilities and test new tooling before go-live.

What Success Looks Like

  • 30 days: Learn the environments and delivery motion
    Build working knowledge of partner and customer delivery patterns, environments, permissions models, and core tooling. Shadow live deliveries and get operational on standard validation and rerun workflows.

  • 60 days: Own scoped production work
    Run scoped cross-cloud deliveries and light transformations with limited support. Improve runbooks, validation steps, and status communication so work is safer and easier to repeat.

  • 90 days: Create leverage and reliability
    Independently own complex delivery workflows across multiple systems while reducing rework and surfacing platform improvements that eliminate operational risk and manual toil.

What You Bring

  • Strong hands-on experience with data pipelines, both orchestrated and manual, in real production environments.

  • Fluency with command-line tooling in Linux or macOS and strong scripting ability in Python, SQL, and Bash/shell.

  • Experience working with cloud storage systems and large-scale cross-cloud data movement.

  • High bar for data integrity, validation, reproducibility, and auditability - especially for regulated data.

  • Calm, methodical debugging instincts and strong operational judgment about when to rerun, recover, or escalate.

  • Ability to manage multiple delivery workflows simultaneously while collaborating well with a distributed team.

  • Bonus if you have experience with AWS S3, GCS, Azure Blob, Snowflake, IAM debugging, Dagster/Airflow, healthcare data, or AI training data.

Working with Protege

  • We move fast - thoughtfully. Speed matters in what we're building, and so does intention. We're biased toward action and always learning.

  • We're a lean, high-trust team. Everyone has real ownership. Clarity and autonomy drive our best work.

  • We take our work seriously, not ourselves. We solve hard problems with humility and celebrate wins - big and small.

  • We're kind, direct, and inclusive. We give feedback early and often, with the goal of helping one another grow.

  • We're builders at heart. Every person at Protege is hands-on, resourceful, and focused on creating momentum.

  • We grow fast - together. You'll be surrounded by people who care about impact, who challenge you to think bigger, and who are genuinely excited about what comes next.

Job Details

Department: Product/Design
Category: Aerospace Engineering
Employment Type: Full Time
Location: Remote (Remote Available)
Posted: Apr 27, 2026, 01:47 PM
Listed: Apr 27, 2026, 03:45 PM

About Protege

Part of the growing frontier tech ecosystem pushing the edges of what's possible.
