
Luminal
Making AI run fast on any hardware.
About the Company
Luminal (YC S25) optimizes AI models to make deployment faster and simpler. We are building an AI compiler that speeds up models by 10x and streamlines deployment to production with a single line of code. Our mission is to turn AI research code into production code, seamlessly.
Role Description
This is a full-time, on-site role for a Founding Engineer located in downtown San Francisco. The Founding Engineer will help design the core compiler. Day-to-day tasks will include writing CUDA kernels, conducting model performance reviews, and shitposting on social media.
Tech Stack
Luminal uses a search-based approach to generate, tune, and verify GPU kernels so engineers do not have to hand-write CUDA.
Search-based approach
- Express computations in a small IR, then generate candidate kernels via equality saturation rewrite rules (tiling, unrolling, vectorization, memory layout).
- Guide exploration with cost models and bandit-style search to find the fastest valid kernels for a target GPU.
- Compile and benchmark candidates on real hardware, enforce correctness with property tests and equivalence checks, and keep strict shape and dtype constraints.
- Cache, version, and reuse the best kernels across models and deployments with full reproducibility.
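The loop described above (generate candidates, check correctness, benchmark, cache the winner) can be sketched in a few lines. This is only an illustration and not Luminal's implementation: pure-Python tiled matmuls stand in for GPU kernels, epsilon-greedy is just one simple bandit strategy chosen for the example, and every name here (`make_tiled_matmul`, `is_correct`, `search`, `best_kernel`, `KERNEL_CACHE`) is hypothetical.

```python
import random
import time

def matmul_reference(a, b):
    """Naive matmul used as the correctness oracle."""
    k, m = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(m)] for row in a]

def make_tiled_matmul(tile):
    """Return a matmul variant that walks the output in tile x tile blocks."""
    def kernel(a, b):
        n, k, m = len(a), len(b), len(b[0])
        out = [[0] * m for _ in range(n)]
        for i0 in range(0, n, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
        return out
    return kernel

def is_correct(kernel, trials=3, size=5, seed=0):
    """Property-style check: agree with the reference on random integer inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = [[rng.randint(-9, 9) for _ in range(size)] for _ in range(size)]
        b = [[rng.randint(-9, 9) for _ in range(size)] for _ in range(size)]
        if kernel(a, b) != matmul_reference(a, b):
            return False
    return True

def time_once(kernel, a, b):
    """Wall-clock time of a single kernel call."""
    start = time.perf_counter()
    kernel(a, b)
    return time.perf_counter() - start

def search(candidates, a, b, rounds=30, epsilon=0.2, seed=0):
    """Epsilon-greedy bandit over the candidates that pass the correctness check."""
    rng = random.Random(seed)
    valid = {name: k for name, k in candidates.items() if is_correct(k)}
    if not valid:
        raise ValueError("no candidate passed the correctness check")
    # Measure every valid candidate once so each has an initial estimate.
    stats = {name: [time_once(k, a, b)] for name, k in valid.items()}

    def mean(name):
        return sum(stats[name]) / len(stats[name])

    for _ in range(rounds):
        if rng.random() < epsilon:
            name = rng.choice(sorted(valid))   # explore a random candidate
        else:
            name = min(valid, key=mean)        # re-measure the current fastest
        stats[name].append(time_once(valid[name], a, b))
    return min(valid, key=mean)

# Hypothetical cache keyed by (shape, dtype); a real one would also key on the target GPU.
KERNEL_CACHE = {}

def best_kernel(shape, dtype, candidates, a, b):
    """Return the name of the fastest valid candidate, searching only on a cache miss."""
    key = (shape, dtype)
    if key not in KERNEL_CACHE:
        KERNEL_CACHE[key] = search(candidates, a, b)
    return KERNEL_CACHE[key]

if __name__ == "__main__":
    n = 16
    a = [[i + j for j in range(n)] for i in range(n)]
    candidates = {f"tile{t}": make_tiled_matmul(t) for t in (2, 4, 8)}
    print(best_kernel((n, n), "int", candidates, a, a))
```

Which candidate wins depends on the machine and on timing noise, so the sketch makes no claim about the result. The point is the structure: incorrect candidates are rejected before any time is spent benchmarking them, and a second call with the same shape and dtype is served from the cache.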
Tech stack
- Compiler and runtime: a Rust- and egglog-based compiler that generates GPU kernels, using a lightweight IR with e-graph-style rewrites to search over and benchmark kernels.
- Backends: CUDA and Metal in production today. Other backends in progress.
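To give a feel for rewrite-driven search over a small IR, here is a toy sketch. It is not egglog and not a real e-graph: it naively enumerates every expression reachable through a few algebraic rewrite rules and then extracts the cheapest one under a made-up cost table, whereas an e-graph stores the same set of equivalent terms compactly by sharing subterms. The tuple IR, the rules, and the costs are all assumptions made for the example.

```python
# Toy IR: a leaf is a variable name (str) or an int constant;
# an operation is a tuple (op, lhs, rhs) with op in {"add", "mul", "shl"}.

def rewrite_root(expr):
    """Yield expressions equal to expr by applying one rule at the root."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        if op == "mul" and rhs == 2:
            yield ("shl", lhs, 1)   # x * 2 == x << 1 (integers)
        if op == "add" and rhs == 0:
            yield lhs               # x + 0 == x
        if op in ("add", "mul"):
            yield (op, rhs, lhs)    # commutativity

def rewrite_anywhere(expr):
    """Yield expressions obtained by one rewrite at the root or inside a subterm."""
    yield from rewrite_root(expr)
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        for new_lhs in rewrite_anywhere(lhs):
            yield (op, new_lhs, rhs)
        for new_rhs in rewrite_anywhere(rhs):
            yield (op, lhs, new_rhs)

def equivalents(expr, max_size=1000):
    """Collect every expression reachable from expr by repeated rewriting, up to max_size."""
    seen = {expr}
    frontier = [expr]
    while frontier and len(seen) < max_size:
        next_frontier = []
        for e in frontier:
            for r in rewrite_anywhere(e):
                if r not in seen:
                    seen.add(r)
                    next_frontier.append(r)
        frontier = next_frontier
    return seen

# Hypothetical per-operation costs; a real cost model would be hardware-specific.
COST = {"add": 1, "shl": 1, "mul": 4}

def cost(expr):
    """Sum the operation costs in expr; leaves are free."""
    if not isinstance(expr, tuple):
        return 0
    op, lhs, rhs = expr
    return COST[op] + cost(lhs) + cost(rhs)

def evaluate(expr, env):
    """Evaluate expr with integer variable bindings from env."""
    if isinstance(expr, str):
        return env[expr]
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    a, b = evaluate(lhs, env), evaluate(rhs, env)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    return a << b

def optimize(expr):
    """Extract the cheapest equivalent expression (ties broken by repr for determinism)."""
    return min(equivalents(expr), key=lambda e: (cost(e), repr(e)))

if __name__ == "__main__":
    print(optimize(("add", ("mul", "x", 2), 0)))
```

Rewriting and extraction are kept as separate steps on purpose: the rules only have to be individually sound, and the cost model alone decides which equivalent form is kept.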