
Kernel Engineer — Scientific Computing (SPU)
Job Description
Vorticity is building the world’s first Scientific Processing Unit (SPU), a new class of silicon purpose-built to accelerate scientific computing beyond the limits of GPUs. We are designing tightly coupled software–hardware systems around applied mathematics workloads to deliver order-of-magnitude performance gains. Unlocking the SPU's full potential requires early, deep engagement from software engineers grounded in applied mathematics who can translate real-world scientific workloads into executable models, kernels, libraries, and applications that inform both architecture and tooling decisions.
As a Kernel Engineer, you will work at the intersection of applied mathematics, scientific computing, parallel programming, and low-level performance engineering. You will help shape how numerical kernels are implemented, optimized, and eventually mapped onto the SPU. Your work may include building early numerical kernels and libraries, developing prototype applications, and writing Python-based workload models and simulators, all to support and inform the evolving hardware and compiler stack.
This role requires both strong applied math fundamentals and deep low-level implementation ability. You should be comfortable moving from mathematical formulations to efficient kernels, reasoning about accuracy, stability, data movement, memory hierarchy, parallel execution, and compiler behavior along the way. The position is ideal for someone who combines strong scientific computing instincts with the low-level habits of a performance engineer.
Responsibilities:
- Prototyping and implementing core kernels and low-level numerical primitives for the SPU.
- Translating mathematical formulations into executable, performance-relevant kernel implementations.
- Analyzing and optimizing memory-access patterns, including coalescing, locality, shared memory usage, cache behavior, register pressure, and host-device data movement.
- Collaborating closely with hardware architects to evaluate algorithm–architecture tradeoffs around memory hierarchy, synchronization, vector/SIMT execution, instruction behavior, and parallel scheduling.
- Working with compiler and runtime teams to ensure kernels map cleanly to the SPU programming model.
- Designing microbenchmarks, correctness tests, numerical accuracy tests, and performance models, then iteratively refining kernels based on hardware evolution, compiler behavior, profiler output, and measured performance.
Core Skills:
- Strong applied mathematics and scientific computing judgment, with the ability to understand numerical workloads deeply enough to implement them correctly and efficiently.
- Strong proficiency in C++ and CUDA, HIP, SYCL, or an equivalent accelerator programming model.
- Experience writing custom kernels, not just using existing frameworks or vendor libraries.
- Ability to translate mathematical formulations into low-level implementations while balancing accuracy, stability, precision, data movement, and performance.
- Deep understanding of GPU execution and memory hierarchy, including global memory, shared memory, registers, caches, coalescing, atomics, reductions, scans, warp-level execution, and occupancy.
- Experience using profiling and performance tools to identify bottlenecks, test hypotheses, and validate improvements.
- Ability to reason from profiler output to concrete code changes, rather than treating performance debugging as guesswork.
- Solid concurrency fundamentals, including race conditions, atomicity, synchronization, and thread/process execution behavior.
Nice to Have Skills:
- Familiarity with performance analysis tools or modeling techniques (profilers, roofline models).
- Exposure to compilers, runtimes, or code generation frameworks.
- Experience in applied scientific domains such as physics, geophysics, CFD, climate, materials, fusion, or finance.
- Experience with low-level GPU assembly or intermediate representations.
- Familiarity with low-level system software or drivers.
Non-Technical Qualities:
- Excellent written and verbal communication skills.
- Ability to work both independently and collaboratively within a team.
- Comfort operating in an early-stage environment where the hardware, compiler, and software stack are evolving together.
- Willingness to put in the hard work needed to bring the SPU to life.
- Above all: low ego.
As passionate scientists and engineers, we are well aware of the plethora of critical problems in the world that cannot be solved because humanity simply does not have enough computing power. To address this, Vorticity is developing a radically new silicon chip architecture and system to dramatically accelerate scientific computing problems.
Vorticity’s mission is to expand human ingenuity. To do that we are building a team of exceptional people to work together on big problems. Join us!
Job Details
- Category: Aerospace Engineering
- Employment Type: Full Time
- Location: Redwood City
- Posted: May 15, 2026, 07:41 PM
- Compensation: $120,000 - $170,000 per year