
Software Engineer, Inference - Performance Optimization
OpenAI, Los Angeles, CA, United States
Inference – San Francisco
About the Team
Our team analyzes inference stack performance across the application, model, and fleet layers to identify bottlenecks and drive faster, cheaper inference. We combine systems profiling, benchmarking, and analysis to understand where time and cost are spent, then turn that understanding into performance optimizations and models that project performance and capacity needs for future launches.
About the Role
In this role, you will model inference performance across the application, model, and fleet layers with increasing fidelity. You will build cost-to-serve estimates from microbenchmarks and create tools that help cross-functional teams reason about latency, capacity, utilization, and cost tradeoffs.
Responsibilities
Build and refine performance models that translate microbenchmark results into cost-to-serve estimates.
Analyze inference workloads end to end across applications, models, and fleet infrastructure.
Enhance tooling that identifies latency and throughput bottlenecks across layers.
Partner with other teams to turn performance insights into concrete improvements and to project how future changes will affect inference performance.
Qualifications
You may thrive in this role if you:
Enjoy reasoning from first principles about distributed systems, model inference, and hardware efficiency.
Are comfortable working across abstraction layers, from application behavior to kernels, accelerators, networking, and fleet scheduling.
Have deep expertise with performance profiling, benchmarking, analysis, and optimization.
Enjoy collaborating with engineering and research teams to improve real production systems.
Compensation
$295K – $555K + equity
OpenAI is an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or any other applicable legally protected characteristic.