Inference Performance Engineer - Latency & Cost

OpenAI, Los Angeles, CA, United States

OpenAI is seeking a Software Engineer specializing in Inference Performance Optimization to work in Los Angeles. This role involves modeling inference performance across various layers, analyzing workloads, and collaborating with cross-functional teams for improvements. Responsibilities include building performance models, enhancing tooling for bottlenecks, and optimizing costs. The compensation ranges from $295K to $555K, plus equity. This is an opportunity to work on impactful optimizations within cutting-edge technologies.
#J-18808-Ljbffr