ML Compute SRE: Scale Reliability for ML Ops

Google Inc., Durham, North Carolina, United States, 27703

A leading tech company is seeking a Site Reliability Engineer to support the ML Accelerator team, with a focus on improving ML compute infrastructure. Candidates should have a Bachelor’s degree in Computer Science or equivalent experience, and at least 2 years of software development experience (Golang preferred). The role includes designing features, optimizing operations, and improving service level objectives. The salary range is $141,000 to $202,000, plus bonuses and equity. Join a collaborative team and help drive performance for ML operations.