Senior Machine Learning Engineer

General Motors, Montgomery, AL, United States


**Job Description**

**The Role:** We are seeking an experienced, technically oriented, impact-driven expert in ML training infrastructure with a strong ability to execute hands-on technical work. In this role, you will be responsible for designing and building scalable, reliable, and high-performance AI/ML platform infrastructure to support advanced AI research and model development initiatives. As a Senior ML Engineer, you will collaborate closely with machine learning engineers, research scientists, and other partners to develop state-of-the-art AI solutions that enable the future of intelligent driving technologies across General Motors vehicles.

**What You'll Do:**

+ Design and develop a scalable, reliable, high-performance ML framework to support model training at scale.
+ Analyze model training performance and build optimization solutions that scale distributed training workflows, maximize resource utilization across heterogeneous hardware environments, and reduce cost.
+ Raise the bar on system observability, debuggability, operational excellence, and user experience.
+ Collaborate with cross-functional teams to integrate new features and technologies into the platform.

**Your Skills & Abilities (Required Qualifications)**

+ Bachelor's degree or higher in Computer Science or an equivalent major, OR equivalent relevant experience
+ 3+ years of professional software engineering experience
+ 2+ years of specialized experience in AI/ML infrastructure, e.g., enabling distributed training for scaling large ML models
+ Strong programming skills in Python, with proficiency in frameworks such as PyTorch (preferred), TensorFlow, or similar
+ Experience with distributed computing, GPU computing, and cloud environments (AWS, GCP, Azure)
+ Willingness to travel to Sunnyvale, CA as needed
+ Comfortable working in highly ambiguous and dynamic environments

**What Will Give You a Competitive Edge (Preferred Qualifications):**

+ 5+ years of professional software engineering experience
+ Self-motivated, with strong execution and a focus on delivering impact
+ Extensive knowledge of and experience with PyTorch 2.x+ and distributed training frameworks
+ Experience designing and developing training frameworks that support FSDP, pipeline parallelism, and other scalable solutions for training large foundational models
+ Experience profiling, analyzing, debugging, and optimizing training and data loading performance
+ Excellent communication skills to resolve controversies, build consensus, communicate risks, and give constructive feedback

**Compensation:** The compensation information is a good faith estimate only. It is based on what a successful applicant might be paid in accordance with applicable state laws. The compensation may not be representative for positions located outside of the California Bay Area.

+ The salary range for this role is $170,000 to $240,000. The actual base salary a successful candidate will be offered within this range will vary based on factors relevant to the position.
+ Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance.

**Relocation:** This job may be eligible for relocation benefits.

**Benefits:**

+ GM offers a variety of health and wellbeing benefit programs. Benefit options include medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts, and more.
\#GM-AV-1

This role is based remotely, but if the selected candidate lives within a specific mile radius of a GM hub, they will be expected to report to the location three times a week (or other frequency dictated by your manager). The selected candidate will be required to travel