
Sr Manager, Platform DevOps
T-MOBILE USA, Inc., Bellevue, WA, United States
Job Overview
This role leads globally distributed DevOps/SRE teams across the US and India, with end-to-end accountability for workforce planning, team performance, and the hiring, development, and retention of a high-performing organization.
It oversees the reliability, scalability, and cost efficiency of production and non-production environments across AWS and Azure, applying expertise in capacity planning, traffic management, and cloud optimization.
Leading teams of 20+ engineers and contractors, the role drives platform delivery, technical and security enhancements, and multi-functional collaboration.
Success is measured by platform reliability, timely delivery of capabilities, team growth, and the overall impact on organizational performance and customer experience.
Job Responsibilities
Lead and manage globally distributed DevOps/SRE teams (US and India), ensuring effective workforce planning, shift and availability management, performance development, mentorship, and continuous skill growth aligned with organizational needs.
Own the security and vulnerability management lifecycle, ensuring timely remediation, cloud posture hardening, secure configuration management, and alignment with enterprise security, governance, and risk controls.
Lead implementation of observability platforms across monitoring, logging, tracing, and alerting; develop dashboards and insights to proactively identify failures, bottlenecks, and performance deviations.
Define and implement continuous improvement practices across technical fields and organizational processes.
Drive SRE frameworks, including SLA/SLI/SLO definitions, reliability measurement, error-budget policies, and adoption of standards that improve operational excellence.
Provide end-to-end ownership of incident management, including response coordination, root-cause analysis (RCA), post-incident reviews, and implementation of corrective actions to strengthen system resilience.
Oversee technical vendor relationships to incorporate feature and function requests into product releases.
Drive and maintain the current and future technical roadmap in collaboration with design and architecture teams.
Collaborate with product, architecture, quality, and security organizations to align technical priorities and delivery objectives; drive execution of a long-term platform engineering roadmap covering modernization, automation, migrations, and innovation initiatives.
Recruit and hire qualified managers and team members to strengthen the platforms and the support model.
Education and Work Experience
Bachelor's Degree plus 7 years of related work experience OR Advanced degree with 5 years of related experience. Acceptable areas of study include Computer Science, Engineering, IT or equivalent experience. (Required)
7-10 years Relevant Product Management experience in an agile software product development environment. (Required)
2-4 years Experience in a leadership role. (Required)
7-10 years Technical Leadership: Strong command of cloud infrastructure (AWS & Azure), CI/CD systems, GitLab administration, IaC tools (Terraform/CloudFormation/Bicep), automation, and modern DevOps/SRE methodologies. (Preferred)
2-4 years Experience managing teams of 5 or more resources in direct reporting relationships in a Platform Management organization. (Preferred)
Knowledge, Skills and Abilities
Strong understanding of Software Development Life Cycle (SDLC) and Agile methodologies
Experience delivering complex technology initiatives across engineering and operations
Expertise in vulnerability management, cloud security procedures, secure SDLC, compliance frameworks, and regulatory alignment
Knowledge of observability concepts including monitoring, logging, and alerting
Understanding of SLAs, SLOs, and service performance management
Ability to collaborate with multi-functional partners and influence technical decisions
Strong written and verbal communication skills with the ability to convey technical concepts clearly
Analytical skills to assess system performance, operational metrics, and improvement opportunities
Licenses and Certifications
Cloud certifications (AWS or Azure)
Kubernetes or related containerization certifications
Other Eligibility Requirements
At least 18 years of age
Legally authorized to work in the United States
Travel
Travel Required (Yes/No):
DOT Regulated Position (Yes/No): No
Safety Sensitive Position (Yes/No): No
Base Pay Range: $160,000 - $288,500
Corporate Bonus Target: 20%
T-Mobile USA, Inc. is an Equal Opportunity Employer. All decisions concerning the employment relationship will be made without regard to age, race, ethnicity, color, religion, creed, sex, sexual orientation, gender identity or expression, national origin, religious affiliation, marital status, citizenship status, veteran status, the presence of any physical or mental disability, or any other status or characteristic protected by federal, state, or local law. Discrimination, retaliation or harassment based upon any of these factors is wholly inconsistent with how we do business and will not be tolerated.