
Senior Manager, DevOps & SRE – Platform Reliability & Global Operations
Qcells North America, San Francisco, CA, United States
Position Description
The Senior Manager, DevOps & SRE – Platform Reliability & Global Operations is a senior technical leader responsible for the reliability, scalability, security, and operational excellence of a complex, multi‑platform ecosystem spanning applications, workflows, event streaming, and data platforms.
Location & Work Arrangement
Candidates must be able to work primarily within Pacific or Central Time Zone business hours to support collaboration with global teams.
Employees located within 50 miles of a Qcells office (e.g., Irvine, San Francisco, Houston, or South Carolina locations) are expected to follow the company’s hybrid work policy of at least three in‑office days per week.
Responsibilities
Lead and scale a global, multi‑tier (L1/L2/L3) DevOps and SRE organization
Design and operate follow‑the‑sun on‑call and support models
Own incident management, including Sev‑1/Sev‑2 incident command and executive communication
Define and operate SLOs, SLIs, and error budgets across apps, workflows, events, and data pipelines
Oversee DevOps practices for CI/CD, Kubernetes, IaC, automation, and cost optimization
Ensure reliable operation of event‑driven and telemetry pipelines
Govern and manage third‑party DevOps and SRE vendors, including SLAs and escalations
Drive operational maturity: post‑mortems, automation, reliability improvements
Partner with security on secure operations, incident response, and compliance readiness
Platforms in Scope
Application Platforms: Kubernetes, containerized applications, EMS telemetry & control
Workflow Orchestration: Fleet Manager, Power Automate, cross‑system workflows
Event & Streaming: Azure Event Hubs, event streams, Kafka, RabbitMQ
Data & Telemetry: Microsoft Fabric, Kusto, PostgreSQL, TimescaleDB, Cassandra
CI/CD & Infrastructure: GitHub Actions, Jenkins, Terraform, Helm, Ansible (Azure & AWS)
IAM across Azure and AWS
Experience with Salesforce and Snowflake preferred
Technical Strengths
Kubernetes and container platforms in production
Azure (required), AWS
Event streaming and messaging systems
Data pipelines and telemetry platforms
Power Pages, Power Automate
CI/CD, Infrastructure as Code, and automation
Observability and incident troubleshooting at scale
Operational Expectations
Escalation management for on‑call and major incidents
Willingness to work off‑hours when required
Comfortable making high‑impact decisions under pressure
Required Qualifications
15+ years in DevOps, SRE, Platform Engineering, or Production Operations
5+ years leading globally distributed engineering teams
Proven ownership of 24x7, mission‑critical production platforms
Strong experience managing third‑party vendors/managed service providers
Deep hands‑on experience with Kubernetes, cloud platforms, and event‑driven systems
Preferred Qualifications
Solar (renewable energy) industry experience
Use of AI Tools
Qcells expects team members to leverage AI models and AI‑assisted tools in their daily workflows where appropriate. Candidates should be comfortable working in an AI‑augmented environment and applying sound judgment when using AI‑generated outputs. During the interview process, candidates will be asked to share examples of how they have used AI tools or models in their work.
Salary Range
The salary range is disclosed as required by the California Pay Transparency Act and may differ for candidates hired in other locations nationwide. Actual compensation is influenced by a wide array of factors including, but not limited to, skill set, education, licenses and certifications, essential job duties and requirements, and the necessary experience relative to the job’s minimum qualifications.
This target salary range is for CA positions only and should not be interpreted as an offer of compensation.