To improve the reliability of a global platform, the Senior Staff Production Engineer (a hybrid role) will design and implement scalable infrastructure across multiple cloud environments, drive an "automation-first" culture, and mature observability standards in collaboration with engineering teams.
Key responsibilities
Design and implement highly available, scalable infrastructure across AWS, Azure, GCP, and bare-metal environments
Drive an "automation-first" culture by writing code to eliminate manual toil and build self-healing systems
Act as a lead Incident Commander, developing response playbooks and conducting post-incident analyses
Required qualifications
8+ years of experience managing reliability, scalability, and availability for large-scale production services
Deep expertise in programming languages such as Python, Go, or C/C++
Strong background in networking protocols, Linux/FreeBSD systems, and distributed architecture
Experience in high-stakes incident management and participation in a 24/7 on-call rotation
Proficiency in applying ITIL frameworks and incident data to improve service maturity

Senior Staff Production Engineer
Virtual Vocations Inc · New York, NY, USA
Job type: Full Time