
Sr TechOps Lead Engineer (AWS Cloud) - REMOTE
Simple Solutions, New York, NY, United States
Job Description
Sr TechOps & SRE Lead Engineer (AWS Cloud)
Department: Technology / Engineering
Role Overview
We are seeking a highly experienced Senior TechOps & SRE Lead Engineer with deep AWS cloud expertise to lead our cloud infrastructure, DevOps practices, reliability engineering, and operational excellence initiatives. This role is both strategic and hands-on: you will design scalable architectures, improve automation, ensure system reliability, and lead the TechOps team.
Key Responsibilities
Architect and manage secure, scalable, and highly available infrastructure on AWS.
Design multi-account AWS environments using AWS Organizations.
Implement VPC architecture, IAM policies, networking, and security best practices.
Oversee EC2, ECS/EKS, Lambda, RDS, S3, CloudFront, and related AWS services.
Optimize AWS cost management and resource utilization.
Site Reliability & Production Operations
Implement Site Reliability Engineering (SRE) best practices.
Define SLIs, SLOs, and error budgets.
Manage monitoring and alerting (CloudWatch, Datadog, Prometheus, Grafana).
Lead incident response, root cause analysis (RCA), and postmortems.
Ensure 24/7 uptime and operational resilience.
Security & Compliance
Implement IAM best practices and least-privilege access controls.
Manage secrets and key management (AWS KMS, Secrets Manager).
Conduct vulnerability management and patching.
Support compliance initiatives (SOC 2, ISO 27001, GDPR as applicable).
Lead disaster recovery planning and backup strategies.
Leadership & Strategy
Lead and mentor a team of DevOps/TechOps engineers.
Establish operational KPIs and performance benchmarks.
Manage on-call rotations and escalation processes.
Collaborate with Engineering, Product, Security, and Data teams.
Contribute to long-term infrastructure strategy and cloud roadmap.
Required Qualifications
Bachelor's degree in Computer Science, Engineering, or equivalent experience.
At least 12 years of experience in DevOps, Cloud Engineering, SRE, and/or Infrastructure roles.
5+ years leading SRE technical teams.
Strong hands-on experience with AWS services (EC2, EKS, RDS, S3, IAM, VPC, Lambda).
Deep knowledge of networking, Linux systems, and distributed systems.
Experience with Infrastructure-as-Code (Terraform or CloudFormation).
Strong scripting skills (Python, Bash, or similar).
Experience with containerization (Docker) and Kubernetes (EKS preferred).
Key Competencies
Strong architectural thinking
Hands-on technical leadership
Crisis and incident management
Strategic planning and execution
Excellent cross-functional communication
Success Metrics
99.9%+ production uptime
Reduced deployment lead time
Reduced incident frequency and MTTR
Improved cost efficiency
High-performing and scalable TechOps function