Software Engineering Manager, Triage Services and Infrastructure

Apple, Cupertino, CA, United States


Role Number: 200648961-0836

Summary

The Core OS team is seeking an exceptional engineering manager to lead the team responsible for enabling Apple's operating systems to achieve world-class reliability. This team develops and owns mission-critical tools and services that detect, analyze, and classify kernel panics and low-level crashes across all Apple platforms. You will partner with engineering teams across the Software, Hardware, and Silicon groups to deliver rock-solid OS reliability for over 2 billion currently active Apple devices and to shape the future of system reliability across Apple's entire product ecosystem.

Description

Lead a team of engineers triaging kernel panics and critical system-level issues across all Apple platforms (macOS, iOS, watchOS, tvOS).

Build intelligent automation pipelines that analyze, group, and prioritize failure signatures based on their reliability impact.

Mentor engineers in designing and developing advanced system diagnostics and at-scale debug services, working toward the vision of zero-iteration debugging and fully automated triage and root cause analysis.

Develop telemetry-based dashboards that monitor at-scale panic/crash triage and analysis services to ensure they are running correctly and efficiently.

Collaborate with Core OS, Hardware, Silicon, and other engineering teams to champion and advance improvements in debuggability, panic data quality, symbolication, and automation of triage and debug workflows.

Minimum Qualifications

Demonstrated track record of building and scaling high-performing engineering teams

Passion for solving challenging technical problems that directly impact millions of users

Strong communication skills with ability to influence technical direction across organizational boundaries

Experience managing complex, multi-platform technical initiatives with measurable reliability improvements

Strong technical depth in operating system internals will be helpful

BS/MS in Computer Science, Computer Engineering, Electrical Engineering, or equivalent experience

Preferred Qualifications

Experience applying AI/ML to automated triage and reliability services

Experience with large-scale telemetry systems processing millions of events daily

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).