Senior Site Reliability Engineer - Apple Services Engineering (ASE)

Santa Clara Valley (Cupertino), California, United States
Software and Services

Summary

Posted:
Weekly Hours: 40
Role Number: 200536022
People at Apple don’t just build products; they craft experiences our customers love and depend on. Apple Services Engineering (ASE) builds and supports the systems that make many of these daily experiences possible. If you’ve used Apple products, you’ve likely interacted with us. Apple Services Site Reliability Engineering (SRE) teams are responsible for the systems and services that directly support those customers and their experiences. We are looking for an SRE with experience in building and supporting highly available customer-facing services. You will apply SRE best practices to ensure the availability, reliability, and performance of our systems and services. Does this sound like you?

Key Qualifications

  • 5+ years in an Infrastructure Ops, Site Reliability Engineering, or DevOps-focused role.
  • Knowledge of Linux operating system principles, networking fundamentals, and systems management.
  • Proven proficiency in at least one of the following languages: Java, Python, or Go.
  • Experience in leading and scaling distributed systems in a public, private, or hybrid cloud environment.
  • Familiarity with microservices architecture and container orchestration with Kubernetes.
  • Awareness of key security principles, including encryption and keys (types and exchange protocols).
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
  • Strong sense of ownership, with a desire to communicate and collaborate with other engineers and teams.
  • Ability to identify and communicate technical and architectural problems while working with partners and their teams to iteratively find solutions.

Description

  • Engage with our product teams to understand requirements, and design and implement resilient and scalable infrastructure solutions.
  • Operate, monitor, and triage all aspects of our production and non-production environments.
  • Collaborate on code, infrastructure, design reviews, and process enhancements.
  • Evaluate and integrate new technologies to improve system reliability, security, and performance.
  • Develop and implement automation to provision, configure, deploy, and monitor Apple services.
  • Participate in an on-call rotation, providing hands-on technical expertise during service-impacting events.
  • Contribute to capacity planning, scale testing, and disaster recovery exercises.
  • Approach operational problems with a software engineering mindset.

Education & Experience

BS degree in computer science or an equivalent field, with 5+ years of experience.

Additional Requirements

Pay & Benefits