Site Reliability Engineer, Ad Platforms

Santa Clara Valley (Cupertino), California, United States
Software and Services


Weekly Hours: 40
Role Number: 200084887
At Apple, we work every day to create products that enrich people’s lives. Our Advertising Platforms group makes it possible for people around the world to easily access informative and imaginative content on their devices while helping publishers and developers promote and monetize their work. Today, our technology and services power advertising in Search Ads in the App Store and in Apple News. Our platforms are highly performant, deployed at scale, and set new standards for effective advertising that protects user privacy. The Ad Platforms team is seeking a Site Reliability Engineer. Our mission is to enable Ad Platforms to deliver advertisements in a reliable and scalable way that results in awesome user experiences. We pursue this mission through automation, sound processes, and education for our partner teams.

Key Qualifications

  • Excellent experience supporting internet-facing production services and distributed systems.
  • Good programming skills in at least one of C, Java, Python, or Go.
  • Expertise in operating Linux-based systems, with a solid understanding of Linux internals.
  • Demonstrated problem-solving ability that combines creative and innovative thinking with a strong sense of ownership, customer service, and integrity, shown through clear communication.
  • Self-motivated and eager to learn.
  • Experience building and managing container orchestration platforms like Nomad or Kubernetes.
  • Bonus: experience running infrastructure on public clouds like AWS or GCP.


You’ll be part of the team delivering hosting infrastructure for Ad Platforms, helping to ensure that Ad Platforms can continue to scale and grow. With global deployments and fast growth, we need solutions that deliver new capabilities for Apple’s customers. You will work to ensure high-availability and high-resiliency patterns for application owners to build on, solve operational problems with our infrastructure, and help drive the continued evolution of our hosting infrastructure.

Your duties will include:

- Design and develop tools that aid in improving the reliability of our infrastructure.
- Engage with engineering teams to improve on-call efficiency and drive incident management and post-mortem analysis.
- Develop expertise in Apple Infrastructure and its best practices, and bring that expertise to Ad Platforms to run world-class distributed systems.
- Improve areas like capacity planning, configuration management, and monitoring.
- Design and improve architectures of new and existing systems based on the principles of reliability and high availability, with extensive logging and observability.
- Own the reliability of Ad Platforms.
- Design our next-generation container platforms to run ad services.
- Create tooling to improve the observability of ad systems.
- Create frameworks that enable engineers to interact with Apple Infrastructure.
- Create robust deployment and delivery pipelines.
- Create systems that provide black-box testing capabilities for ad delivery.

Education & Experience

- Bachelor's degree in a Computer Science/Engineering discipline, or equivalent. Master's degree preferred.

Additional Requirements