Site Reliability Engineer, Ad Platforms

Austin, Texas, United States
Software and Services


Weekly Hours: 40
Role Number: 200231269
At Apple, we work every day to create products that enrich people’s lives. Our Advertising Platforms group makes it possible for people around the world to easily access informative and imaginative content on their devices while helping publishers and developers promote and monetize their work. Today, our technology and services power advertising in Search Ads in the App Store and Apple News. Our platforms are highly performant, deployed at scale, and setting new standards for enabling effective advertising while protecting user privacy.

The Ad Platforms team is seeking a Site Reliability Engineer for an extraordinary opportunity. Our mission is to enable Ad Platforms to deliver advertisements in a reliable and scalable way that results in awesome user experiences. We achieve this mission through automation, sound processes, and education for our partner teams.

Key Qualifications

  • At least 5 years in a Reliability Engineering, DevOps, or infrastructure-focused role.
  • Passion for designing and building reliable systems.
  • Advanced experience with programming languages (GoLang, Python, Java, C++).
  • Experience designing, building and operating solutions built on top of AWS.
  • Experience building and operating container-based platforms like EKS.
  • Experience in deployment automation based on Terraform.
  • Automation advocate - you truly believe in removing operational load with software.
  • Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks.
  • Confirmed experience supporting internet-facing production services and distributed systems.
  • Demonstrated problem-solving ability that combines creative, innovative thinking with a strong sense of ownership, customer service, and integrity, shown through clear communication.
  • Self-motivated and eager to learn.


We are a diverse, global, agile engineering team that moves smart and fast by consuming and optimizing readily available technology, collaborating to improve and scale capabilities across businesses and use cases, and sharing our own innovative solutions so everyone can benefit. We are not constrained by organization structure and offer flexibility to work on a variety of full-stack and backend systems, and we have fun doing it!

IN THIS ROLE, YOU WILL:

  • Design, build and operate our Serving Infrastructure.
  • Design and implement reusable infrastructure resiliency patterns.
  • Implement and improve our monitoring and observability capabilities to improve our reliability.
  • Enhance our deployment infrastructure so that we can safely deploy changes.
  • Design and develop tools that aid in improving the reliability of our infrastructure.
  • Engage with engineering teams to improve on-call efficiencies and drive incident management and post-mortem analysis.
  • Develop expertise in Apple Infrastructure and best practices, and bring that to ad-platforms to run world-class distributed systems.
  • Improve areas like capacity planning, configuration management and monitoring.
  • Design and improve architectures of new and existing systems based on the principles of reliability and high availability, with extensive logging and observability.
  • Own the reliability of ad-platforms.

Education & Experience

Bachelor’s degree in Computer Science or equivalent industry experience

Additional Requirements