Site Reliability Engineer - News, Stocks, and Weather

London, Greater London, United Kingdom
Software and Services

Summary

Posted:
Weekly Hours: 35
Role Number: 200400399
Do you love engineering and running systems and infrastructure that will delight millions of customers? Imagine what you could do here. At Apple, new ideas have a way of becoming extraordinary products, services, and customer experiences very quickly! Bring passion and dedication to your job and there’s no telling what you could accomplish. The Apple News team is looking for a Site Reliability Engineer with strong experience managing world-class production environments. You will use technology to automate solutions and optimize outcomes, focusing on application infrastructure in a fast-changing world of software delivery. We seek a passionate, high-energy member of a global Site Reliability Engineering team to continue our focus on providing our customers the highest quality Apple Services experience. Our applications include News, Stocks, and Weather, and they must scale globally, remain highly available, and “just work.” Here, as part of a dynamic group, you’ll have the rare and rewarding opportunity to help maintain world-class uptime of current and upcoming products that will inspire millions of Apple’s customers every day!

Key Qualifications

  • At least 3 to 5 years of experience in a Site Reliability Engineering, DevOps, or infrastructure-focused role
  • Experience supporting internet-facing production services and distributed systems
  • Ability to implement and coordinate telemetry using monitoring and observability tools such as Splunk, Grafana, and Prometheus
  • Hands-on experience with scripting and programming languages such as Bash, Python, Groovy, and Go
  • Experience building and operating container orchestration systems such as Kubernetes or EKS
  • Experience designing, building, and maintaining infrastructure with a cloud provider such as AWS
  • Automation advocate - you truly believe in removing operational load via software
  • A strong sense of ownership. At the same time, you’re a great teammate who communicates clearly and transparently
  • Self-motivated, inquisitive, and always looking to learn more
  • Experience with scale testing, disaster recovery, and capacity planning
  • Experience in deployment automation based on Terraform or CloudFormation
  • Working experience with systems built on open source storage and search technologies, including Cassandra, Kafka, Solr, Postgres, and Redis
  • Experience working with big data systems using Hadoop, Spark, and business intelligence software is a plus

Description

Our team is collaborative; we work closely with partner teams to deliver the best results for Apple. We strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded. As an SRE at Apple, you'll:

  • Operate, monitor, and triage all aspects of our production and non-production environments.
  • Pioneer and implement the next-generation telemetry system for News, Stocks, and Weather.
  • Prepare alert handling procedures and runbooks, and collaborate with the offshore SRE team.
  • Automate deployment and orchestration of services into the cloud environment, as well as other routine processes.
  • Actively participate in capacity planning, scale testing, and disaster recovery exercises.
  • Interact with and support partner teams including engineering, SRE, QA, and project management. Create self-service solutions for them.
  • Cultivate and maintain relationships with internal and external third-party vendors.

Education & Experience

BS/MS in Computer Science or equivalent is required.

Additional Requirements