SRE

Shanghai, Shanghai, China
Operations and Supply Chain

Summary

Role Number: 200118770
We are looking for an SRE who can help lead the next generation of products we create. Our infrastructure team is responsible for architecting, building, and scaling a distributed system that enables Apple to manufacture every product. We manage hundreds of bare-metal servers and thousands of client machines across 10+ data centers. You should strive to make life easier for everyone who uses these systems, including developers, technical support, on-site teams, and end users. This means automating everything from deployment workflows to CI/CD to monitoring and alerting systems.

Key Qualifications

  • 3 years of experience architecting and managing Linux server infrastructure across multiple data centers
  • Experience with Linux servers such as Ubuntu
  • Experience with configuration management tools such as Ansible
  • Experience deploying and managing observability tools such as Prometheus, Grafana and the ELK stack
  • Experience using CI/CD systems such as Jenkins
  • Experience managing database servers such as PostgreSQL, including replication across multiple data centers
  • Good communication skills in written and spoken English

Description

This is a rare opportunity to put your signature on how Apple manufactures everything. We need you to help take our system to the next level, working closely with the manufacturing design and mechanical engineering teams on new products. We don’t expect you to be a manufacturing expert, but we guarantee that within the first 6 months you will be. You’ll be working with the world’s best engineers to help them build the products we all want. Our current stacks are diverse and evolving combinations of old and new, closed and open source technologies. We are not looking for a solution for now; we are looking for the best solution for tomorrow. We are an ambitious team that takes smart risks and challenges everything, including each other. None of us is the best at everything, but each of us is the best at something. As we scale and evolve the supporting infrastructure for such diverse technologies, it becomes crucial to understand the entire stack to help investigate, log, monitor, optimize and expand our services.

Responsibilities:

  • Build and manage systems, infrastructure and applications through automation
  • Deploy, monitor and maintain our production systems in multiple data centers
  • Troubleshoot issues in production
  • Participate in a regular on-call rotation to support the infrastructure 24/7

Education & Experience

Bachelor’s degree in Computer Science or equivalent industry experience

Additional Requirements

  • Experience using Docker for production services
  • Experience managing bare-metal hardware (PXE boot, kickstart)
  • Experience with one or more of the following: Golang, Python, SQL, HTTP, TCP/IP