Sr. Site Reliability Engineer - Strategic Data Solutions
Sacramento, California, United States
Operations and Supply Chain
Do you want to be part of a group critical to the success of Apple? Are you a Senior Site Reliability Engineer who is passionate about solving hard problems, owning the entire solution and leveraging cutting-edge technologies to enable business operations? Do you enjoy creating automation to eliminate toil? Do you excel under pressure? Can you summarize highly complex problems so that others can help you solve them? Do you have rock-solid integrity, and are you the team member people trust and count on? Does everyone turn to you to brainstorm solutions? Do you like gathering evidence to base your decisions on, but can use your gut, intuition and experience to make quick decisions when necessary? If you smile in the face of pressure and can work independently but are also a great team player, we're looking for you!
In this position you'll shape the next generation of big data solutions by working with bleeding-edge technologies for the Strategic Data Solutions (SDS) team. SDS is looking for exceptional engineers to help run, optimize and scale our environment to the next level. Be a member of the team that is responsible for the data collection and reporting for all of Apple’s products around the world. You will operate and scale systems that every iPhone, iPad and Mac has interacted with. Apple’s engineering and operations teams will use your systems to build the next insanely great product.
This position is based in Elk Grove, CA.
- Have a passion for Site Reliability Engineering and a flexible, creative approach to problem solving.
- 5+ years of hands-on experience with one or more programming languages: Java, Python, Node, Go or Ruby
- 3+ years of hands-on experience with container orchestration systems such as Kubernetes (preferred), DC/OS or Mesos
- Experience with at least one of these monitoring systems: AppDynamics, Grafana, Kibana, Prometheus, InfluxDB
- Experience with build automation, source control and CI/CD tools (GitHub, Artifactory, Jenkins, Spinnaker, etc.)
- Linux configuration, deployment and troubleshooting
- Excellent problem solving and programming skills; proven technical leadership and communication skills
- Cloud infrastructure as code experience, e.g., Terraform, CloudFormation
- Experience with Open API and Microservice architecture
- Experience with configuration management tools such as Ansible, Chef, Puppet or Salt
- Ability to learn new technologies quickly
- Experience with Kafka, Elastic, Druid or object storage is a strong plus
- Flexibility for travel and work schedules
• Work cross-functionally across multiple teams to solve challenging big data engineering problems across a broad range of Apple manufacturing services
• Work with a very large-scale, highly available big data platform supporting multiple petabytes of data with super-linear growth
• Apply a “build-to-manage”, problem-solving and innovative mindset to the design, build, test, deployment, change and maintenance of enterprise-class applications, drawing on deep engineering expertise
• Measure success against platform stability, effective integration and delivery, instrumentation, release quality, technical debt (toil) reduction, development of automation, risk/security compliance, and sustained advancement of the SRE practice
Education & Experience
• BS or MS in Computer Science preferred; equivalent work experience will be considered.