Senior Service Reliability Engineer
Santa Clara Valley (Cupertino), California, United States
Software and Services
Imagine what you could do here. At Apple, new ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Come help us build and scale the next generation of software-defined infrastructure (SDI) for one of the largest private platforms in the world. This Service Reliability Engineer (SRE) position requires a mix of strategic engineering and design along with full-stack technical work. In this role you will configure, tune, and solve problems on massive-scale distributed systems to achieve optimal performance, stability, and availability. You will work closely with the system, network, data, software, quality, performance, security, integration, and platform engineering teams. This role is for someone who loves to analyze and solve a broad spectrum of problems while continuing to drive innovation and results. You will join an existing, elite team dedicated to supporting and designing Apple’s infrastructure as a service.
- 10+ years of handling and maintaining critical services in a large-scale *nix environment
- Experience with platform, reliability, and automation tools, processes, and culture
- A systematic test-driven approach to continually improving service scale and reliability
- Solid understanding of standard networking protocols and components
- Demonstrated experience with configuration management and orchestration tools
- Practical knowledge of shell scripting and at least one programming language (e.g. Perl, Python, Ruby, Golang)
- Experience with monitoring, logging, and other observability tools such as Nagios, Sensu, Prometheus, Elasticsearch, or Splunk
- Extensive experience supporting production services and massive-scale distributed systems
- Demonstrated problem-solving ability using practical, creative, and innovative thinking
- Good sense of ownership, customer service, and integrity demonstrated through clear communication
- If you are a self-motivated teammate who thrives in a dynamic, constantly changing environment and is passionate about building solutions and learning new technologies, this is the job for you. If you are a smart, creative, forward-thinking software engineer who’s always searching for a better way, we’d love to talk to you.
As a Service Reliability Engineer, you will be highly self-motivated with a passion for excellence, quality, and detail. The service reliability engineer will not only support operations and infrastructure, but also work closely with the development teams and other partners within Apple to aid in the architectural design and implementation of complex features. Our reliability engineering team develops and deploys software that forms the foundation for all of Apple’s services and provides software-defined infrastructure that ensures Apple's services are reliable, scalable, fast, and secure. Responsibilities of the SRE include the following:
- Determine and implement optimal full-stack configurations for scale and reliability
- Deploy, support, and monitor new platforms and application stacks
- Set priorities and work efficiently in a fast-paced environment
- Continuously measure and optimize system performance for capacity and scale
- Explore and evaluate new technologies and solutions in order to continually improve
- Communicate and work effectively across multiple business and technical teams
- Deliver results on time and to the highest quality standards
Education & Experience
- BS in engineering, computer science, or another technical discipline preferred, plus ten years of related experience
- Should be a great teammate who is excited to work with team members from highly diverse technical backgrounds, with excellent communication skills and the self-motivation to pursue goals.