Site Reliability Engineer (SRE), Apple Pay (Japan)

Tokyo, Tokyo-to, Japan
Software and Services

Summary

Posted:
Weekly Hours: 37.5
Role Number: 200478703
At Apple, ideas have a way of becoming extraordinary products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish!

The Apple Pay Site Reliability Engineering Team is hiring an engineer to focus on the front-line customer experience and the back-end integration of Apple systems with our Network and Banking partners. Site Reliability Engineering (SRE) is an engineering field that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. This position requires a highly motivated individual who loves sophisticated challenges in a fast-paced environment. You will support payments services to ensure they are functional, work on improving reliability and availability, think creatively, and come up with innovative solutions to automate manual support activities.

As part of our "follow-the-sun" team, you’ll have peers based in Europe and the US, ensuring timely support for customers around the world while maintaining a healthy work-life balance. You’ll be joining a team of extraordinary engineers and interacting daily with teams that span across Apple, including iOS software and hardware engineering, Apple Online Store, Apple Retail, and our Network and Banking partners. Join us on this journey and help us leave the world better than we found it!

Key Qualifications

  • Demonstrated, extensive experience supporting mission-critical, customer-facing systems in production (required).
  • Strongly motivated to automate and continuously replace manual operations with efficient automated solutions.
  • Proven record of successfully handling sophisticated problems and consistently delivering effective solutions.
  • Proficient in event-based and log-based monitoring systems, deployment tools, and cloud computing.
  • Skilled in developing applications using object-oriented or functional programming languages, with a preference for Java, Python, or Go.
  • Exceptional communication and collaboration skills.

Description

  • Participate in a rotating on-call schedule, diligently overseeing, diagnosing, and resolving potential disruptions in critical production systems.
  • Develop strategies to enhance system reliability and observability, establish protocols for handling alerts, and create run-books to facilitate seamless collaboration with a globally distributed team across multiple time zones.
  • Streamline and optimize tools and processes by automating repetitive tasks. Introduce automation tools and best practices to minimize manual effort and enhance efficiency.
  • Collaborate with multi-functional teams, including development teams, to incorporate reliability considerations into system design and implementation. Collaborate with external partners to ensure high availability of integrated services for our customers.
  • Stay updated with the constantly evolving technology landscape, continuously expanding knowledge of the latest technologies and industry best practices, not only within the realm of system reliability but also in the industry we operate in.

Education & Experience

BS degree in computer science or equivalent experience

Additional Requirements

Nice to Have:

  • Proficient in container orchestration systems such as Kubernetes and experienced in utilizing Continuous Delivery platforms like Spinnaker.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Apple is committed to working with and providing reasonable accommodation to applicants with physical and mental disabilities.