Site Reliability Engineer (SRE), Apple Pay - Japan
The people here at Apple don’t just craft products; they build the kind of wonder that’s revolutionized entire industries! We believe that it’s the diversity of those people and their ideas that encourages the innovation that runs through everything we do, from amazing technology to industry-leading privacy and environmental efforts.
We’re looking for talented Site Reliability Engineers to join the Wallet, Payments & Commerce (WPC) Engineering Operations team.
The WPC organization is responsible for everything in Apple Wallet and all things Payments at Apple. We build the rails on which all payments at Apple run, including for the Apple Online Store, the Apple Retail Stores around the world, and the Apple App Store. We also build the rails that make Tap to Pay on iPhone possible, by allowing an iPhone to become a payment terminal and by helping Tap to Pay on iPhone partners process payments from iPhones.
Join us on this journey and help us leave the world better than we found it!
As an SRE in WPC, you'll need to solve problems using data, teamwork, and your own expertise. You will own the full stack, and our responsibilities are both broad and deep. We run a mix of open source and internally developed tools for system and configuration management, provisioning, software deployment, and monitoring. You'll learn these tools and have opportunities to improve them.
Our team is collaborative; we work closely with the development teams we support to deliver the best results for Apple. We think critically and strive to balance the best solution with the need to get things done for each engineering challenge we face. Good ideas are heard and results are rewarded.
Responsibilities include:
- Deploy, support and monitor new and existing services, platforms, and application stacks
- Enhance, architect, author, and deliver software to improve the availability, scalability and security of Wallet services
- Build and manage systems, infrastructure and applications through automation
- Participate in periodic on-call duties
Qualifications:
- Experience in building and scaling distributed systems in a public, private, or hybrid cloud environment
- Understanding of core SRE concepts: monitoring, alerting, and incident management
- Experience with deploying, supporting and monitoring new and existing services, platforms, and application stacks
- Proven track record of writing programs in a high-level programming language such as Java, Go, or Python
- Understanding of AWS and Kubernetes concepts
- Understanding of operating system concepts (process scheduling, disk and network I/O, performance)
- Strong sense of ownership, customer service, and integrity demonstrated through clear communication
- Passion for eliminating repetitive manual processes through automation and improving them through repeated iteration
- Excellent troubleshooting and problem-solving skills
- Experience with scalability, reliability, availability and disaster recovery
- Inclination toward efficient programming, with an emphasis on improvement through complexity analysis
- Understanding of open source technologies such as OpenTelemetry, Envoy, Istio, and Kubernetes