Software Engineer (Site Reliability), Retail Engineering
Austin, Texas, United States
Software and Services
Carrier Services offers seamless integration of Apple Retail Stores and the Apple Online Store with major US carriers for iPhone activations.
We are looking for a talented Site Reliability Engineer to join our growing team.
As an SRE, you will be responsible for ensuring the reliability, scalability, and performance of our systems and services. You will work closely with our engineering and operations teams to design, build, and maintain robust infrastructure and automation solutions.
If you are an SRE who thrives in a dynamic environment and can make a meaningful impact through your technical expertise and dedication to excellence, come join our team as a Site Reliability Engineer (SRE).
Description
This role demands extensive hands-on experience working as an SRE for large-scale, customer-facing Cloud applications. The candidate should have a good understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and other common reliability engineering concepts. The candidate should also have excellent troubleshooting and problem-solving skills.
The candidate will be expected to represent the SRE organization in design reviews and operational readiness exercises for new and existing services. They will also be required to collaborate with technical and non-technical teams and to analyze statistics to build a clear picture of the current state of our systems. A good working knowledge of Oracle and Cassandra databases will be beneficial in this regard.
The candidate should have a passion for automating manual operations and improving them through repeated iteration. They should have a good understanding of networking and load balancing concepts, and should be able to lead a small team and come up with innovative solutions. They should be self-motivated, capable of making business-critical decisions, and comfortable working in a dynamic, ever-changing environment. The candidate should be proactive in dealing with critical production issues and should take them to closure while working with the required partners.
The candidate will participate in an on-call rotation, providing hands-on technical expertise during service-impacting events.
Minimum Qualifications
- 5 years of hands-on experience as an SRE supporting large-scale microservices applications.
- 5 years of experience deploying, supporting, and monitoring Cloud services in a large-scale, customer-facing environment.
- 5 years of hands-on experience developing Java-based applications.
- 5 years of hands-on experience building complex queries and dashboards using Splunk.
- 5 years of promoting observability of systems for monitoring, alerting, and metrics reporting using Datadog, Prometheus, and similar tools.
- 5 years of proficiency with at least one scripting language, such as Python.
- 5 years of hands-on experience working with Kubernetes, Docker, and containerization.
- Proven track record of eliminating repetitive manual processes using automation.
- 5 years of experience working on maintenance tasks for Oracle and Cassandra databases.
Preferred Qualifications
- BS in Computer Science or equivalent work experience is preferred.
- Strong problem-solving, software development, and debugging skills.
- Proven track record of taking ownership and successfully delivering results.
- Comfortable working in a fast-paced, dynamic environment.
- 5 years of experience leading small teams (3-4 members) to design and develop scalable SRE solutions while working with other teams.
- Fluency in Japanese is a plus!
Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.