Site Reliability Engineer (SRE)
Austin, Texas, United States
Software and Services
Device Services is the backbone for the sale and activation of devices including iPhone, iPad, Apple Watch, and Apple TV. Device Services also protects Apple devices when they are lost and is driving the eSIM revolution. With new smart HomeKit-enabled devices, working with different partners on their onboarding and activation needs is another crucial project handled by the team. A successful candidate will have experience as a Full Stack Developer who has supported their own applications operationally. The SRE will identify and develop solutions for advanced, robust monitoring and reduce manual effort through automation.
- Full-stack development: JDK 8+ preferred, with Spring Boot, REST APIs, multithreaded and multiprocess applications, and Angular
- Experience diagnosing and resolving problems in high-throughput applications, along with excellent communication and documentation skills.
- Experience with one or more APM and log-monitoring tools such as AppDynamics, Splunk, or Nagios
- Exposure to *nix environments, including some shell script development and basic command execution
- Strong understanding of database principles and working knowledge of distributed storage and infrastructure solutions, preferably Oracle and Cassandra.
The successful candidate will be highly self-motivated, with a passion for excellence, quality, and detail. The SRE will also work with various support, development, and QE teams to understand the gaps in existing product architecture, monitoring, and verification processes, and will bridge those gaps by building solutions or changing processes. Lastly, the candidate will have a proven track record of debugging and root-causing issues, with an instinct to automate repetitive tasks.
- Passion for quality and automation, an ability to understand complex systems, and a desire to constantly make things better
- Constantly develop and improve the existing solutions used by the team to increase throughput and reduce manual work
- Develop and maintain scripts and products used for environment monitoring and task automation (Java, Perl, Shell, Python, Ruby, etc.)
- Deploy, support, and monitor new platforms and application stacks
- Set priorities and work efficiently in a fast-paced environment
- Explore and evaluate new technologies and solutions to push capabilities forward, get ahead of customers’ needs, innovate, and continually improve
- Strong interpersonal skills and the ability to work effectively across multiple business and technical teams
- Demonstrated ability to deliver high-quality results on time
Education & Experience
BS in engineering, computer science, or another technical discipline, plus 3 years of related experience
- Understanding of security standards, policies, and cryptography.
- Network troubleshooting of routing policies, proxies, firewalls, and load balancer configurations
- Experience with container management (such as Docker) and microservices architectures in cloud and on-premises infrastructure.
- Good understanding of incident and problem management