Site Reliability Engineer / DevOps
Santa Clara Valley (Cupertino), California, United States
Software and Services
The successful candidate will enjoy using technology to automate solutions and optimize outcomes, implementing continuous integration and deployment in a meaningful and fast-paced environment.
Key Qualifications
- Expert knowledge of and experience with software version control systems: SVN, Git, etc. (Git and GitHub/GitLab knowledge is a plus).
- Knowledge of Java build systems and tools including: Maven, Gradle, Ant, SBT, etc.
- Strong operational experience in Linux/Unix environments and with scripting languages: Shell, Perl, Python.
- Experience maintaining automated build systems such as Jenkins.
- Experience working with server clusters consisting of 100s-1000s of machines, and deploying changes with zero downtime.
- A desire to write tools and applications to automate work rather than do everything by hand.
- Familiarity with Splunk for investigating or monitoring problems on systems.
- Experience managing and integrating test automation into various points in a deployment pipeline.
- Experience with Java test frameworks such as JUnit.
- Experience implementing Java server applications using tools such as: Jersey, Jetty.
- Knowledge of web servers and load balancers such as Apache HTTP Server, Apache Traffic Server, Nginx, and HAProxy.
- Experience maintaining large clusters using configuration tools such as: Ansible, Puppet, Chef, Salt, etc.
- Solid experience in troubleshooting, debugging, and performance measurement.
- Knowledge of virtualization technologies like VMware Fusion, VMware Workstation, VMware ESXi, Vagrant, Docker.
- Self-motivated, proactive, and solution-oriented individual.
Analyze technology options and feasibility, and define the build, delivery, and deployment pipeline for applications. Provide leadership in implementing a secure, robust, and highly available DevOps pipeline. Automate build and deployment processes. Work closely with engineers, QA, and project managers throughout the software lifecycle to deliver best-in-class, large-scale systems. Implement push-button deployment at scale with zero downtime. May work on migration to a cloud platform. Manage and operate production environments.
Education & Experience
BS degree in computer science or an equivalent field with 5+ years of experience, or MS degree with 3+ years of experience, or equivalent.
- Cloud certification and/or experience.
- Proficiency in Ansible (other configuration management tools may count, but Ansible is preferred).
- Proficiency in Docker and orchestration tools.
- Proficiency in Unix/Linux management and troubleshooting.
- Intermediate skills in scripting and programming (focused on Shell and Python).
- Knowledge of Java and Node applications is helpful for troubleshooting.
- Good oral/written communication skills.
- Manage production environments.