Site Reliability Engineer (SRE), Shazam
San Diego, California, United States
Software and Services
Hundreds of millions of users. Billions of Shazams. Countless moments of discovery. Shazam brings a unique brand of magic to millions every day. Bring us your vision, and it’ll be you creating the wow moments that excite people across the world! We’re looking for a strong engineer to join our team to lead advancements to the next level of reliability, scalability, and performance for the core services that Shazam provides to its users. You’ll work alongside development teams to continue to evangelize best practices and improve the systems that power Shazam.
- 3+ years of experience in a DevOps or SRE role. Bonus: Experience designing and operating reliable distributed systems at scale in a cloud environment.
- 3+ years of experience with modern web-scale systems. Bonus: Microservices.
- Demonstrated ability to work cross-functionally to define and implement best practices and standards; leading through influence.
- Experience with at least one cloud provider, with preference for GCP.
- Strong understanding of one or more of the following programming languages: Golang, Java, C++, or Python.
- Comfortable digging into source code (from the Linux Kernel to in-house developed software) to chase bugs, understand behaviour, etc.
- Experience with systems automation and configuration management tools as well as infrastructure as code (Kubernetes tooling in Go, Python, etc.; Terraform. Bonus: Puppet, Chef, Ansible, etc.)
- Experience operating Kubernetes clusters in production and deploying software, monitoring stacks, etc. Knowledge of managing containerised services and how they interact with network & system resources.
- Strong understanding of CI/CD processes, technologies and methodologies.
- Strong understanding of core Linux/UNIX operating system fundamentals and of the TCP/IP network stack and related technologies.
- Familiarity with monitoring systems based on TSDBs (Graphite, Prometheus, Thanos, etc.) and front ends such as Grafana.
Shazam Site Reliability Engineers are not just responsible for making sure all of the services and systems that Shazam relies on are operating at their highest level; they’re also responsible for helping development teams embrace these principles as they develop software. Shazam SREs embed themselves with development teams and act as extensions of those teams to propagate best practices. They think about distributed systems as a whole and give development teams that are focused on individual parts of the system a bigger picture than they would otherwise have. We believe that software engineers who own their code in production will write much more scalable, supportable systems. Shazam SREs exist to help those teams build the competencies to do that.
This role sits in our San Diego office, reporting to our Head of SRE in London. The successful candidate will work locally in San Diego, assisting multiple development teams based in San Diego and London to build and maintain the key backend systems that power Shazam, and will participate in the development lifecycle at a very deep level, from the early stages of feature design all the way to seeing a feature released into production for our users to enjoy. You will be expected to write and review code and to deeply understand how our applications work, and that means knowing how the code works.
Education & Experience
Bachelor's degree in Computer Science, Electrical or Computer Engineering, or equivalent experience.
- A dedicated lifelong learner who is always looking for new things to learn and try.
- A professional engineer who loves crafting, analysing and troubleshooting large software systems.
- An excellent communicator who builds collaborative relationships with technical and non-technical stakeholders.
- An excellent analytical problem-solver, tenacious in sticking with a problem until it's resolved once and for all.
- A great teammate, but you can work on your own initiative as well.
- Proactive in looking for ways to improve Shazam's services, taking personal ownership of the quality of the services we offer.
- Personally accountable, owning the decisions and mistakes that you make.