Operations Engineer

Austin, Texas, United States
Software and Services

Summary

Posted:
Weekly Hours: 40
Role Number: 200374417
The Operations Engineer on the Crypto Services team manages key technical infrastructure. An ideal candidate will have experience in systems administration. The Operations Engineer will monitor infrastructure and application services and drive incident management. The Operations Engineer will work closely with SREs, PKI engineers, systems engineers, network engineers, database administrators, and information security teams to ensure availability and reliability. For this position, strict application security and high availability requirements must be balanced to achieve optimal solutions.

Key Qualifications

  • 3+ years of experience with Linux (any distro, but especially RHEL) and with standard UNIX utilities and programs
  • A systematic, test-and-measure approach to continually improving service operations
  • Knowledge of the operating system networking stack, TCP and UDP, and network interface drivers
  • Conceptual understanding of multi-tiered and web-based information systems architecture
  • Flexibility and adaptability to thrive in a dynamic, highly demanding, constantly changing environment
  • Strong analysis, problem solving, and troubleshooting skills
  • Track record of practical problem solving, excellent communication, and documentation skills
  • Experience with monitoring tools such as Icinga/Nagios and log aggregation tools such as Splunk
  • Experience with scripting languages such as Bash and Python
  • Working knowledge of Oracle Database is a plus
  • Experience with security and compliance is a plus

Description

The successful candidate will participate in troubleshooting issues following established procedures, documenting problems, managing incidents, and owning each issue from initial contact to resolution. Additionally, the engineer will be responsible for critical compliance tasks.

Responsibilities of the Operations Engineer include the following:

  • Serve as a full-time, primary on-call engineer, responding and mobilizing efforts to address outages
  • Follow change management procedures and deploy code using configuration management (e.g. Puppet, Chef, Ansible)
  • Set priorities and work efficiently in a fast-paced environment
  • Measure and optimize system performance
  • Monitor telemetry and address alerts to ensure smooth operations
  • Communicate clearly and work effectively across multiple business and technical teams
  • Demonstrate the ability to deliver high-quality results on time

Education & Experience

A BS in engineering, computer science, or another technical discipline is preferred, plus 3 years of related experience.

Additional Requirements