AIML - Site Reliability Engineer, ML Platform & Technology

Singapore, Singapore, Singapore
Software and Services


Role Number: 200511426
Apple is a place where extraordinary people gather to do their best work. Together we create products and experiences people once couldn’t have envisioned and now can’t imagine living without. If you’re excited by the idea of making an impact and joining a team where we pride ourselves on being one of the most diverse and expansive companies in the world, a career with Apple might be your dream job! If you wish to play a part in revolutionizing how people use their computers and mobile devices; build groundbreaking technology for algorithmic search, machine learning, natural language processing & artificial intelligence; and work with the teams building the most scalable big-data systems in existence, this is the role for you!

Key Qualifications

  • 2 or more years of experience in a Site Reliability Engineering, observability or ML Ops focused role supporting internet services and distributed systems
  • Proficiency in using Go, Python or other higher-level languages for automation, observability and infrastructure management
  • Experience building and supporting telemetry, observability and logging solutions for incident, cost and performance management
  • Experience with infrastructure or dashboards as code and provisioning tools for Kubernetes and cloud-based services
  • Working knowledge of open source or commercial monitoring and observability frameworks and platforms such as ELK, Splunk, OpenCensus, Datadog
  • Working knowledge of ML Ops systems and tools is advantageous
  • Good interpersonal skills shown through previous projects or assignments


Description

  • Monitor production, staging and development environments for a myriad of services in an agile and dynamic organization.
  • Employ metrics for data-driven solutions for reliability, performance and service insights.
  • Design, implement, and extend automation tools for monitoring, logging, ML and data processing pipelines.
  • Strive to improve the stability, security, efficiency and scalability of production systems by applying software engineering practices.
  • Address future capacity needs and investigate new features and products.
  • Apply strong problem-solving skills daily; a successful engineer takes the initiative to isolate issues and resolve root causes through investigative analysis.
  • Write justifications, incident reports, best-practices documentation and solution specifications.

Education & Experience

Bachelor's degree in Computer Science, Computer Engineering, or equivalent

Additional Requirements