Siri - Data Infrastructure Engineer

Santa Clara Valley (Cupertino), California, United States
Machine Learning and AI


Role Number: 200042911
As a member of the Data Infrastructure Engineering team within the Siri Production Engineering organization, you will face highly complex issues in a large-scale distributed systems environment. To ensure a reliable and rewarding Siri experience, you will be empowered to design and develop new solutions focused heavily on data infrastructure and DevOps-style system automation. We look for talented engineers with a passion for data in both the Operations and Development space to bring these unique solutions to production at a rapid pace. From an Operations perspective, this is what makes Siri the most successful personal assistant in the industry.

Key Qualifications

  • In-depth understanding of Big Data technologies (Hadoop, HBase, Spark, Kafka, Flume, Hive, HDFS, etc.)
  • Solid experience in large-scale production data and systems environments
  • Experience in one or more object-oriented programming languages (Scala, Java, C++)
  • Fluent in at least one scripting or systems programming language (Python, Ruby, Bash, Go, Rust, Crystal, etc.)
  • Knowledge of the Linux operating system (OS, networking, process level)
  • Interest in DevOps-style engineering teams - we operate what we build!
  • Strong verbal and written communication skills
  • Passionate about data and about being part of a tight-knit Data Infrastructure Engineering team


The Siri Data Infrastructure Engineering team manages data ingestion and the data storage platform used for analytics and machine learning of worldwide Siri events. Using internally built systems platforms, open-source data platform tools, and purpose-built solutions developed by our own team, the Data Infrastructure Engineering team members strive to build out a performant and scalable data platform at huge scale, with high-quality data, that can be operated by our relatively small team. To accomplish this, we build operations-focused automation tools and services to prevent failures and page individuals when there really is a problem, not just noise. Our engineers work closely not only within the Siri Production Engineering team, but also with the development and analytics engineers within Siri as well as outside organizations. We build out data platform infrastructure for maximum efficiency, scalability, and reliability to allow domain-specific data scientists and machine learning engineers to focus on their specialties. A successful candidate will be someone who can actively take part in the design, build, and operation of our data and systems infrastructure.

As a member of the Siri Production Engineering Team within Apple you will:

  • Build and operate Apple’s largest data infrastructure supporting millions of Siri customers at double-digit PB scale
  • Work with open-source technologies such as Spark, Kafka, Presto, HBase, and Hadoop in building out our data platform
  • Develop data platform services that enhance how we operate our data platform, store our data securely, ensure data privacy, and enhance ease of use for everyone that makes use of our data
  • Ensure our systems and data platform offer reliable, high-quality data with consistent SLAs
  • Design and engineer scalable streaming data solutions that analytic engineering teams can build metric platforms on top of
  • Make use of Anomaly Detection and Machine Learning to enhance the operation and understanding of our data from an operational perspective
  • Troubleshoot complex issues across the entire stack
  • Create automation frameworks to handle both development and production events at scale
  • Advise other teams (within and outside of Siri) on technical direction
  • Help grow our data environment with the purpose of pushing Siri to the next level of scale and stability

Education & Experience

BS, MS, or PhD degree in Computer Science, EE, Physics, or other technical discipline and 5+ years of experience building data pipelines

Additional Requirements

Other Desirable Skills

  • Strong Scala and/or Java expertise with JVM performance tuning/optimization
  • Experience with Kubernetes or Mesos compute platforms
  • Workflow and data pipeline orchestration experience (Oozie, Airflow, Pinball, Jenkins, Luigi, etc.)
  • In-depth experience with resource managers such as YARN and Marathon

Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics.