Siri - Sr. Data Scientists, Data Organization
Santa Clara Valley (Cupertino), California, United States
Machine Learning and AI
The Siri Data Infrastructure Engineering team is building groundbreaking data infrastructure and privacy technology used across the Siri team for metrics, machine learning, natural language processing, and artificial intelligence. As part of this team, you will work with one of the most exciting high-performance big data computing environments with petabytes of data. You will have an opportunity to imagine and build data infrastructure and data privacy platform solutions that ensure our Siri users have a great interactive experience.
- You are highly experienced and battle-tested, a lead or core contributor on data storage and data infrastructure projects in the 100PB+ range.
- You are pragmatic: the data architecture we build and operate is a blend of internally defined projects, open-source projects, and projects from other parts of Apple at similar scale.
- This is a hands-on position: expect to write more code than documentation and to dive deep into our data platform internals.
- You will take a key role in attracting talent and screening additional members of the project as it expands.
- You are focused on data privacy, ensuring it is an integral part of everything we build for our data platform
- You have strong experience in one or more object-oriented programming languages (Scala, Java, C++)
- You are passionate about open-source systems projects and about contributing back to them, alongside work on our internally developed data platform components and systems
- You are fluent in at least one scripting or systems programming language (Go, Python, Ruby, Bash, Rust, Crystal, etc.)
- You have a solid understanding of SQL, SparkSQL, HiveQL, and other data query language variants, and of how downstream users can optimally query our data in addition to accessing it programmatically
- You possess excellent writing and interpersonal skills, with an ability to lead technical discussions and engage with downstream teams on their usage of our data platform.
- You like to partner with a variety of cross-functional and multi-functional team members from a diverse array of groups across Siri and the company
- Excellent writing and interpersonal skills
- Thorough knowledge of Linux is helpful
- Ability to stay focused and prioritize a heavy workload while achieving exceptional quality
- You are upbeat, adaptable, and results-oriented with a positive attitude
- You bring passion and dedication to your job and are committed to our vision and supporting the developer community
The mission of the Data Infrastructure and Privacy Team within the Siri Production Engineering organization is to store and protect Siri’s usage data and to provide a platform that is performant, secure, and trustworthy for projects across Siri that power the Siri user experience. We capture instrumentation and logging data from a billion physical devices, and we collect and curate the delivery of data from many hundreds of Siri devices and Siri Services. We bring all this data together into a single environment to enable the work of hundreds of hardworking engineers and data scientists, on one of the largest distributed compute clusters at Apple. We are seeking a small number of extraordinary individuals to design and lead elements in our next-generation exascale data storage and data access infrastructure while ensuring data privacy throughout the platform.
Education & Experience
Master's degree, PhD, or Bachelor's degree with equivalent work experience in Computer Science, Computer Engineering, EE, Physics, or another technical discipline.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.