Senior Software and Data Engineer - SIML, ISE
Santa Clara Valley (Cupertino), California, United States
Software and Services
Do you believe Machine Learning and AI can change the world? We truly believe they can. We are the Data Team of the System Intelligence and Machine Learning (SIML) group at Apple. We are responsible for building high quality datasets at scale. Every year, our team produces datasets used in the training of ML and AI-centric features for many Apple products, including iPhone, iPad, Mac, Apple Watch and even AirPods. Our work is used in very visible and important features, from the wallpaper on your iPhone Lock Screen, to the models that highlight the faces of your loved ones in your Photos app, to the memories that your Apple products create from your photos, videos and music. We value collaboration and teamwork. Our engineers are not afraid to manually browse the data to identify and troubleshoot problems. We expect the same from our new members. This is an exciting time to join us and have an impact on multiple key features at Apple!
- 5+ years of industry experience in building data pipelines, data processing infrastructure or data operations teams
- Demonstrated experience with large language models or generative AI
- Proficiency in Python or another modern programming language
- Proven track record of handling complex data projects, including contributing to hiring, mentoring and growing engineers, and establishing and enforcing the right software engineering culture for a software team
- Experience in building data processing pipelines for curating data, training and evaluating models (experience in Airflow, Kubeflow or other pipelining tools)
- Passionate about supporting day-to-day data operations and able to work efficiently with members of the team who are not engineers
- Self-starter, able to handle ambiguity, navigate uncertainty, identify risks, and find the right people and tools to get the job done
Our team works in close interaction with R&D, infrastructure and client teams, as well as with other groups and other functions across Apple (legal, privacy) and externally. This position focuses on designing and implementing flexible data pipelines and data tools based on advanced computer vision technology, NLP and humans in the loop. You will be responsible for the design and development of the data pipelines, automation, visualization and tools that constitute the end-to-end process for building models, from raw data to trained model to evaluation to deployment. You'll partner with data infrastructure and other teams to ensure that we have high quality, representative data. You'll work with ML engineers to refine the modeling process to enable faster iteration and better modeling decisions and deploy models more rapidly to customers. You'll collaborate with data scientists and analysts to build insights from customer analytics and feedback into the process to complete the cycle of continuous improvement. Your work will impact hundreds of millions of Apple's customers and help people communicate more easily in the languages and modalities of their choice.
Education & Experience
Bachelor's, Master's or PhD in Computer Science, Mathematics, Physics, or a related field (or equivalent practical experience).
- Strong understanding of applied machine learning topics is desirable
- Experience with ETL frameworks like Airflow is desirable
- Kubernetes and Docker experience is desirable
- Strong knowledge of either NLP or Computer Vision is desirable