ML Infrastructure Engineer - SPG

Santa Clara Valley (Cupertino), California, United States
Machine Learning and AI


Role Number: 200441276
Apple Special Projects Group is seeking engineers to develop scalable machine learning approaches for autonomous systems. You will directly contribute to ML approaches using rich data. You will work closely with specialists across high-performance computing, machine learning, and autonomous system development to identify needs and deliver new capabilities. You will share every stage of development from concept to deployment.

We are a team of hardworking engineers and researchers with deep experience in robotics, machine learning, and software engineering. We work on exciting new technologies and balance exploration of new problems with result-driven project planning and execution. In our daily work, the team stays effective, productive, and fun by sharing some key values:

  • Passion for the mission: We’re here to make something extraordinary. We seek whatever work is right and strive for the best possible results.
  • Modesty: The right answer is more significant than being right. We search for solutions as a team and value clear-eyed feedback.
  • Lean habits: You can’t grow without limits. Time constraints and big goals encourage us to sharpen our focus and learn to make phenomenal decisions.

Key Qualifications

  • Proficiency with ML modeling frameworks (PyTorch, TensorFlow, etc.).
  • Experience in ML model serving (TorchServe, TensorFlow Serving, NVIDIA Triton Inference Server, etc.).
  • Familiarity with GPU computing.
  • Solid software engineering skills in complex, multi-language systems. Fluency in Python.
  • Experience building end-to-end data systems as an ML Engineer, Platform Engineer, or equivalent.
  • Experience working with cloud data processing technologies (Spark, Dask, Elasticsearch, Presto, SQL, etc.).
  • Excellent debugging and critical thinking skills.
  • Excellent analytical and problem-solving skills.
  • Ability to work in a fast-paced, team-based environment.


• Build and integrate end-to-end lifecycles of large-scale, distributed machine learning systems using the latest open source technologies.
• Improve distributed cloud GPU training approaches for deep learning models.
• Build software that improves your rate of experimentation and helps you make better decisions about what to try next.
• Train, evaluate, and debug deep learning models for complex tasks.
• Develop tools and services for improving ML systems beyond modeling choices: data distribution editing, data quality improvements, and representation learning with self-supervision.
• Architect the end-to-end platform that supports MLOps.
• Collaborate with engineers across functions to solve complex data problems at scale.
• Identify and evaluate new patterns and technologies to improve the performance, maintainability, and elegance of our machine learning systems.
• Lead technical projects to completion. Communicate with peers to build requirements and track progress.
• Mentor fellow engineers in your areas of expertise.
• Contribute to a team culture that values effective collaboration, technical perfection, and innovation.

Education & Experience

• Bachelor's, Master's, or PhD degree in Computer Science/Machine Learning, or equivalent professional experience.
