Deep Learning Algorithm Engineer, 3D Pose - Body Technologies

Santa Clara Valley (Cupertino), California, United States
Machine Learning and AI


Weekly Hours: 40
Role Number: 200196929
Have you devised breakthrough deep learning solutions to long-standing computer vision problems? Are you ready to collaborate with Apple teams to bring your knowledge to market and make a far-reaching impact? We are seeking a driven and dedicated machine learning engineer with a background in human body 3D pose estimation, 3D object pose estimation, or a related field. As a member of the team, you will tackle real-world problems with direct impact on new and innovative products in the space of Virtual and Augmented Reality that will delight and inspire millions of people every day. We are looking for a diverse set of people: senior researchers, experienced professionals, and recent graduates are all welcome to apply.

Key Qualifications

  • 2+ years of experience with deep learning (academic or professional experience)
  • Solid understanding of deep learning fundamentals
  • Familiarity with 3D, projective, and multiple view geometry
  • Experience working with large real-world datasets
  • Proficiency with Python (C++, Objective-C, Swift are a plus)
  • Proven experience with at least one major machine learning framework: TensorFlow, Keras, (Py-)Torch, etc.


The Video Computer Vision (VCV) group delivers algorithms that drive revolutionary Apple products. We are responsible for many of the key algorithms for videos and photos on Apple products (e.g. iPhone, iPad, and more), provide backbone algorithms for ARKit, and conduct research and development in the space of Virtual and Augmented Reality. VCV’s Body Technologies team develops people-understanding algorithms that drive features such as ARKit Motion Capture. We are looking for smart engineers who are passionate about building products for millions of customers around the world. You will work on groundbreaking technology and develop algorithms that enable a high-quality user experience across a range of applications. As a part of our team, you will collaborate with both software teams (computer graphics, video engineering, data generation/annotation, system integration) and hardware engineers (cameras, silicon, electrical engineering, product design). Join us for the rare opportunity to work on novel algorithms and software that go beyond the state of the art and will eventually touch the lives of millions of people around the world!

Education & Experience

MS/PhD in Machine Learning, Computer Science, Mathematics, or a related field. Alternatively, a comparable industry career with a proven track record. If this is you, we'd love to hear from you!

Additional Requirements

  • Experience in human body or hand pose estimation, keypoint detection and tracking, 3D object pose estimation, 2D-3D lifting, or a related field is a huge plus
  • Experience with temporal deep learning techniques (e.g. LSTMs, TCNs)
  • Experience with domain adaptation or generalization techniques for running deep learning models in unconstrained real-world scenarios
  • Familiarity with state-of-the-art deep learning architectures, especially in the context of computer vision
  • Strong mathematical foundation in machine learning and computer vision
  • Experience with motion capture data and character animation in the context of 3D computer graphics
  • Exposure to deep reinforcement learning techniques for data-driven character animation
  • Experience working on real-world problems and large datasets
  • Passion for groundbreaking computer vision / machine learning technologies and product delivery
  • Excitement about solving problems in new and creative ways, and about bringing research projects to product quality
  • Aspiration to stay on top of the state of the art and the latest developments in the research community