AI/ML - ML Performance Engineer, Machine Intelligence

Seattle, Washington, United States
Machine Learning and AI


Role Number: 200179047
You will be responsible for whole-system performance optimization of our inference engine pipeline. This typically includes writing high-performance, well-tested, multi-platform code that uses platform-specific capabilities. Our ideal team member is daring when it comes to trying new things, adept at analyzing computer systems performance, and willing to iterate on ideas. We value team members who can prototype quickly and iterate all the way to high-quality implementations. The Machine Intelligence team within Apple is well positioned to make meaningful contributions in the short term to well-known Apple ML products. We are also invested in more results-oriented, high-risk projects for never-before-released products where performance and energy efficiency play a critical role.

Key Qualifications

  • Basic understanding of C++, compilers, and computing systems
  • Experience with whole-system performance debugging and optimization
  • Experience with systems-level debugging
  • Experience in parallel computing


As a member of this team, you will use your background to:

  • Improve overall computer systems performance of Apple’s on-device inference engine
  • Assess bottlenecks and apply surgical fixes to reduce latency and improve energy efficiency
  • Design and implement optimized code for well-understood and novel ML primitives
  • Benchmark existing deep learning models on CPU, GPU, and the Apple Neural Engine
  • Micro-benchmark compute fabrics to derive insights for ML network designers
  • Work with the ML team to identify novel network architectures

Education & Experience

Bachelor's, Master's, or PhD in Computer Science or a related field

Additional Requirements

  • Experience in performance modeling of multi-core CPUs
  • Experience in machine learning and computer vision
  • Experience in assembly-level programming
  • Experience in accelerator-based computing such as GPUs, FPGAs, DSPs, NPUs
  • Experience with Objective-C/C++ and Swift
  • Experience with profiling tools, e.g., gprof, google-pprof, lttng, perf
  • Experience with big.LITTLE ARM and/or heterogeneous architectures
  • Experience with embedded platforms using ARM CPUs
  • Experience with compilers including cross-compilation