Machine Learning Ops Engineer, Applied Machine Learning
Sunnyvale, California, United States
Corporate Functions
Apple’s Applied Machine Learning team has built systems for a number of large-scale data science applications. We work on many high-impact projects that serve various Apple lines of business. We use the latest open source technology, and as committers on some of these projects, we are pushing the envelope. Working with multiple lines of business, we manage many streams of Apple-scale data, bring them together, and unlock business value. We do all this with an exceptional group of software engineers, data scientists, DevOps engineers, and managers. We are looking for talented and dedicated engineers to join our team who bring a passion for infrastructure and distributed systems and want to build world-class platforms and products at very large scale across cloud environments.
Description
Join Apple's Applied Machine Learning team as a Senior Software Engineer to enable GenAI across our Applications & Platforms. Candidates should have a strong background in core LLM concepts and be proficient in setting up and supporting large-scale big data applications in public clouds such as AWS and GCP.
The main responsibilities for this position include:
Build LLM applications using open source LLM app frameworks and AWS Bedrock/GCP Vertex AI
Evaluate and port Language Models onto optimized infrastructure to reduce cost and increase performance
Build tools to benchmark and compare various embedding databases and LLMs
Build and support CI/CD tools to port and manage applications on AWS/GCP and Kubernetes
Build automation to enable self-healing systems
Troubleshoot application-specific, core network, system, and performance issues
Build a multi-tenant system that enforces data protection between different use cases
Contribute to challenging, fast-paced projects that support Apple's business by delivering innovative solutions
The candidate is expected to be self-motivated, proactive, and solution-oriented.
Minimum Qualifications
- Bachelor's degree with 4+ years of experience
- 4+ years of experience in Python programming
- Extensive experience deploying and managing applications on AWS/GCP and Kubernetes
- Deep understanding of RAG-based pipelines for model inference and guardrails
- Experience in open source LLM App frameworks like LangChain/LlamaIndex
Preferred Qualifications
- BS in Computer Science with 4+ years of experience, MS with 2+ years of experience, or related experience
- Exposure to cloud-managed services such as AWS Bedrock/GCP Vertex AI
- Good understanding of agents in GenAI
- Strong experience with infrastructure templating tools such as CloudFormation and Terraform
- Experience with GitOps-based deployment tools such as Spinnaker/Flux/ArgoCD
- Strong proficiency with Helm and Kustomize for managing Kubernetes applications and configurations.
- Experience managing embeddings using vector databases
- Exposure to prompt engineering
- Experience in observability & traceability for Large Language Models.
- Experience in performance tuning on operating systems such as Linux
- Excellent analytical and problem-solving skills
- Exposure to LLM infrastructure such as GPUs, TPUs, and Inferentia
- Exposure to LLM runtimes such as Triton and frameworks such as TensorRT and vLLM is an added advantage
- Exposure to general Java troubleshooting
Apple is an equal opportunity employer that is committed to inclusion and diversity. We take affirmative action to ensure equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.