Senior Software Engineer - Big Data Platform

Santa Clara Valley (Cupertino), California, United States
Software and Services

Summary

Posted: Aug 29, 2019
Weekly Hours: 40
Role Number: 200081498
Imagine what you could do here. At Apple, great ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish. Apple’s Applied Machine Learning team has built systems for a number of large-scale data science applications. We work on many high-impact projects that serve various Apple lines of business. We use the latest in open source technology and, as committers on some of these projects, we are pushing the envelope. Working with multiple lines of business, we manage many streams of Apple-scale data. We bring it all together and extract the value. We do all this with an exceptional group of software engineers, data scientists, DevOps engineers, and managers.

Key Qualifications

  • Strong experience in architecting, designing, and writing platform software for the management of big data platforms.
  • Strong experience in large big data environments running thousands of nodes and jobs
  • Strong Java and Python expertise with a focus on writing scalable, secure, and highly available software applications.
  • Strong understanding of containerization technologies and of writing software that runs in them.
  • Strong expertise in troubleshooting complex production issues.
  • Strong expertise in troubleshooting someone else’s code.
  • Expert understanding of Unix/Linux-based operating systems
  • Excellent problem solving, critical thinking, and communication skills
  • Experience automating CI/CD pipelines.
  • Expertise in configuration management for deploying, configuring, and managing servers and systems
  • The candidate should be adept at prioritizing multiple issues in a high-pressure environment
  • Should be able to understand complex architectures and be comfortable working with multiple teams
  • Ability to conduct performance analysis and capacity management, and to troubleshoot large-scale distributed systems
  • Comfortable working in a fast-paced environment while continuously evaluating emerging technologies and mentoring junior engineers
  • The position requires solid knowledge of secure coding practices and experience with open source technologies.

Description

You are a software engineer architecting and building high-performance software platforms and applications in the big data ecosystem. You have a solid understanding of, and experience with, HDFS, YARN, Spark, and related and emerging big data technologies. You will help architect and build big data solutions around big data platform orchestration, capacity management, and job orchestration tools like Airflow, to provide visibility into and ease of management of our large big data platform. You will also help shape the direction of the platform around how we do capacity allocation and showback, data sharing, data lineage, and orchestration in our machine learning platform.

Education & Experience

BS in computer science with 7-10 years of experience, or MS with 5-7 years of experience, or related experience.

Additional Requirements

  • Experience with Kubernetes, Docker Swarm, or another container orchestration framework
  • Experience building and operating large-scale Hadoop/Spark data infrastructure used for machine learning in a production environment
  • Experience in tuning complex database queries
  • Basic understanding of Hadoop/HDFS/YARN
  • Experience in workflow and data pipeline orchestration (Oozie, Jenkins, etc.)
  • Experience with Jupyter-based notebook infrastructure.