Big Data SRE Manager

Bengaluru, Karnataka, India
Software and Services


Role Number: 200211899
The people here at Apple don’t just build products; they build the kind of wonder that’s revolutionized entire industries. It’s the diversity of those people and their ideas that encourages the innovation that runs through everything we do, from amazing technology to industry-leading environmental efforts. Join Apple, and help us leave the world better than we found it. Apple’s Applied Machine Learning team has built systems for a number of large-scale data science applications. We work on many high-impact projects that serve various Apple lines of business. We use the latest in open source technology, and as committers on some of these projects, we are pushing the envelope. Working with multiple lines of business, we manage many streams of Apple-scale data. We bring it all together and extract the value. We do all this with an exceptional group of software engineers, data scientists, SRE/DevOps engineers, and managers.

Key Qualifications

  • 10 years of management experience leading teams of engineers
  • Hands-on manager who likes troubleshooting complex performance and scale problems
  • Excellent problem-solving, critical-thinking, and communication skills; leads by example to motivate and challenge the team to deliver their best
  • 5+ years of experience in Hadoop-based technologies: HDFS/YARN cluster administration, Hive, Spark
  • Strong experience leading cross-functional initiatives and providing thought leadership
  • Ability to zoom in and zoom out to clear up ambiguity and set a clear path forward
  • Experience managing Hadoop/YARN clusters with thousands of nodes and tens of petabytes of data, running tens of thousands of jobs
  • Passion for automation, creating tools using Python, Java, or other JVM languages
  • Strong expertise in troubleshooting complex production issues
  • The candidate should be adept at prioritizing multiple issues in a high-pressure environment
  • Should be able to understand complex architectures and be comfortable working with multiple teams
  • Ability to conduct performance analysis and troubleshoot large-scale distributed systems
  • Should be highly proactive with a keen focus on improving the uptime and availability of our mission-critical services
  • Comfortable working in a fast-paced environment while continuously evaluating emerging technologies
  • The position requires solid knowledge of secure coding practices and experience with open source technologies


We are seeking a hands-on manager who has experience leading large big data environments spread across thousands of nodes and petabytes of data. We are looking for a people manager with a background and experience that looks like this:

  • You have grown into leadership roles after proving your technical skills in individual contributor roles, but still enjoy hands-on work when the situation calls for it.
  • You have designed and built large data environments for availability, security, and reliability.
  • You keep yourself informed about the choices and trade-offs as new technology evolves in the big data landscape.
  • You have an eye for talent, and you hire and grow your engineers by mentoring and challenging them.
  • You will collaborate across many teams to deliver on projects related to the big data platform and data pipelines, and provide SRE support for the reliability of these managed services.
  • You will have significant opportunity to influence and shape our big data platform strategy and data products as we work on the next generation of our architecture, platform, and processes.

Education & Experience

Additional Requirements

  • Experience with running infrastructure in AWS and Kubernetes
  • Experience building and operating large-scale Hadoop/Spark/Kafka data infrastructure used for machine learning in a production environment
  • Experience in tuning complex Hive and Spark queries
  • Expertise in debugging Hadoop/Spark/Hive issues using NameNode, DataNode, NodeManager, and Spark executor logs
  • Experience in capacity management on multi-tenant Hadoop clusters
  • Experience in workflow and data pipeline orchestration (Airflow, Oozie, Jenkins, etc.)
  • Experience with Jupyter-based notebook infrastructure