Data Pipeline Engineer, Apple Media Products
Santa Clara Valley (Cupertino), California, United States
Software and Services
The App Store data engineering team provides insights through data that drive decision-making for our engineering and product teams. We are looking for a Data Pipeline Engineer who can build and automate data pipelines for our search and recommendations features.
- Experience with big data systems and distributed computing, such as Hadoop and Spark.
- Proficiency in using query languages such as SQL, Hive and SparkSQL.
- Experience with entity-relationship modeling and understanding of normalization.
- Experience with sessionization of clickstream and time-series data is a plus.
- Familiarity with the concepts of dimensional modeling.
- Experience maintaining a large software system.
- Experience writing a test suite.
- Experience with Continuous Integration.
- Experience with version control, such as Git.
- Experience programming with Scala, Spark or Python.
- Experience with data visualization tools, such as ggplot.
- Ability to understand various data structures and common methods in data transformation.
- Commitment to keeping up to date with the newest technology trends.
Our team designs, executes and builds tools for online experiments (A/B tests) and offline experiments (human relevance judgement) that help us improve and fine-tune our data-driven features. Your primary focus will be to automate the delivery of various datasets by working with Data Scientists on the team to understand important KPIs and how they are derived. You will write and maintain the code that ingests, computes and coordinates these datasets.
ADDITIONAL EXPECTATIONS OF THIS ROLE INCLUDE:
- Work with software engineering teams to improve data collection procedures.
- Process, cleanse, and validate the integrity of data used for analysis.
- Engineer code that is durable and reliable.
- Tune and optimize code performance as data grows and needs change.
- Generate reports (that can be automated) to present key insights to partners across engineering and product teams.
- Bring a passion for visualizing and making sense of data analysis.
Education & Experience
- Bachelor’s Degree in Computer Science or related field.
- 3-5+ years’ practical experience with Big Data systems, ETL, data processing, and analytics tools.