Big Data Engineer - Wallet & Apple Pay
Santa Clara Valley (Cupertino), California, United States
Software and Services
We are looking for hardworking, passionate and results-oriented individuals to join our team to build data foundations and tools to craft the future of commerce and Apple Pay. You will design and implement scalable, extensible and highly available data pipelines on large-volume data sets that will enable actionable insights and strategy for payment products. Our culture is about getting things done iteratively and rapidly, with open feedback and debate along the way; we believe analytics is a team sport, but we strive for independent decision-making and taking smart risks.

Our team collaborates deeply with partners across product and design, engineering, and business teams. Our mission is to drive innovation by providing our business and data scientist partners with outstanding systems and tools to make decisions that improve the customer experience of using our services. This will include demonstrating large and complex data sources, helping derive measurable insights, delivering dynamic and intuitive decision tools, and bringing our data to life via amazing visualizations.

Working with the head of Wallet Payments & Commerce Data Engineering & BI, this person will collaborate with various data analysts, instrumentation specialists and engineering teams to identify requirements that will drive the creation of data pipelines. You will work closely with the application server engineering team to understand the architecture and internal APIs involved in upcoming and ongoing projects related to Apple Pay. We are seeking an outstanding person to play a pivotal role in helping analysts and business users make decisions using data and visualizations. You will work with key partners across the engineering, analytics and business teams as you design and build query-friendly data structures.

The ideal candidate is a self-motivated teammate, skilled in a broad set of Big Data processing techniques, who can adapt and learn quickly, deliver results with limited direction, and choose the best possible data processing solution.
- 5+ years of professional experience with Big Data systems, data pipelines and data processing
- Practical hands-on experience with technologies like Apache Hadoop, Apache Pig, Apache Hive, Apache Sqoop & Apache Spark
- Ability to understand API specs, identify relevant API calls, extract data and implement pipelines and SQL-friendly structures
- Ability to identify data validation rules and alerts based on data publishing specifications for data integrity and anomaly detection
- Understanding of various distributed file formats such as Apache Avro and Apache Parquet, and of common methods in data transformation
- Expertise in Python, Unix shell scripting and dependency-driven job schedulers
- Expertise in Core Java, Oracle, Teradata and ANSI SQL
- Familiarity with Apache Oozie and PySpark
- Knowledge of Scala and Splunk is a plus
- Familiarity with rule-based tools and APIs for multi-stage data correlation on large data sets is a plus

- Translate business requirements from the business team into data and engineering specifications
- Build scalable data sets based on engineering specifications from the available raw data and derive business metrics and insights
- Work with engineering and business partners to define and implement the data engagement relationships required with partners
- Understand and identify server APIs that need to be instrumented for data reporting, and align the server events for execution in already established data pipelines
- Explore and understand complex data sets, and identify and formulate correlational rules between heterogeneous data sources for effective analytics and reporting
- Process, clean and validate the integrity of data used for analysis
- Develop Python and shell scripts for data ingestion from external sources for business insights
- Work hand in hand with the DevOps team to develop monitoring and alerting scripts on various data pipelines and jobs
Education & Experience
Minimum of a bachelor's degree, preferably in Computer Science, Information Technology or EE, or relevant industry experience