Job Description
Responsibilities
- Create and maintain an optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional and/or non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and big data technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
- Design, construct, install, test, and maintain data management systems.
- Build high-performance algorithms, predictive models, and prototypes.
- Develop set processes for data mining, data modeling, and data production.
- Employ an array of technological languages and tools to connect systems together and research new uses for existing data.
- Establish and update disaster recovery procedures.
- Recommend ways to continually improve data reliability and quality.
Qualifications
- Experience building and optimizing data pipelines and architectures.
- Strong analytic skills related to working with structured and unstructured datasets.
- Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
- Experience performing root cause analysis to answer specific business questions and identify opportunities for improvement.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Intellectual curiosity and the ability to approach data organization challenges while keeping business priorities in view.
- We are looking for a candidate with 3+ years of experience in a Data Engineer role who holds a degree in Computer Science, Statistics, Information Systems, Physics, Mathematics, or another quantitative field, and who has experience with the following software/tools:
- Experience with big data tools such as Hadoop, Spark, and Kafka.
- Advanced working knowledge of SQL and experience with relational databases, as well as familiarity with a variety of other database technologies.
- Experience with data pipeline and workflow management tools such as Airflow, Azkaban, or Luigi.
- Experience with object-oriented and functional scripting languages: Python, Go, Clojure, Ruby, Scala, Java, etc.