Job Summary
The role holder will be responsible for expanding and optimizing our data infrastructure and data pipeline architecture, and for optimizing data flows and systems in support of cross-functional teams. They will design and maintain data pipelines, ensuring that data platforms, production schedules, and self-service tools operate efficiently and are always available. The role holder will also work with analysts and developers to design and maintain data infrastructure that supports business requirements.
Job Description
Key Accountabilities:
Design, implement, and support the bank's data infrastructure
Design and implement ETL pipelines into the data warehouse and other data structures used by the bank
Prepare large data sets for analytics through labelling, scheduling, and error checking.
Design and support streaming data pipelines into Kafka and ensure they are always available and usable
Design systems to track data quality and consistency, ensuring data is accurate and up to date for both on-premises and cloud data sources
Ensure both on-premises and cloud infrastructure are always available
Build and maintain logical data models, data marts and multidimensional cubes for reporting and analytics
Automate data pipelines for the Data Science team in support of ML and AI workflows
Manage and optimize Data Science algorithms and procedures in production
Maintain high availability of self-service analytics infrastructure and data platforms.
Monitor and grant access to reports and dashboards deployed on self-service platforms
Implement standards set by the data governance team in the development of data artefacts and BI solutions
Track consumption of analytics resources and ensure equitable distribution across analytics teams
Actively challenge the status quo and offer ideas to improve operations and existing solutions deployed by colleagues
Work in sprints with multidisciplinary teams, including analysts, data scientists, product managers, and agile delivery managers, to scope, plan, and deliver data-driven insight
Education and experience required:
Bachelor’s degree in IT, technology, data science, business analytics, or math-focused fields is preferred (or equivalent on-the-job experience and personal analytics projects)
Certification in AWS, SQL, or Hadoop infrastructure support is preferred
Minimum of 3 years’ technical experience
Knowledge and skills:
Strong analytical and diagnostic skills
Ability to work in remote teams
Experience working within a large, complex organization with multiple stakeholders
Knowledge of technology project management tools such as JIRA, Planner, and DevOps
Competencies:
Evidence of experience in maintaining on-premises and cloud server stacks
Experience in automating ingestion pipelines from disparate sources and/or reporting from streaming datasets and event architectures
Experience working with big data technologies (preferably within AWS or Azure)
Technical understanding of and experience with data engineering tools (Apache Spark, Hadoop, Impala, Hive, Hue, Databricks)
Experience with data tools (Tableau, SAS, QlikView, Power BI, Python, R, KNIME, Alteryx)
Working knowledge of Hadoop, MS SQL, PowerShell, and SSIS
Experience in creating and maintaining data pipelines
Experience in maintaining SQL Server and Hadoop clusters
Proficiency in using query languages across multiple platforms
Experience in an object-oriented programming or functional scripting language is an added advantage
Evidence of working with both structured and unstructured data
Education
Higher Diplomas: Business, Commerce and Management Studies (Required)
Apply via:
absa.wd3.myworkdayjobs.com