Data Engineer

Role Overview: 

The Data Engineer role at Sanku Project Healthy Children is vital to orchestrating and optimizing the organization’s data pipeline. The role is tasked with integrating diverse data sources into a unified, secure, and fault-tolerant system, with an emphasis on robust and reliable data flows. The Data Engineer is responsible for monitoring cloud data systems, overseeing database administration, administering data governance, and ensuring compliance with local and international regulations. Collaborating closely with the Data Scientist and Senior Data Analyst, the Data Engineer ensures the timely, secure, and efficient processing of data, empowering the organization to fulfill its mission effectively.

Key Responsibilities: 
Data Pipeline Administration: 

Monitor the data pipeline on AWS, ensuring its robustness and reliability.
Build, maintain, and optimize data models, data structures, and ETL processes using Apache Airflow and Python. 

Performance Monitoring: 

Regularly monitor data pipeline performance using CloudWatch and make pipeline modifications as needed.
Debug complex data pipeline issues to ensure continuous data flow.

Database Administration: 

Oversee and optimize databases such as MySQL for efficient data handling and querying.
Generate, document, and test various scripts essential for operational metrics and reports. 

Data Validation and Cleaning: 

Validate data extracted from the pipeline against other relevant data sources. 
Automate processes using AWS Lambda to ensure consistent and accurate data extraction.
Develop and implement algorithms to clean and validate data.  

Collaboration: 

Work closely with the Data Scientist and Senior Data Analyst to refine data-driven strategies. 
Assist teams with data-related technical issues and fulfill their data pipeline needs. 
Identify system enhancements and recommend changes.  

Data Governance Administration: 

Implement and enforce standards and guidelines across the ETL and data pipeline processes. 
Work collaboratively with stakeholders to define and maintain metadata standards, ensuring consistent data definitions and clarity. 
Oversee data quality assurance processes, ensuring data integrity and reliability throughout the data pipeline. 
Advocate for data privacy and security protocols, ensuring compliance with relevant local and international regulations. 

Data Engineer Stack: 

Data Warehousing: Amazon S3, Amazon RDS 
Data Integration & Processing: Apache Airflow, Python, AWS Lambda 
Monitoring & Alerts: CloudWatch 
Database Management: MySQL 
Data Visualization & Reporting: Power BI 
Data Exchange: REST APIs, JSON, NetSuite REST API, Postman 
Infrastructure & Networking: AWS ECS, AWS EC2, AWS VPC, AWS Subnets, AWS Route Tables, AWS Security Groups 
Automation & Scripting: IaC automation using Terraform, SQL, Python, Bash and Linux Scripting 
Version Control: Git, AWS CodeCommit, GitHub Actions, AWS CodePipeline

Key Performance Indicators (KPIs): 

Data Pipeline Efficiency: Measure the performance, speed, reliability, and cost-effectiveness of data pipelines, ensuring data is available for analysis in a timely manner.
Data Accuracy and Integrity: Monitor the accuracy of data ingested into systems and ensure that data cleaning processes are effectively maintaining the quality of data. 
System Uptime and Resilience: Ensure that data pipelines, including databases and ETL processes, are consistently available with minimal downtime. 

Qualifications & Experience: 

Bachelor’s or Master’s degree in Computer Science, Data Science, Engineering, or a related field. 
Minimum of 5 years of progressive experience in data engineering, especially in large-scale and complex FMCG data environments.
Proficiency in the aforementioned data stack. 
Demonstrated ability to build scalable data models and data pipelines. 
Familiarity with big data tools and environments. 
Strong problem-solving skills and analytical thinking. 
Ability to work in a team-oriented environment and communicate effectively.

Apply via:

sankuphc.bamboohr.com