Job description
This role is an integral part of our aggressive strategy for evolving mSurvey’s current computing infrastructure. As a Senior Linux System Administrator, you will solve problems in global-scale distributed systems that must evolve with a focus on scale, efficiency and availability, drawing on your creativity and your experience with robust systems design.
Key Responsibilities
Component and framework designs supporting the virtualization and orchestration of mSurvey computing infrastructure, from conception and design through testing, deployment and operation
Working on projects that make our network more efficient while sustaining service and component stability, performance and security
Working with our development QA and system QA teams to come up with regression tests and operational monitoring that cover new changes to our software
Working with our cross-business-unit engineering teams to support migration designs and critical network rollouts
Identify and implement improvements to current technologies
Troubleshoot, investigate, and remediate service outages and issues
Act as a mentor and escalation point for junior members of the team
Leading incident response teams as necessary to mitigate and deal with adverse events affecting our infrastructure
Work closely with software development and project management teams to support application deployments
Participate in a balanced on-call rotation, including after hours and weekends
Understand, engineer, and maintain the design dependencies and integrity within client environments and service level expectations
Manage performance, capacity, licensing and patching, maintaining these within defined standards for specific clients/assets and for applications installed within client environments
Administers all production, development, test, and training server environments
Works with IT and Security to ensure all servers and endpoints comply with relevant guidelines and regulations
Administers all backup and disaster recovery systems
Creating the appropriate documentation for our systems, including architecture and network diagrams and support procedures
Define and prioritize service automation work in line with the wider infrastructure and corporate strategy
Architect and design solutions that will live within Microsoft Azure, Amazon Web Services, and Google Cloud Platform
Skills and Attributes
5+ years of experience developing on Linux
Experience with Linux host system hypervisors, including KVM
Unix Guru: understanding of Unix/Linux from kernel to shell, file systems, client-server protocols, etc.
Experience with configuration management / infrastructure as code tools like Ansible, Chef and Puppet
Linux kernel development and/or performance tuning
Strong SQL experience
Data center utilization monitoring and COGS modeling
Designing, implementing and deploying continuous build/deployment frameworks
Site reliability engineering and/or related work, including service performance
Experience building scalable servers or distributed systems
Highly responsible, self-disciplined, self-managed, self-motivated, able to work with little or no supervision.
Extensive experience working on multiple projects at a time in a fast-paced, results-oriented environment
Excellent written and verbal communication skills
Experience deploying and using one or more of: AWS, Azure, Google Cloud Platform, OpenStack
Experience designing, implementing and deploying: Docker, Kubernetes, serverless (Lambda)
Experience with monitoring and alerting using technologies like: Prometheus, Sensu, Nagios, Kafka, Wavefront, BigPanda, DataDog, PagerDuty
Individuals with a love for the African continent who want to be part of the team driving a business revolution