Site Reliability Engineer

Job description
At Ona, we don’t just strive for diversity, we thrive on it. For Ona, diversity has been a springboard for creativity, innovation, and growth. We are committed to giving equal opportunities to employees and applicants regardless of their race, religion, gender, sexual orientation, colour, nationality, age, marital status, or pregnancy status.
We’re looking for developers who want to build foundational data systems that drive change. Our team has worked on projects that record the social infrastructure of entire countries, tally the winners of national elections, and reduce infant mortality. We build software that solves real problems and you will too.

Qualities we’re looking for

Thoughtful coder.

You understand the importance of abstractions and interfaces. You keep modules loosely coupled and know that algorithms + data structures = programs.
You read and understand existing systems before diving in, then you research and stand on the shoulders of giants to follow best practices. You know how to prototype, how to iterate, and when to step back and think it through or ask questions.

Builder.

You are committed to the projects you work on and need to see them through to completion. You understand that solving the user’s problem is the end goal.
You prefer open systems that are verifiably secure, and you publish and use open source code, like we do.

Lifelong learner.

You stay up to date with the latest trends and are excited to learn new languages, tools, and best practices.

Explorer.

You thrive in teams and projects that span timezones and cultures.
You’re ready and excited to travel in order to support projects, no matter how dusty or remote.

Requirements

Minimum 3 years maintaining production systems on Linux
Minimum 2 years writing production web applications
Minimum 1 year working with deployment or infrastructure tools, e.g. Ansible, Chef, Puppet
Experience working with remote teams
Strong attention to detail and understanding of architectural dependencies
Strong troubleshooting and problem-solving skills

Nice to haves

Experience managing and automating infrastructure on AWS, GCP, and Azure
Experience writing Clojure, Java, JavaScript, and Python
Experience using Ansible, Terraform, and HashiCorp Vault
Experience using Docker, Kubernetes, and kOps
Experience using InfluxDB, Graphite, and Grafana
Experience using Monit, Nginx, and systemd