Job Description
Summary
We are seeking a Senior DevOps Engineer to join our infrastructure team. In this role, you will be central to our shift to a modern, fully automated, code-driven infrastructure, ensuring we meet SLAs and SLOs while optimizing costs.
You will play a pivotal role in transitioning from a low-automation, hybrid cloud and bare-metal setup to a streamlined, Kubernetes-driven environment. By implementing full IaC, you will eliminate bottlenecks, empower developers, and drive the evolution of our CI pipelines into a comprehensive GitOps framework, improving scalability and ensuring seamless deployments.
Current Tech Stack: LXC, QEMU VMs, Kubernetes, AWS EKS, S3, nginx, Elasticsearch, MongoDB, Postgres, Prometheus, Grafana, ELK, Jenkins, SOPS, Ansible, Terraform, Talos Linux, ArgoCD, GitHub Actions.
Location: Ideally hybrid in our Lisbon office, but we welcome fully remote candidates.
Responsibilities:
- Develop and execute the DevOps roadmap, focusing on modern, automated, and code-driven infrastructure solutions.
- Enhance system stability, scalability, and performance to meet business and SLA requirements.
- Identify infrastructure deficiencies, then propose and implement improvements.
- Collaborate with teams to transition from low automation to full Infrastructure as Code (IaC).
- Manage and optimize CI pipelines, evolving them into a robust GitOps framework.
- Participate in on-call rotations, ensuring prompt incident resolution.
- Monitor infrastructure health, track uptime SLAs, and implement reliability improvements.
- Support efficient and cost-effective infrastructure scaling.
What we look for:
- Expertise in Infrastructure as Code (IaC) with tools like Terraform, Crossplane, or Pulumi.
- Strong knowledge of Kubernetes and Kubernetes Operators.
- Experience managing modern bare-metal infrastructures.
- Proficiency in monitoring tools like Prometheus and Grafana.
- Hands-on experience managing large-scale databases with datasets larger than 5 TB (e.g., MongoDB, Elasticsearch, Postgres).
- Familiarity with GitOps practices and tools (e.g., ArgoCD, Flux).
- Proficiency in Linux administration and technologies such as LXC, VMs, and Docker.
- Bonus: Experience with managing large-scale storage systems (e.g., Ceph).
- Solid understanding of nginx, Jenkins, and Ansible.
Skills
- Development
- Software Engineering
- Team Collaboration