Job Description

Summary

We are seeking a skilled Site Reliability Engineer (SRE) to join our **Platform Engineering division**, which provides the blockchain infrastructure that supports the consensus mechanism and other blockchain-related components. In this role, you will help ensure the reliability, scalability, and security of our blockchain validator nodes and additional blockchain components running across both cloud and on-prem environments. You’ll work with cutting-edge technologies and be responsible for high availability, monitoring, and performance tuning of mission-critical blockchain infrastructure.

In this role, you will:
  1. Provision and manage validator nodes on AWS, OpenStack, and on-prem infrastructure
  2. Build and maintain highly available, fault-tolerant systems to ensure validator nodes achieve maximum uptime and performance
  3. Implement automation for infrastructure management, deployment, and scaling using Terraform, Ansible, Helm, and Kubernetes
  4. Set up and maintain GitLab CI/CD pipelines for deployment automation, testing, and security
  5. Develop and implement robust monitoring, logging, and alerting solutions that give deeper insight into the deployed systems, using tools like Prometheus, Grafana, and the ELK stack
  6. Optimize validator performance, ensuring efficient resource utilization and cost-efficiency across cloud and physical environments
  7. Establish and enforce security best practices, including encryption, access controls, etc.
  8. Create disaster recovery and backup plans to safeguard validator data and operations
  9. Collaborate with development and security teams to improve overall infrastructure and respond rapidly to incidents
  10. Conduct post-incident reviews to improve system reliability and decrease downtime
  11. Maintain detailed documentation for all processes, configurations, and infrastructure designs
  12. Participate in an on-call rotation to respond to critical issues in real time
  13. Assist in regulatory compliance and security audits, ensuring infrastructure meets industry standards
What you need to be successful:
  1. Expertise in managing blockchain infrastructure, with a focus on the consensus mechanism and blockchain node operations
  2. Proficiency in cloud platforms (AWS, OpenStack) and services like EC2, S3, RDS, VPC, and IAM
  3. Strong grasp of networking concepts and security protocols, particularly for distributed systems
  4. Expertise in Docker and securing containerized environments
  5. Experience with Kubernetes and Helm for container orchestration and managing high availability in multi-node environments
  6. Proficiency with **CI/CD pipelines**, ideally GitLab, for automated deployment and testing
  7. Scripting proficiency (e.g., Bash, Python) and experience with Infrastructure as Code (Terraform, Ansible)
  8. In-depth understanding of monitoring and logging tools like Prometheus, Grafana, and the ELK stack
  9. Knowledge of cryptographic security, including encryption, access control, and other compliance-related measures
  10. Strong problem-solving abilities, with a focus on diagnosing and resolving infrastructure issues in real time
  11. Excellent communication and collaboration skills, working across development, security, and operations teams
What’s in it for you:
  1. Accelerate your career growth by joining one of Europe's leading cryptocurrency management platforms
  2. 25 vacation days per year, with an additional day for each year of service, up to 30 days
  3. Access to cutting-edge technologies, high levels of autonomy, and an international working environment
  4. Flexible working hours and a hybrid work setup from our Berlin and Porto offices
  5. Fitness (Urban Sports Club) and mental health (Likeminded) memberships
  6. Hot/cold drinks and snacks in the office, and All Hands meetings once a month with pizza

Skills
  • AWS
  • Communication Skills
  • Problem Solving
  • Python
  • Team Collaboration