Job Description

Summary

What You will Do:

  1. Set high standards for Reliability at Alchemy
  2. Develop and own company-wide Reliability best practices such as SLO definition, incident management, postmortem reviews, launch readiness reviews, and change management
  3. Architect production infrastructure and tools that encourage and enforce high reliability
  4. Inspire the broader engineering organization to ensure Reliability is a first-class citizen in the products we build
  5. Collaborate and partner with engineering teams, and advise, review, and mentor them on Reliability topics such as high-reliability architecture, observability, and safe change management
  6. Improve critical infrastructure and systems that are used to operate infrastructure at scale (e.g., compute, networking, deployment, observability, code tooling/libraries)
  7. Develop and own best practices for managing production infrastructure: provisioning, application scaling, configuration management, capacity planning, monitoring, etc.
  8. Develop and own best practices for developer processes: CI/CD, dev and staging environments, etc.
  9. Provide input into long-term platform requirements and operational guidelines with a focus on reliability
  10. Continuously raise our standard of engineering excellence by implementing best practices for coding, testing, and deployment
  11. Build and maintain documentation around process and workflows

What We are Looking For:

  1. 6+ years of experience as an Infrastructure Engineer focused on Reliability (e.g., Site Reliability Engineer, Production Engineer, Platform Engineer)
  2. Experience leading and driving company-wide reliability efforts and engineering initiatives
  3. Experience with observability best practices and tooling like Prometheus, Grafana and Datadog
  4. Experience designing and operating large-scale, multi-region production systems
  5. Experience working with AWS or other cloud infrastructures
  6. Experience with container schedulers and runtimes such as Kubernetes and Docker
  7. Experience building deployment pipelines leveraging common CI/CD tools (e.g., Argo, Flux, GitOps)
  8. Experience with Infrastructure-as-Code (e.g., Terraform, Pulumi, Chef, Puppet)
  9. Strong communication and collaboration skills, as required by the cross-functional nature of this role
  10. (Preferred) Experience with running production services on bare-metal
  11. (Preferred) Experience with TypeScript and Python
  12. (Preferred) Excellent understanding of web applications and architecture

More on The Role

Alchemy is committed to offering competitive compensation, including base salary as well as equity. Additionally, Alchemy offers comprehensive medical, dental, and vision coverage, as well as other benefits such as 401k and unlimited flexible time off.

The base salary range for this position is estimated to be between $135,000 and $350,000 annually. Please note that this range reflects base salary only and does not include bonus, equity, or benefits. Your salary will be determined by various factors, including relevant experience, skill set, qualifications, and other business needs.

Skills
  • AWS
  • Communication Skills
  • Development
  • Python
  • Team Collaboration
  • TypeScript