Job Description
Summary
Our Reliability Engineering team highly values people with intellectual curiosity and openness. We collaborate across the organization, helping our engineers think big and take risks while building a culture of diversity, positive energy and blameless truth-seeking. We encourage self-starting on high-impact projects within the context of strong support and mentorship.
- Improve observability, reliability and availability by defining and measuring key metrics
- Build automation and improve systems to eliminate toil and operations work.
- Collaborate with our core infrastructure team to performance-tune and optimize our cloud deployments (think Docker, Terraform, Kubernetes, EC2, etc.).
- Collaborate with Coinbase product teams to reduce service disruptions and automate incident response
- Proactively find and analyze reliability problems across our business units and stack, then design and implement software to create step-function improvements.
- Educate and mentor the engineering team, and hold it accountable, to improve the reliability of our systems and make reliability a core value of the Coinbase engineering culture.
- Write high quality, well tested code to meet the needs of your customers.
- Debug extremely difficult technical problems, and make systems and products work better and easier to deploy, own, operate and diagnose.
- Review all feature designs within your product area and across the company for cross-cutting projects.
- Be an owner of the security, safety, scale, operational integrity, and architectural clarity of these designs.
- Build pipelines to integrate with 3rd party vendors
- Participate in an on-call support rotation to provide timely troubleshooting and resolution of urgent issues.
What we look for in you (i.e., job requirements):
- You have at least 7 years of experience in software engineering.
- You’ve designed, built, scaled and maintained production services, and know how to compose a service oriented architecture.
- You write high quality, well tested code to meet the needs of your customers.
- You’re passionate about building an open financial system that brings the world together.
- You possess strong technical skills for system design and coding
- Excellent written and verbal communication skills, and a bias toward open, transparent cultural practices.
- Strong skills around observability, debugging and performance tuning
- Strong communication skills and ability to explain technical concepts clearly and simply
- Strong interpersonal skills working with engineers from junior to principal levels
- Demonstrated critical thinking under pressure
- A willingness to dive into understanding, debugging, and improving any layer of the stack
- This role requires on-call availability to ensure swift resolution of issues outside regular business hours.
Nice to haves:
- Experience designing and building reliable systems capable of handling high throughput and low latency
- Experience with observability and monitoring systems such as Kibana, Datadog, etc.
- Familiarity with working in rapid growth environments
- Experience in Ruby, Go, and Terraform
- Experience with AWS, GCP, Azure, or another cloud environment
- Experience designing and building reliable systems
- Experience working in a highly regulated environment
- Experience writing company-facing blog posts and training materials
- Crypto-forward experience, including familiarity with onchain activity such as interacting with Ethereum addresses, using ENS, and engaging with dApps or blockchain-based services
Skills
- AWS
- Communications Skills
- Cryptocurrency
- Development
- Software Engineering