Site Reliability Engineer at Alchemy

Job description

Our mission is to bring web3 to a billion people, by providing builders with the tools they need to build exceptional onchain products. Alchemy is the only complete developer platform that offers the powerful APIs, SDKs, and tools necessary to build and scale onchain apps and rollups.

Our infrastructure powers 70% of the top web3 teams, 90%+ of web2 companies building in web3, and 100+ million end users. Our customers include top web3 brands like Polymarket, OpenSea, Circle, and WorldCoin, as well as major global brands like Shopify and Adobe.

The Alchemy team draws from decades of deep expertise in massively scalable infrastructure, AI, and blockchain from leadership roles at leading companies and universities like Google, Microsoft, Facebook, Stanford, and MIT.

We’re backed by the world’s leading VCs and institutions, including: Lightspeed, Silver Lake, a16z, Coatue, Pantera, Addition, Stanford University, Coinbase, and Charles Schwab, among others.

The Role

Site Reliability Engineers excel at converting manual operational tasks into automated processes while building and maintaining tools and infrastructure. As an SRE, you should tackle problems methodically, taking into account system scalability, high availability, latency, and resilience. Drawing on strong experience in operations, networking, infrastructure, software development, observability, and troubleshooting, the SRE role is one of the most versatile anyone can grow into.

Responsibilities

Design, build, and refactor major software components that improve the availability, resilience, performance and efficiency of our system.
Participate in our on-call rotation and respond to infrastructure incidents in accordance with our policy.
Proactively address bugs and bottlenecks in our infrastructure.
Define and choose the best SLIs and SLOs for our system's needs.
Choose the best tools for different problems and adapt to our ever-changing specifications and growth.
Improve our Incident Management process by reducing and fixing noisy alerts and lowering MTTD and MTTR, and support other team members in doing so.
Identify and address design bottlenecks in our infrastructure.
Mentor new hires and onboard them to our tools and infrastructure.
Address code complexity and efficiency issues while continually fixing software bugs.
Support and guide other team members with code-related problems, and participate in and provide effective code reviews.

What We’re Looking For

Experience writing efficient code in one or more programming languages (e.g. Python, Golang, Java, Rust).
Experience developing software applications and tools from scratch that can be expanded and used by other team members by offering a clear structure, reusable code patterns and guidance.
Experience designing and managing the lifecycle of complex systems while accounting for factors such as cost, system performance, scalability, resilience, and disaster recovery.
Expertise in all aspects of operating Linux-based systems, with a focus on troubleshooting, configuration, and monitoring.
Experience managing large-scale infrastructure running on bare metal, public and private clouds (e.g. AWS, GCP, Azure), and container-based platforms (e.g. Kubernetes, OpenShift, Docker).
In-depth knowledge of protocols across the stack, such as HTTP, DNS, DHCP, and routing protocols.
Experience using programming languages and automation tools to reduce toil and automate repetitive tasks.
Experience with Infrastructure as Code (IaC) tools such as Terraform or Pulumi, and with configuration management tools (e.g. Ansible, Puppet, Chef).
Experience with one or more CI/CD solutions (e.g. Jenkins, ArgoCD, GitLab pipelines, Spinnaker, Harness) is a must.
Experience implementing monitoring and logging solutions for infrastructure and applications.
Must have experience with monitoring and logging tools such as Prometheus, Thanos, Splunk, Grafana, Graphite, Loki, etc.
Past experience leading a team is a big plus.
Strong communication skills and the ability to express ideas to other team members effectively.

Perks

Attractive salary package
Opportunity to work with the latest cloud and blockchain technologies
Fully remote work or hybrid depending on candidate preferences
Token allocation similar to equity packages in traditional companies
Growth budget, to be spent at the candidate’s discretion
Equipment stipend
Flexible time away
Private Medical Insurance
Start-up environment: internal off-site hackathons, access to company-rented hacker house during summer
Crypto market investment opportunities and guidance
