Site Reliability Engineering

  • Remote - United Kingdom

Remote

DevOps

Director

Job description

Who are we?

IOG is a technology company focused on blockchain research and development. We are renowned for our scientific approach to blockchain development, emphasizing peer-reviewed research and formal methods to ensure security, scalability, and sustainability.

Our projects include the Cardano blockchain, as well as other products in the areas of decentralized finance (DeFi), governance, and identity management, aiming to advance the capabilities and adoption of blockchain and Web3 technology globally.

About Midnight:

IOG’s Midnight Tribe is a business technology provider and core contributor to the Midnight Network, a blockchain platform for developing decentralized applications that safeguard personal and commercial data. The Midnight Network is the first blockchain to offer programmable data isolation: it uses zero-knowledge (ZK) proofs to enable selective disclosure of what information is visible on-chain. It is designed to help developers implement necessary business policies, such as meeting regulatory requirements.

What the role involves:

As an experienced and visionary Head of Site Reliability Engineering (SRE), you will be responsible for leading the infrastructure and reliability strategy for Midnight, a regulatory-friendly blockchain focused on data protection, privacy, and freedom of expression.

In this senior leadership role, you will own the reliability, scalability, and performance of the Midnight platform. You will be responsible for building and leading a high-performing team of SREs, driving the SRE roadmap, and partnering closely with engineering, security, and product teams to deliver robust production systems.

You will be instrumental in setting the foundations of our infrastructure, designing systems that scale globally, and ensuring high availability, while embracing the unique challenges of a blockchain-based architecture. This is a hands-on leadership role combining technical depth, architectural vision, operational rigor, and people leadership.

  • Lead the SRE team, sharing expertise and best practices. Coach, mentor, and develop the SRE team.
  • Demonstrate leadership in driving initiatives that enhance service reliability, scalability, and overall performance.
  • Lead the entire lifecycle of services, including inception, design, deployment, operation, and refinement.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
  • Oversee the maintenance of live services by continuously measuring and monitoring factors like availability, latency, and overall system health.
  • Assist our teams in creating software that is both simple and flexible to configure and deploy.
  • Lead sustainable incident response practices, ensuring timely resolution with a focus on minimizing impact.
  • Collaborate with software engineering and testing teams to establish and maintain automated regression suite infrastructure and performance testing.
  • Sustainably scale systems through mechanisms like automation; evolve systems by pushing for changes that improve reliability and velocity.
  • Conduct blameless postmortems to analyze incidents, identify root causes, and implement preventive measures.

Who you are:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • At least 8 years in a Reliability Engineering, DevOps, or infrastructure-focused role.
  • Proven track record of leading and managing a high-performing SRE team.
  • Experience writing code in Python, Rust/C++ or JavaScript.
  • Proven experience in build and release engineering, Linux operational excellence, and automation.
  • Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
  • You work well both on your own and as part of a team.
  • You are kind and respectful of others’ opinions and you are open and act with integrity when engaging in academic or technical discussions.
  • Proven experience in capacity planning, performance monitoring, and optimization to ensure systems can handle current and future loads efficiently.
  • System engineering experience working with application servers, containers, and web servers.
  • Demonstrated ability to analyze incidents, identify root causes, and implement preventive measures to reduce the likelihood of recurring issues.
  • Strong understanding of cloud architecture, including the major cloud providers (AWS, GCP, etc.).
  • Experience working with Docker containers and related orchestration technologies (such as Kubernetes or ECS).
  • Knowledge of SRE principles (observability, SLOs, SLIs, logging, etc.).
  • Understanding of the underlying networking and security considerations when developing the architecture of our deployment environments.
  • Fluency in Git-based workflows and commit discipline.
  • Experience in providing mentorship and coaching to team members.

Are you an IOGer?

Do you find yourself questioning the status quo? Do you tinker with ideas and long to turn those ideas into solutions? Are you able to spark thoughtful debates, bringing out the inquisitiveness in others? Does the promise of continuously growing excite you? Then get ready to reimagine everything you thought wasn’t possible because that’s what it means to be an IOGer - we don’t set limits, we break them.

  • Remote work
  • Laptop reimbursement
  • New starter package to buy hardware essentials (headphones, monitor, etc.)
  • Learning & Development opportunities
  • Competitive PTO

At IOG, we value diversity and always treat all employees and job applicants based on merit, qualifications, competence, and talent. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
