Lead Site Reliability Engineer

at Kontakt.io
  • Remote - Poland

Remote

DevOps

Senior

Job description

Kontakt.io is building the platform that care operations run on.

We reduce waste, cut costs, and increase revenue by improving throughput, asset utilization, and staff productivity. Our platform uses AI, RTLS, and EHR data to enable self-learning agents to automate workflows, adapt in real time, and orchestrate all care delivery operations.

Easy to deploy and scale, it gives a clear picture of spaces, equipment, and people, eliminating inefficiencies and enhancing the patient experience. With measurable 10X ROI and more than 20 use cases, Kontakt.io is the go-to platform for better and faster care delivery operations.

We’re looking for a Lead Site Reliability Engineer to own the reliability, performance, and automation of our cloud-based, real-time platform. This role will focus on keeping our platform running smoothly 24/7: minimizing downtime and improving observability, incident response, and self-healing automation. You will lead and scale the SRE team to ensure our infrastructure stays ahead of demand, operates efficiently, and meets the needs of our growing healthcare customers.

Responsibilities

  • Ensure 99.99% uptime across our cloud platform, meeting strict SLAs for healthcare customers.
  • Leverage your software engineering expertise to write high-quality, maintainable code that improves system reliability and operational efficiency.
  • Design and implement self-healing, fault-tolerant systems to prevent failures before they happen.
  • Define SLIs, SLOs, and SLAs, ensuring proactive performance monitoring and incident resolution.
  • Architect and manage scalable cloud infrastructure (AWS) for massive real-time data processing.
  • Optimize containerized environments (Kubernetes, Docker) to support multi-region deployments.
  • Lead the adoption of infrastructure as code (Terraform) to fully automate infrastructure management.
  • Build and refine a world-class monitoring, alerting, and logging system using Prometheus, Grafana, OpenTelemetry, and Datadog.
  • Lead incident response and on-call operations, reducing mean time to detection (MTTD) and mean time to resolution (MTTR).
  • Conduct blameless postmortems and continuously improve system resilience.
  • Reduce manual intervention through automated deployment, scaling, and failover mechanisms.
  • Partner with Security & Compliance teams to ensure infrastructure meets HIPAA and SOC 2 standards.
  • Lead disaster recovery and business continuity planning to ensure critical healthcare services are always available.
  • Drive technical strategy and roadmap for scalability, monitoring, and reliability engineering.
  • Collaborate with Product, Engineering, and Infrastructure teams to align SRE initiatives with business priorities.

What You Bring

  • 10+ years of experience in Site Reliability Engineering or Cloud Infrastructure.
  • 2+ years of experience as a software engineer.
  • Proven success scaling high-traffic, mission-critical platforms in SaaS, IoT, or healthcare.
  • Deep expertise in cloud platforms (AWS), Kubernetes, and distributed systems.
  • Strong background in monitoring, logging, and observability with Prometheus, OpenTelemetry, or similar tools.
  • Hands-on experience with incident management, postmortems, and building resilient systems.
  • Deep knowledge of CI/CD automation, GitOps, and infrastructure as code (Terraform, etc.).
  • A mature leadership approach, with the ability to drive technical strategy while growing and mentoring a high-performance SRE team.
  • Strong understanding of network security, access management, and compliance frameworks (HIPAA, SOC 2).

Bonus Points If You Have:

  • Experience with healthcare IT, including EHR data, FHIR, and HL7 interoperability.
  • Expertise in real-time distributed systems, event-driven architectures, or large-scale data pipelines.
  • Prior experience leading on-call rotations and major incident management processes.

Why You’ll Love It Here

  • Own Mission-Critical Reliability – Keep hospitals and care facilities online with a healthcare platform built for 99.99% uptime.
  • Scale AI-Powered Infrastructure – Work on real-time automation and self-healing cloud systems that orchestrate care delivery.
  • Drive Big Impact in Healthcare – Help reduce waste, optimize resources, and improve patient care with technology that delivers 10X ROI.
  • Automation-First Culture – Minimize manual ops with cutting-edge automation, observability, and incident response strategies.
  • Join a High-Performing Team – Work with top engineers, AI experts, and healthcare innovators solving real-world challenges.

200 zł - 220 zł an hour

Ready to Build the Future of Healthcare?

Apply now and help scale the platform that care operations run on. 🚀

