Senior Customer Reliability Engineer Infrastructure

Job Description

Astronomer empowers data teams to bring mission-critical software, analytics, and AI to life and is the company behind Astro, the industry-leading unified DataOps platform powered by Apache Airflow®. Astro accelerates building reliable data products that unlock insights, unleash AI value, and power data-driven applications. Trusted by more than 800 of the world’s leading enterprises, Astronomer lets businesses do more with their data. To learn more, visit www.astronomer.io.

About this role:

The Astronomer Customer Reliability Engineering (CRE) team is responsible for the success of our customers’ usage of our managed Airflow service.

The CRE team is responsible for operating, monitoring, and maintaining the platform to ensure availability, predictability, and reliable operations.

As an infrastructure specialist within the team, you will focus on the reliability of the underlying cloud infrastructure and Kubernetes clusters. This entails responding to incidents, whether raised by a customer or by our monitoring system, and then taking further steps to ensure problems are permanently resolved or monitored. As owners of the observability platform, the CRE team has unlimited potential to improve the reliability of the product and deliver the best possible outcome for our customers.

This role is directly customer-facing and gives exposure to very diverse problems and requirements. CRE team members get the opportunity to work with customers from a variety of industries, across different cloud providers, and all with different expectations. Your contributions will directly impact customers’ success with Astronomer products, and you will be able to help make meaningful improvements to the customer experience.

Hybrid Work Model: For this role, you will embrace a flexible hybrid work model, working at least 3 days per week at our office in Hyderabad while delivering a seamless experience that is digitally and physically connected.

What you get to do:

  • Provide solutions to customers to make them successful using our products.

  • Troubleshoot customer environments and actively triage issues with customers.

  • Provide feedback to the product development teams on customer needs and pain points.

  • Build out our monitoring and alerting systems.

  • Build and maintain automation to ensure daily operational tasks are handled as efficiently as possible.

  • Help direct the architecture of the products and contribute where possible.

  • Own the customer experience, working directly with customers to prioritize and solve issues, meet SLAs, and provide “white glove” guidance on the path to production.

  • Participate remotely within a fully distributed team.

  • Enhance and enrich customer documentation.

  • Work on a modern, sophisticated, cloud-native product that customers use to connect to dozens of other systems.

  • Help maintain 24x7 coverage through a specified 6-hour pager period during your work day.

  • Participate in a paid on-call rotation for weekend coverage.

What you bring to the role:

  • 5+ years of experience, preferably with large, complex cloud infrastructures operating at scale

  • 3+ years of experience with Kubernetes

  • Experience managing a production distributed system on at least one major cloud provider (one or all: AWS, GCP, Azure)

  • Strong networking experience with one of the major cloud providers

  • Strong Linux experience

  • Knowledge of how to operate distributed systems and monitor them for issues

  • Experience with observability tools

  • Previous experience handling customer issues (internal and external)

  • Strong communication skills

  • DevOps or CI/CD experience

  • Python scripting

  • Good troubleshooting skills

Bonus points if you have:

  • Experience as a Site Reliability Engineer

  • Worked with Kubernetes Custom Resources

  • Depth of knowledge with Azure

  • Airflow/Big Data Orchestration experience

  • IaC experience

At Astronomer, we value diversity. We are an equal opportunity employer: we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
