Observability Engineer

Job description

Location: Fully remote within the US

Long-term contract (2+ years)

Must be a US Citizen

Our client’s Enterprise Monitoring team is looking for a senior-level Observability Engineer. The team is responsible for enterprise infrastructure, application, and network monitoring across on-prem, hybrid, and various cloud environments. The selected candidate will join a team of skilled engineers with a broad background in enterprise monitoring and observability.

Your Impact:

As an Observability Engineer, you will focus on maintaining the reliability, scalability, and availability of our log management solution and of our metrics and observability platform, which relies heavily on automation (Terraform, Ansible, and scripts). The role also requires maintaining the performance KPIs of our solutions and defining their SLOs.

Responsibilities:

  • Maintain and deploy monitoring and alerting.

  • Design, configure, and maintain a large-scale log aggregation solution.

  • Set up and manage ingestion pipelines and data transformations.

  • Have the mindset of “automate any task”.

  • Monitoring and alerting: Build and maintain robust monitoring systems using tools such as ELK, Dynatrace, Prometheus, OpenTelemetry (OTel), and Grafana to detect potential issues early and trigger alerts for a timely response.

  • Maintain associated documentation as it applies to our audit and certification requirements.

  • Participate in troubleshooting, capacity planning, and performance analysis activities.

  • Research new monitoring requirements and, in many cases, write code to meet them.

  • Apply strong expertise in setting up monitoring policies, rules, and templates, and in writing scripts to meet monitoring requirements.

What you need to succeed:

  • BS/MS in CS/engineering or equivalent, or 5+ years of experience.
  • 3+ years of experience working directly with monitoring tools as an admin, SME, or architect, preferably with Dynatrace and/or ELK.
  • Hands-on experience designing data pipelines using Filebeat, Logstash, and/or Fluent Bit/Fluentd.
  • Expert-level knowledge of either Dynatrace (managed, cloud, and offline, with the full scope of best practices and setup as it relates to ActiveGate, cloud, on-prem, and custom setups with workflows) or Elastic, on-prem and cloud, with best practices around the platform.
  • Fluent in writing scripts in languages such as Python and either Bash or PowerShell to automate tasks.
  • Experience with Terraform and Ansible, including syntax, best practices, and managing complex configurations across multiple commercial and government clouds to build and manage infrastructure and applications.
  • Very good working knowledge of Linux.
  • Highly self-motivated and self-directed.
  • Good analytical, problem-solving, and troubleshooting abilities.

Helpful Skills:

  • Knowledge of SNMP, tcpdump, and tracing.
  • Knowledge of AIOps platforms.
  • Other scripting experience (JavaScript, Java, PowerShell, or others).