Site Reliability Engineer

at Zealogics Inc
  • Remote - India

Remote

DevOps

Mid-level

Job description

Key Responsibilities:

  • Lead investigation and resolution of critical, recurring, or high-impact incidents across Azure and Microsoft 365 automation workflows.

  • Deep-dive into PowerShell, Bicep, and YAML scripts to identify logic errors, misconfigurations, or scalability limitations within automated provisioning workflows.

  • Debug and optimize .NET (C#) components within Azure Functions or related application layers used in workflow orchestration.

  • Analyze usage patterns and telemetry data from Azure Monitor, Application Insights, and Log Analytics to identify systemic issues or opportunities for automation enhancement.

  • Implement fixes and design improvements to automation logic that reduce manual intervention and improve workflow reliability (e.g., auto-remediation scripts, retry logic).

  • Own and evolve the automation framework for Teams and SharePoint Online (SPO) lifecycle operations, including create/delete, external sharing restrictions, and role/ownership changes.

  • Collaborate with product owners and architects to introduce new automation use cases or extend existing workflows.

  • Conduct post-incident reviews (PIRs) for high-severity incidents, drive root cause analysis (RCA), and implement corrective actions.

  • Mentor L1 and L2 engineers, conduct knowledge-sharing sessions, and support onboarding of new team members.

  • Stay up to date with changes in Azure, Microsoft 365 APIs, and automation tooling (PowerShell modules, Bicep schema updates, etc.).

  • Provide guidance on architecture and best practices for automation reliability.

Required Skills & Experience:

  • 12+ years of experience in cloud platform engineering, DevOps, or site reliability engineering (SRE) roles with a focus on automation and operational excellence.

  • Proficiency in PowerShell scripting, including writing reusable modules, automation logic, and error handling for production workloads.

  • Extensive experience with Infrastructure as Code using Bicep, including authoring, debugging, and deploying templates for complex Azure resources.

  • Strong understanding of CI/CD processes and YAML pipelines, with hands-on experience in automating build/release workflows in Azure DevOps.

  • Proficient in .NET (C#), especially for debugging Azure Functions or working on backend components integrated into M365 automation flows.

  • In-depth knowledge of the Microsoft 365 platform, including API usage, Teams & SharePoint Online provisioning, governance, and permissions management.

  • Proven ability to troubleshoot and optimize Azure-native services such as API Management, Azure Functions, Storage, Service Bus, Key Vault, and Container Apps.

  • Skilled in telemetry and observability — leveraging Azure Monitor, Log Analytics, Kusto queries, and custom logging to proactively identify issues.

  • Experience conducting root cause analysis, post-incident reviews, and implementing system-wide improvements to reduce incident frequency and MTTR.

  • Experience in mentoring support engineers, contributing to runbook creation, and improving team capability over time.

  • Strong analytical, documentation, collaboration, and stakeholder communication skills.
