Job description
Important Information
Location: Brazil
Job Mode: Full-time
Work Mode: Work from home
Job Summary
As a Senior Site Reliability Engineer (SRE) at Encora, you will lead efforts to ensure the reliability, availability, and performance of applications and platforms. The role covers oversight of production operations as well as incident management, root cause analysis, and the implementation of preventative measures. You will collaborate with development teams to enhance application performance and reliability.
For our clients, you’ll work with a global team responsible for end-customer-facing, business-critical applications. You will partner with Infrastructure, Platform Engineering, and Development teams to identify issues and improve system reliability.
Responsibilities and Duties
- Coach and mentor fellow team members;
- Use Splunk and other observability tools to monitor and troubleshoot application issues;
- Capture metrics and create dashboards using Splunk and other tools;
- Work with a global team to provide 24/7 support for production applications running on AWS and Mulesoft;
- Perform incident management and root cause analysis, and implement preventative measures;
- Work with team members and clients to investigate and escalate incidents;
- Respond proactively to indications of issues or complaints by customers;
- Apply industry best practices throughout our processes.
Essential Skills
- Experience in Tier 2 or Tier 3 product support in one of the following roles: business/systems analysis, technology/development, data/reporting, project management;
- Ability to analyze logs and code to fix Tier 2 support issues;
- Experience as a Site Reliability Engineer (SRE), preferably with a focus on applications instead of platforms;
- Extensive experience with observability and monitoring, especially with OpenTelemetry, Splunk, AppDynamics, and Datadog;
- Experience with AWS and/or Kubernetes;
- Background in DevOps practices;
- Scripting experience with Python;
- Experience with L1 and L2 support, incident management, ITIL, and writing documentation;
- Experience with disaster recovery, business continuity planning, creating ServiceNow dashboards, Linux, and shell scripting;
- Deep background working with Agile methodologies;
- Knowledge of cloud-native application architecture design patterns;
- Experience using Postman or similar for making API calls and testing;
- Experience with Mulesoft.
About Encora
Encora is the preferred digital engineering and modernization partner of some of the world’s leading enterprises and digital native companies. With over 9,000 experts in 47+ offices and innovation labs worldwide, Encora’s technology practices include Product Engineering & Development, Cloud Services, Quality Engineering, DevSecOps, Data & Analytics, Digital Experience, Cybersecurity, and AI & LLM Engineering.
At Encora, we hire professionals based solely on their skills and qualifications, and do not discriminate based on age, disability, religion, gender, sexual orientation, socioeconomic status, or nationality.