Job description
Basic Function
The Senior Site Reliability Engineer (SRE) is a developer with a strong operations mindset, responsible for ensuring the reliability, availability, and scalability of Lumin Digital's applications. This role focuses on eliminating manual tasks through automation, maintaining Service Level Objectives (SLO), and closely collaborating with Software Engineers (SWE) to implement and maintain best practices for large-scale systems. The ideal candidate thrives in solving complex problems, automating processes, and creating resilient systems.
Essential Functions, Responsibilities, Experience:
Design, implement, and manage CI/CD pipelines to improve deployment efficiency.
Monitor and resolve issues in all environments, ensuring SLO and uptime targets are consistently met.
Collaborate with Software Engineers to address SRE concerns during feature design and deployment.
Participate in capacity planning and demand forecasting to proactively address performance bottlenecks and scalability needs.
Perform change management to maintain system stability and minimize disruptions.
Generate uptime and SLO reports for internal review and leadership visibility.
Engage in SRE scrum team activities to drive agile development processes.
Ensure security best practices are followed, safeguarding data integrity and system resilience.
Perform other duties as assigned.
Where the Role Will Grow:
30 Days: Understand the architecture, monitoring tools, and CI/CD pipelines currently in use. Begin participating in SRE team activities and resolving basic operational issues.
90 Days: Take ownership of monitoring and alerting systems, improve incident response processes, and contribute to SLO reporting.
1 Year: Deliver measurable improvements in system reliability, scaling capabilities, and automation processes. Take a leadership role in SRE best practices and mentor junior team members.
Knowledge, Skills, & Abilities:
Exceptional full-stack troubleshooting skills, with a focus on resolving system-level issues.
Expertise in at least one configuration management system (e.g., Chef, Ansible, Puppet).
Strong understanding of networking protocols and components such as HTTP, DNS, TCP/IP, and Load Balancing.
Experience with cloud hosting platforms, with AWS preferred (Google Cloud and Azure also valued).
Hands-on experience with Terraform, Kubernetes, and containerization technologies like Docker.
Solid understanding of CI/CD workflows and the ability to architect robust pipelines.
Familiarity with monitoring and alerting strategies, including self-healing and escalation processes.
Commitment to improving on-call experiences by creating resilient and automated systems.
Strong problem-solving skills with a focus on automation and operational efficiency.
Security mindset with a focus on protecting data integrity and resilience.
Excellent written and verbal communication skills.
Proven ability to work within an agile scrum team.
Ability to participate in a 24x7 on-call rotation.
2+ years of experience as a software engineer, with C#, Angular, or JavaScript preferred.
AWS certifications such as SysOps or Solutions Architect (preferred but not essential).
Experience with Amazon RDS, EKS, and CloudWatch.
Education:
Bachelor's degree or higher in Computer Science, or equivalent experience required.
Travel:
Minimal, generally 12 days or fewer per year (about two team get-togethers a year).
$170,000 - $200,000 a year
LIFE AT LUMIN DIGITAL
Lumin Digital is a trailblazer in digital banking solutions, driven by a unique approach to technology, service, and people. We empower credit unions and banks by creating cutting-edge digital experiences that continuously serve, engage, and grow their membership base. Lumin is 100% cloud-native, purpose-built to unlock the full advantages of the cloud for financial institutions and their users.
At Lumin, we thrive on curiosity and innovation. Our culture fosters trust in our expertise and decisions, respect for diverse perspectives and talents, and boldness in pursuing innovative paths. These values guide us, shaping a workplace where collaboration thrives, ideas flourish, and new possibilities are discovered. Focused on continuous improvement and innovation, we encourage our team to explore, experiment, and put new ideas into action, challenging the usual way of doing things.
All qualified applicants, including those with arrest or conviction records, will be considered for employment. Any conditional offer will include a notice regarding the review of the candidate's criminal history as part of the hiring process.
For more information, visit lumindigital.com.