Job description
SEON is the leading fraud prevention system of record, catching fraud before it happens at any point across the customer journey. Trusted by over 5,000 global companies, we combine your company’s data with our proprietary real-time signals to deliver actionable fraud insights tailored to your business outcomes. We deliver the fastest time to value in the market through a single API call, enabling quick and seamless onboarding and integration. By analyzing billions of transactions, we’ve prevented $200 billion in fraudulent activities, showcasing why the world’s most innovative companies choose SEON.
The Site Reliability Engineering (SRE) team at SEON ensures the reliability, scalability, and efficiency of our products and services. The SRE team provides Incident Response, Reliability Engineering consulting, and limited embedded SRE engagements.
We seek a highly experienced and motivated SRE Manager to lead a team of Site Reliability Engineers. You will play a crucial role in maintaining the reliability and efficiency of our products and services while coordinating with cross-functional teams across various geographical regions. You will have a proven track record of leading top-performing teams in complex, fast-paced environments and will excel at organizing and motivating a team amidst rapid growth and change.
This role offers flexibility. It can be based in Budapest with a hybrid schedule or anywhere in the European Union with a remote setup, including occasional travel to our other offices.
WHAT YOU’LL DO:
- Lead and grow a high-performing SRE team responsible for the reliability, performance, and scalability of production systems.
- Own the incident management process, postmortems, and root cause analysis to improve system resilience.
- Drive implementation of SLAs, SLOs, and error budgets across services to align operational goals with business objectives.
- Champion the use of automation to reduce manual work and improve deployment and recovery times.
- Collaborate with software engineering and platform engineering teams to ensure systems are designed for reliability and operational efficiency.
- Oversee system monitoring, alerting, and observability efforts using tools like Prometheus, Grafana, Datadog, or similar.
- Manage on-call rotations and ensure that documentation, runbooks, and playbooks are maintained.
- Identify and drive continuous improvement in system architecture, capacity planning, and deployment strategies.
- Ensure compliance with security, privacy, and regulatory requirements within the infrastructure.
- Provide mentorship, performance reviews, and career development opportunities for SRE team members.
- Communicate effectively with stakeholders at all levels, providing updates on team performance, project status, and incident resolutions.
- Advocate for the SRE team within the broader organization, representing its needs and concerns.
WHAT YOU’LL BRING:
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience)
- Proven success in leading high-performing SRE or DevOps teams in a large-scale, fast-paced environment
- Extensive experience running high-availability web services at large scale, with comprehensive knowledge of cloud-native architectures and advanced networking concepts
- Strategic vision to balance immediate operational needs with long-term reliability and scalability objectives
- Outstanding communication and interpersonal skills, with the ability to build strong relationships with team members and stakeholders
- Strong technical background with hands-on experience in cloud computing, system architecture, automation, and monitoring
- Excellent problem-solving skills with a focus on root cause analysis and proactive improvements
- Exceptional organizational skills, with the ability to manage multiple priorities and projects simultaneously
- Experience with tools and technologies such as AWS, Kubernetes, Terraform, Prometheus, Grafana, and Jenkins
NICE TO HAVE:
- Cloud Architect Certification in one of the public clouds (AWS, GCP, Azure)
- Good knowledge of security controls for SOC 2 and ISO certifications
WHAT’S NEXT:
Does that sound good to you? Great, we can’t wait to hear from you! Would you like to learn more about what it’s like to work at SEON first?
👉 https://seon.io/careers/