Job Description
Company Description
ABOUT IQ-EQ
We’re a leading Investor Services group offering end-to-end services that cover the administration, accounting, reporting, regulatory and compliance needs of the investment sector worldwide.
We employ a global workforce of 5,800+ people across 25 jurisdictions and have assets under administration (AUA) exceeding US$750 billion. We work with 13 of the world’s top 15 private equity firms.
Our services are underpinned by a Group-wide commitment to ESG and best-in-class technology including a global data platform and innovative proprietary tools supported by in-house experts.
Above all, what makes us different is our people. Operating as trusted partners to our clients, we deliver intelligent solutions through a combination of technical expertise and strong relationships based on deep client understanding.
We’re driven by our Group purpose, to power people and possibilities.
Job Description
About the Role
We are looking for a DevOps Engineer to manage and support our production-grade Data Platform. Unlike standard cloud-native roles, this position is hands-on with Bare-Metal Kubernetes (K3s).
The role includes ownership of the full lifecycle of highly available K3s clusters, with responsibilities ranging from the OS layer (nftables, proxying) to the application layer. The on-premises infrastructure will be integrated with AWS and Azure to ensure secure, compliant, and high-performance operations for data engineering teams.
The Stack (What you will work with)
- Orchestration: K3s (High Availability, Embedded etcd).
- Networking: Flannel (VXLAN), MetalLB, HAProxy, Keepalived, Traefik Ingress.
- Storage: Longhorn (Distributed Block Storage).
- Identity & Security: Authentik, Azure AD (Entra ID), AWS IAM Roles Anywhere, Nginx WAF + ModSecurity.
- Observability: Prometheus, Headlamp, CloudWatch.
- Data Ops: Dagster, PostgreSQL, S3, Snowflake, Azure Storage.
Key Responsibilities
1. Bare-Metal Kubernetes Management
- Manage and troubleshoot HA K3s clusters running with embedded etcd.
- Troubleshoot complex CNI issues involving Flannel VXLAN (on custom, non-standard ports) and overlay networking.
- Maintain internal and external load balancing strategies using MetalLB (Layer 2) and HAProxy/Keepalived for high availability.
2. Storage & Data Platform Support
- Administer Longhorn distributed storage: manage volume replication, snapshots, and disaster recovery/backup to S3.
- Support the Dagster data orchestration platform, ensuring reliable execution of data pipelines and integration with PostgreSQL and AWS S3.
3. Hybrid Identity & Security
- Manage OIDC/SSO flows using Authentik integrated with Azure AD. You will handle user attribute mapping (Python-based) and secure application access.
- Maintain AWS IAM Roles Anywhere to facilitate secure, certificate-based access (X.509) from on-prem workloads to AWS resources without long-lived credentials.
- Configure and tune Web Application Firewalls (WAF) using Nginx and ModSecurity (OWASP Core Rule Set) to protect services while managing OIDC exclusions.
4. Observability, Automation & SRE
- Maintain the Prometheus stack and CloudWatch dashboards for cluster monitoring.
- Implement SRE practices: define Service Level Objectives (SLOs) for critical pipelines and lead blameless post-mortems for incidents.
- Focus on “Toil Reduction” by automating manual, repetitive operational tasks to improve efficiency.
- Implement GitOps-style deployments using Helm and Gitea (integrated with Azure DevOps).
- Manage infrastructure configuration, including secrets encryption at rest and corporate proxy integrations.
Qualifications
Technical Requirements (Must-Haves)
1. Kubernetes (Deep Dive / On-Prem)
- 4+ years of experience with Kubernetes, specifically with lightweight or bare-metal distributions (K3s, RKE2, or vanilla K8s).
- Experience troubleshooting etcd (embedded or external) and performing cluster backups/restores.
- Strong understanding of Ingress controllers (Traefik preferred) and TLS termination.
2. Linux & Networking Internals
- Good Linux networking skills: nftables/iptables, VXLAN overlays, and troubleshooting ephemeral port conflicts.
- Experience with HAProxy and Keepalived for virtual IP (VIP) management.
- Comfortable working behind corporate proxies (handling NO_PROXY configurations, custom certificate trust chains, and DNS resolution).
3. IAM & Hybrid Cloud
- Experience with OIDC/SAML authentication flows (Keycloak, Authentik, or similar).
- Hands-on experience with AWS IAM, specifically trusting external identities or bridging on-prem resources to AWS (S3, etc.).
- Experience managing Azure AD enterprise applications/registrations is a plus.
4. Storage Operations
- Experience with Container Attached Storage (CAS) solutions like Longhorn, Rook/Ceph, or OpenEBS. Understanding of PVCs, StorageClasses, and volume snapshots.
Preferred Qualifications (Nice-to-Haves)
- Data Ops: Familiarity with Dagster, Airflow, or similar data pipeline tools.
- SRE Mindset: Familiarity with Site Reliability Engineering principles (SLIs/SLOs, Error Budgets).
- Security: Experience implementing Kubernetes cluster security policies and OWASP ModSecurity rules, and tuning WAFs to reduce false positives.
- Scripting: Proficiency in Python (specifically for automation), Bash or Rust.
- Certifications: CKA (Certified Kubernetes Administrator) or CKS (Certified Kubernetes Security Specialist).
Additional Information
OUR COMMITMENT TO YOU AND THE ENVIRONMENT
As a forward-looking business, sustainability is integral to our strategy and operations. Our sustainability depends on us building and maintaining meaningful, long-term relationships with all our stakeholders – including our employees, clients, and local communities – while also reducing our impact on our natural environment.
There is always more we can, and should, do to improve – whether in relation to our people, our clients, our planet, or our governance. Our ongoing success as a business depends on our sustainability and agility in a changing and challenging global landscape. We’re committed to fostering an inclusive, equitable and diverse culture for our people, led by our Diversity, Equity, and Inclusion steering committee.
Our learning and development programmes and systems (including PowerU and MyCampus) enable us to invest in growing our employees’ careers, while our hybrid working approach supports our employees in achieving balance and flexibility while remaining connected to their colleagues. We want to empower our 5,500+ employees - from 94 nationalities, speaking 41 languages across 25 countries - to each achieve their potential. Through IQ-EQ Launchpad we support female managers launching their first fund, in an environment where only 15% of all private equity and venture capital firms are gender balanced.
We’re committed to growing long-term relationships with our clients and supporting them in achieving their objectives. We understand that our clients’ sustainability and success lead to our sustainability and success. We’re emotionally invested in our clients right from the beginning.