Job Description
About Onedome
We’re a UK fintech building high-throughput digital infrastructure for the mortgage and property space. We recently acquired Trussle, and we’re taking our platform to the next level: fully automated, self-healing, observable, and built to handle real traffic spikes (we have a TV launch coming up).
The role
You’ll be our senior platform engineer. This is not a traditional ops role. You’ll treat infrastructure as a product, build the internal developer platform our engineers deploy through, and bring SRE discipline into the application tier when it’s needed.
You won’t have a platform team to lean on. You’ll be building that capability. We know exactly what we need.
What you’ll do
- Own 5 AWS accounts across the organisation (QA, Prod, Infra, plus two others)
- Architect and maintain infrastructure as code with Terraform. Replace the click-ops that’s still around
- Stand up CI/CD pipelines engineers actually want to use. Blue/green and canary where it makes sense
- Run releases reliably and reduce key-person risk on the deploy process
- Set up monitoring, alerting, and incident response so we catch issues before customers do. Define SLIs and SLOs that map to real user journeys
- Lead incident response as the primary infra responder. Run blameless post-mortems
- Patch and harden the platform. We’re a fintech, so this matters: IAM least privilege, secrets management, vulnerability scanning, MFA enforcement
- Lead infrastructure work for the TV launch: traffic spikes, autoscaling, capacity planning, runbooks
- Partner with backend engineers when production issues cross into the app tier. JVM tuning, connection pools, async patterns, memory leaks
- Architect multi-region DR. Drive the platform toward self-healing
- Optimise cost across DynamoDB, Lambda, and the wider AWS footprint
- Set architectural guardrails with tech leads and SWEs. Mentor engineers as the platform capability grows
You must have
- 6+ years of hands-on AWS in production (VPC, IAM, EC2/ECS/EKS, Lambda, S3, CloudFront, Route 53, RDS, DynamoDB, SQS)
- Production Terraform (Terragrunt a plus). Multi-environment state management
- Production Kubernetes and Docker. EKS specifically
- Strong CI/CD experience: GitHub Actions, GitLab, Jenkins, or ArgoCD. GitOps practices
- Strong infrastructure security: IAM least privilege, secrets, patching, vulnerability scanning, MFA enforcement
- Observability from scratch: CloudWatch, Datadog, Grafana, Prometheus, OpenTelemetry, or ELK
- Solid backend competency in Java (Spring Boot), Python, or Node. Understanding of JVM, concurrency, async systems
- Comfort being the senior infrastructure person. Self-directed, opinionated, disciplined about documentation
- English at B2/C1 level (CEFR)
Nice to have
- Fintech or other regulated industry experience
- DynamoDB at scale (we run many tables with PITR and S3 exports, Lambda-driven daily incrementals)
- AWS cost optimisation track record
- Past experience as the first infrastructure hire on a small team
- Building internal developer platforms or self-service tooling
- AWS DevOps Pro, Solutions Architect Pro, or CKA
- Service mesh experience (Istio, Linkerd) and DevSecOps practices
- Firebase or GCP exposure (we have a small footprint there)
What you won’t get from us
- A platform team to lean on. You’ll be building that capability
- Click-ops everywhere. You’ll be replacing what exists with IaC
- Ambiguity about the work. We know exactly what we need
Working setup
- UK working hours
- Fully remote across the EU
- Tooling: Microsoft 365, Teams, Planner, Nuclino, GitHub
- aws-vault with MFA enforcement on all accounts
- Sustainable on-call rotation