DevOps Engineer - Platform Automation
Job Description
[Up to c. £225k Comp Package | Hybrid Working]
Role Overview
We’re partnering with a highly technical quantitative research organisation that operates at extreme scale across data, compute, and experimentation. Following a recent organisational change that moved parts of the HPC function into a standalone service, the core business is now significantly expanding its internal storage engineering capability to meet growing demand. This role sits within a storage platform group that is being rebuilt, and it is fundamentally an engineering position, not a traditional storage or HPC operations role. The focus is on automation, reliability, and performance, with storage treated as a platform that must scale cleanly and operate predictably. You’ll act as an experienced engineering presence within the team - designing automation, improving resilience, and serving as an escalation point for a separate operations function.
Crucially, this role is open to strong automation engineers who are not storage specialists. If you bring deep experience in Python, Go, Git-driven workflows, and platform automation - and want to learn storage at serious scale - this role is designed for that profile...
Key Responsibilities
- Engineer and extend automation across the storage estate, reducing manual intervention and improving consistency, reliability, and recovery
- Design and maintain tooling that supports provisioning, lifecycle management, performance tuning, and observability of large-scale storage platforms
- Act as an escalation point for the storage operations team, handling complex or non-standard issues while driving permanent fixes upstream
- Partner with platform, compute, and research teams to understand data movement patterns and performance constraints across HPC-style workloads
- Lead deep technical investigations into performance, capacity, or reliability issues, translating findings into durable engineering improvements
- Contribute to Git-driven infrastructure workflows, CI/CD pipelines, and automation frameworks used across the platform
- Support a light on-call rotation (roughly 1 in 6), with low but meaningful out-of-hours involvement focused on true engineering incidents
- Play an active role in major upcoming initiatives, including large-scale storage deployments and new internal data orchestration capabilities
What You’ll Bring…
- 4-9 years’ experience in infrastructure engineering, platform engineering, or DevOps roles within fast-moving production environments
- Strong programming ability in Python and/or Go, used to build automation, tooling, and operational frameworks
- Confidence working with Git-based workflows and modern automation tooling
- Solid Linux fundamentals, with the ability to debug performance and system-level issues under pressure
- Experience working alongside Kubernetes-based or containerised platforms
- An engineering mindset first: comfortable owning problems end-to-end and improving systems rather than working around them
- Ability to operate as a senior technical contributor - guiding others through influence rather than formal management
- Comfort working in a fast-paced environment where priorities shift and impact matters
- (Preferred) Exposure to large-scale storage platforms such as scale-out file systems, object storage, or vendor solutions like VAST or Isilon
- (Preferred) Familiarity with performance-sensitive or data-intensive workloads
...