AI Infrastructure Engineer - Platform Engineering
Job Description
[Up to c. $400k Comp Package | Hybrid Working - 3 Days in Office]
Role Overview
We’re partnering with a leading multi-strategy investment firm as it expands its platform engineering capability to better support rapidly growing AI and data-driven workloads. This role sits within a central infrastructure function responsible for compute, cloud, and platform services, but is aligned specifically to the needs of AI-focused engineering teams. Rather than owning core infrastructure directly, you will act as a technical bridge, working closely with developers to understand their requirements and translating those needs into scalable, production-ready platform solutions.
The position is fundamentally infrastructure-led, with a strong emphasis on cloud environments, automation, and distributed systems. While exposure to AI/ML workloads is beneficial, the core requirement is the ability to support demanding engineering teams by designing and enabling robust, high-performance infrastructure that can evolve alongside their needs...
Key Responsibilities
- Act as the primary interface between engineering teams and the central platform function, translating requirements into practical infrastructure solutions
- Enable and support compute environments for data-intensive and AI-related workloads across cloud and on-prem systems
- Collaborate with platform engineers to deliver scalable infrastructure across compute, storage, networking, and container orchestration layers
- Design and improve infrastructure patterns that support distributed workloads, high-performance compute, and evolving application requirements
- Define observability and monitoring approaches, ensuring strong visibility into system performance, usage, and reliability
- Support the integration and lifecycle management of specialised compute resources, including GPU-backed environments where required
- Contribute to automation and infrastructure-as-code practices to improve consistency, scalability, and operational efficiency
- Advise engineering teams on best practices across performance, reliability, and cost optimisation
- Help standardise workflows for environment provisioning, deployment, and reproducibility across teams
- Drive continuous improvements in how infrastructure supports high-demand, front-office-facing use cases
What You’ll Bring…
- 6-10 years’ experience in infrastructure engineering, platform engineering, DevOps, or SRE within complex environments
- Strong hands-on experience with cloud platforms (AWS preferred) and modern infrastructure patterns
- Solid expertise in Infrastructure as Code and automation tooling (e.g. Terraform or similar)
- Exposure to AI/ML workloads, distributed compute, or data-intensive systems
- Strong Linux and systems fundamentals, including networking and distributed systems concepts
- Experience working with Kubernetes and containerised workloads in production environments
- Proven ability to support demanding engineering teams and translate requirements into scalable infrastructure solutions
- Familiarity with observability, monitoring, and production reliability practices
- Strong communication skills and comfort working across engineering and business-facing teams
- (Preferred) Experience supporting GPU-enabled infrastructure or high-performance compute environments
- (Preferred) Familiarity with modern workload orchestration or model-serving frameworks
...