AI Infrastructure Engineer - Platform Engineering

United States, New York
Permanent
Job ID: 2445

Job Description


[Up to c. $400k Comp Package | Hybrid Working - 3 Days in Office]


Role Overview

We’re partnering with a leading multi-strategy investment firm as it expands its platform engineering capability to better support rapidly growing AI and data-driven workloads. This role sits within a central infrastructure function responsible for compute, cloud, and platform services, but is uniquely aligned to the needs of AI-focused engineering teams. Rather than owning core infrastructure directly, you will act as a technical bridge - working closely with developers to understand their requirements and ensuring those needs are translated into scalable, production-ready platform solutions.

The position is fundamentally infrastructure-led, with a strong emphasis on cloud environments, automation, and distributed systems. While exposure to AI/ML workloads is beneficial, the core requirement is the ability to support demanding engineering teams by designing and enabling robust, high-performance infrastructure that can evolve alongside their needs...


Key Responsibilities

  • Act as the primary interface between engineering teams and the central platform function, translating requirements into practical infrastructure solutions
  • Enable and support compute environments for data-intensive and AI-related workloads across cloud and on-prem systems
  • Collaborate with platform engineers to deliver scalable infrastructure across compute, storage, networking, and container orchestration layers
  • Design and improve infrastructure patterns that support distributed workloads, high-performance compute, and evolving application requirements
  • Define observability and monitoring approaches, ensuring strong visibility into system performance, usage, and reliability
  • Support the integration and lifecycle management of specialised compute resources, including GPU-backed environments where required
  • Contribute to automation and infrastructure-as-code practices to improve consistency, scalability, and operational efficiency
  • Advise engineering teams on best practices across performance, reliability, and cost optimisation
  • Help standardise workflows for environment provisioning, deployment, and reproducibility across teams
  • Drive continuous improvements in how infrastructure supports high-demand, front-office-facing use cases


What You’ll Bring…

  • 6-10 years’ experience in infrastructure engineering, platform engineering, DevOps, or SRE within complex environments
  • Strong hands-on experience with cloud platforms (AWS preferred) and modern infrastructure patterns
  • Solid expertise in Infrastructure as Code and automation tooling (e.g. Terraform or similar)
  • Exposure to AI/ML workloads, distributed compute, or data-intensive systems
  • Strong Linux and systems fundamentals, including networking and distributed systems concepts
  • Experience working with Kubernetes and containerised workloads in production environments
  • Proven ability to support demanding engineering teams and translate requirements into scalable infrastructure solutions
  • Familiarity with observability, monitoring, and production reliability practices
  • Strong communication skills and comfort working across engineering and business-facing teams
  • (Preferred) Experience supporting GPU-enabled infrastructure or high-performance compute environments
  • (Preferred) Familiarity with modern workload orchestration or model-serving frameworks


...

