Systems Engineer - HPC, GPU & Agentic AI Infrastructure

United States, New York
Permanent
Job ID: 2499

Job Description


[Up to c. $700k Comp Package | Hybrid Working]


Role Overview

We’re representing a world-leading computational research organisation operating at the intersection of supercomputing, machine learning, and scientific discovery, now expanding its systems engineering team in New York. The role supports large-scale Linux, HPC, GPU, storage, networking, Kubernetes, and cloud environments used by researchers and AI-driven systems. A key focus will be maintaining on-premises compute platforms while designing secure cloud environments that isolate agentic workloads from sensitive internal data.

The organisation is open to candidates ranging from strong systems engineers to senior or lead-level hires. What matters most is deep Linux expertise, experience running infrastructure at scale, technical curiosity, and the ability to work across complex systems without being narrowly siloed...


Key Responsibilities

  • Engineer and support large-scale Linux-based compute environments used for scientific, AI, and research workloads
  • Help operate and improve on-premises HPC and GPU cluster infrastructure, including compute, storage, networking, and scheduling layers
  • Design and maintain Kubernetes-backed environments for agentic AI workflows and distributed applications
  • Contribute to secure cloud infrastructure patterns that allow AI agents and research tooling to run safely without unnecessary access to sensitive internal systems
  • Support high-performance GPU platforms, large CPU clusters, and storage environments operating at petabyte scale
  • Troubleshoot complex issues across Linux, networking, filesystems, distributed applications, and compute workloads
  • Build automation and tooling to improve provisioning, reliability, observability, and user experience across infrastructure platforms
  • Work closely with researchers, engineers, and security teams to make advanced compute resources accessible, secure, and reliable
  • Contribute to architecture decisions around cloud, Kubernetes, HPC, networking, and workload isolation
  • Continuously improve platform performance, scalability, and operational resilience as infrastructure demand increases


What You’ll Bring…

  • 4-12 years’ experience in systems engineering, Linux infrastructure, HPC, cloud infrastructure, or large-scale platform environments
  • Strong Linux fundamentals, including practical understanding of processes, networking, filesystems, permissions, performance, and troubleshooting
  • Experience administering or engineering large Linux environments, ideally involving compute clusters or research infrastructure
  • Experience with GPU clusters, HPC schedulers, RDMA networking, large-scale storage, or low-level systems performance
  • Strong scripting or programming ability, ideally with Python, for automation and infrastructure tooling
  • Hands-on exposure to Kubernetes, particularly for running distributed workloads or platform services
  • Experience working with cloud infrastructure, especially where security, isolation, or scalable compute environments are important
  • Understanding of high-performance or distributed systems, including compute, storage, networking, and workload orchestration
  • Ability to diagnose unfamiliar technical problems across multiple layers of the stack
  • Clear communication skills, with the ability to work effectively with researchers, engineers, infrastructure teams, and security stakeholders
  • Strong intellectual curiosity and willingness to learn new systems, technologies, and scientific computing environments
  • (Preferred) Exposure to secure workload isolation, agentic AI infrastructure, or sandboxed compute environments
  • (Preferred) Experience acting as a technical lead or senior engineer within a complex infrastructure team


...

