Hardware Machine Learning Engineer

New York, NY or Chicago, IL, United States
Permanent
Job ID: 2434

Job Description


[Up to c. $500k Compensation Package | On-Site]


Role Overview

We’re partnering with a leading quantitative trading firm as it expands the use of machine learning within its ultra-low-latency trading infrastructure. The firm already operates one of the most advanced hardware environments in the industry and is now exploring how neural network inference can be deployed directly within its custom FPGA and ASIC ecosystem. This role sits at the intersection of machine learning research and hardware implementation. Working closely with quantitative researchers and hardware engineers, you will help shape models that can realistically operate within the strict latency constraints of the firm’s production systems.

The position emphasises hardware-aware ML modelling: understanding how machine learning architectures interact with hardware constraints, and collaborating early in the research lifecycle to design models that can ultimately run at production speeds...


Key Responsibilities

  • Work alongside ML researchers to design models that can operate within extremely low-latency trading environments
  • Evaluate new machine learning architectures and determine their feasibility for deployment on custom hardware infrastructure
  • Apply optimisation techniques such as quantisation, pruning, sparsity, and efficient matrix representations to improve inference performance
  • Collaborate with hardware engineers to translate model requirements into hardware-compatible implementations
  • Experiment with model structures and inference techniques that reduce computational overhead without sacrificing predictive performance
  • Analyse trade-offs between model complexity, latency, and hardware resource usage
  • Develop supporting tooling and experimentation frameworks in languages such as Python or C++
  • Contribute to internal research initiatives exploring new approaches to hardware-accelerated ML inference
  • Participate in early-stage research discussions to help guide model development toward hardware-feasible designs


What You’ll Bring…

  • Experience (likely 4-8 years) working with machine learning models in performance-constrained environments
  • Strong understanding of neural network architectures and inference optimisation techniques
  • Practical exposure to techniques such as quantisation, pruning, sparsity, or model compression
  • Familiarity with hardware acceleration concepts including FPGA or ASIC environments
  • Experience deploying ML models onto hardware accelerators such as FPGAs or custom silicon
  • Ability to reason about computational cost, latency constraints, and model performance trade-offs
  • Proficiency in Python and/or C++ for model experimentation and tooling development
  • Experience collaborating across multidisciplinary teams including researchers and engineers
  • (Preferred) Familiarity with ML compiler stacks or optimisation frameworks (e.g., MLIR, TVM, XLA)
  • (Preferred) Experience with matrix optimisation techniques or efficient linear algebra implementations
  • (Preferred) Advanced degree (MS/PhD) in Computer Science, Electrical Engineering, Machine Learning, Physics or related discipline


...

