Lead & Senior Site Reliability Engineers - Enterprise Productivity Platform
Job Description
[Up to c. $500k Comp Package for Lead SRE | Up to c. $400k Comp Package for Senior SRE | Hybrid Working]
Are you ready to shape the future of enterprise-scale infrastructure reliability at one of the world’s most technically sophisticated algorithmic trading firms? Our client is a global leader in ultra-low-latency systems and quantitative research, where speed, precision, and scale underpin everything they do. They're now building out their Site Reliability Engineering (SRE) function within Enterprise Technology, and are hiring both a Lead and Senior SRE to help define and drive this capability forward. These roles will help establish engineering best practices across a broad tech stack and scale operational excellence in a mission-critical environment...
...
Lead Site Reliability Engineer
As the Lead SRE, you'll be responsible for setting direction and strategy for the newly formed SRE team, guiding engineers, and taking ownership of the availability, resilience, and performance of the enterprise infrastructure stack. You'll bring a hands-on mindset to operational challenges, improve systems design, and reduce friction between users and the productivity platforms they rely on daily...
Key Responsibilities
- Lead, coach, and develop a growing team of SREs while fostering a culture of collaboration and technical excellence
- Drive automation initiatives that reduce toil, improve uptime, and accelerate internal development velocity
- Define and deliver the strategic roadmap for SRE within the Enterprise Technology team
- Balance engineering ambition with pragmatic delivery; iterate quickly while maintaining high standards for security and reliability
- Oversee containerised environments (on-prem and cloud), managing bridge services, batch processing, and integrations
- Partner with internal stakeholders to reduce user friction and ensure teams have seamless access to the tools and data they need
- Develop and refine monitoring, alerting, and observability systems for proactive issue detection and resolution
What You Bring...
- 6+ years in site reliability, DevOps, or infrastructure engineering roles
- 2+ years in a leadership capacity, with experience managing or mentoring engineers
- Advanced proficiency in Python and shell scripting
- Deep Linux administration skills and strong knowledge of containerisation technologies (e.g. Docker, Kubernetes)
- Expertise with IaC and configuration management (e.g. Terraform, SaltStack, Ansible)
- Experience operating CI/CD workflows via tools such as Jenkins, GitHub Actions, or ArgoCD
- Strong organisational and communication skills; ability to break down complex problems and lead through ambiguity
...
Senior Site Reliability Engineer
This role is ideal for an engineer who enjoys building scalable automation, eliminating operational bottlenecks, and championing platform resilience. As one of the first hires in a foundational SRE function, you'll play a critical role in maturing the tooling, systems, and processes that underpin the internal productivity ecosystem...
Key Responsibilities
- Engineer high-availability systems across both on-prem and cloud environments, supporting internal tools and collaboration platforms
- Design, manage, and automate containerised services and backend integrations
- Build infrastructure that scales through code, using robust configuration management and CI/CD principles
- Improve internal observability and alerting to ensure proactive performance monitoring
- Partner with security and engineering leads to implement best practices for operational risk and reliability
- Break down engineering projects into deliverables and communicate progress with clarity
What You Bring...
- 4+ years of SRE, platform, or infrastructure engineering experience
- Strong command of Python and systems-level scripting
- Proficient in Linux system administration and container orchestration
- Experience with infrastructure-as-code and configuration management tools (e.g. Terraform, Salt, Puppet)
- Exposure to CI/CD tooling and version-controlled infrastructure pipelines
- Familiarity with complex system troubleshooting and monitoring tools
- A mindset focused on automation, scalability, and reducing operational burden
...