London (City of London), South East England, United Kingdom Hybrid / WFH Options
Understanding Recruitment
/infrastructure engineering role Strong scripting skills in Python, Bash, or Ruby Familiarity with configuration management tools (Ansible, Puppet, or Chef) Interest or exposure to observability tools like Datadog, Prometheus, or Grafana A passion for learning and improving in high-performance environments This is a rare chance to learn from elite engineers and contribute directly to a platform supporting global …
Slough, South East England, United Kingdom Hybrid / WFH Options
Understanding Recruitment
/infrastructure engineering role Strong scripting skills in Python, Bash, or Ruby Familiarity with configuration management tools (Ansible, Puppet, or Chef) Interest or exposure to observability tools like Datadog, Prometheus, or Grafana A passion for learning and improving in high-performance environments This is a rare chance to learn from elite engineers and contribute directly to a platform supporting global …
Build and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, Datadog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. Collaborate closely with Salesforce … and Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, Datadog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ML platforms, real-time data …
London (City of London), South East England, United Kingdom
Propel
/EKS knowledge to help the team overcome technical barriers. What They’re Looking For - 5–10 years’ hands-on Kubernetes (EKS on AWS) experience. - Strong skills with Terraform, Prometheus, and scaling infra. - Collaborative and adaptable in a fast-paced environment where priorities shift quickly. - Ability to solve technical challenges and mentor others through example. If you're interested and …
London, South East, England, United Kingdom Hybrid / WFH Options
Venn Group
of technologies including RHEL, CentOS, Ubuntu, VMware, and F5 load balancers Manage web services, LAMP stack applications, Samba servers, and authentication proxies Utilise tools such as Ansible, Katello, Nagios, Prometheus, and Grafana for configuration and monitoring Automate routine tasks using scripts and infrastructure-as-code practices Maintain clear and up-to-date technical documentation Support knowledge sharing and training for …
London, South East England, United Kingdom Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …
Slough, South East England, United Kingdom Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …
London (City of London), South East England, United Kingdom Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …
London (City of London), South East England, United Kingdom
BGC Group
ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related incidents, including root … cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. Perform capacity planning, scaling, and tuning of Solace infrastructure to meet current and … background in production support, preferably in a 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling. Familiarity with Linux/Unix …
London (City of London), South East England, United Kingdom
Paragon Alpha - Hedge Fund Talent Business
their aggressive growth plans, they are looking for a pragmatic and commercially oriented SRE to design, implement and maintain scalable and reliable systems. Tech Stack: Python/C++, Terraform, Prometheus, Kubernetes, Cloud Computing The core function of the role is to monitor and maintain uptime for trading systems, pricing engines and risk management tools. The client can offer market leading …
London, South East, England, United Kingdom Hybrid / WFH Options
Ncounter
EVPN, VLAN/VxLAN, MLAG, STP. Hands-on with Arista/Cisco; strong troubleshooting tools (Wireshark, netcat, etc.). Familiar with network security, automation (Python, Ansible), and observability stacks (Prometheus, Grafana). Excellent communicator with experience delivering in high-stakes, collaborative settings. STEM degree and CCNP/CCIE preferred. Why Join? Join a trusted global institution where networking is core …
L2 AWS DevOps Support. 12 months. Hursley, onsite. Active SC clearance required; eligible candidates will be considered. Inside IR35, umbrella only. Role overview: We are seeking a skilled and experienced Level 2 AWS DevOps Engineer to join our dynamic team. …
London (City of London), South East England, United Kingdom
Duffel
Create the future of travel with us ✈️ Whether it’s to visit the people closest to us, to start an exciting adventure, or to take a career-defining business trip, travel is an essential part of our lives. Yet we've all experienced …