ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related incidents, including root … cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. Perform capacity planning, scaling, and tuning of Solace infrastructure to meet current and … background in production support, preferably in a 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling. Familiarity with Linux/Unix …
and platform engineering.
Tech Stack
- Cloud: AWS (EC2, RDS, S3, IAM, CloudWatch, Lambda)
- Infrastructure as Code: Terraform
- Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm
- Configuration Management: Ansible
- Monitoring & Observability: Grafana, Prometheus
- CI/CD: GitHub Actions
- Automation & Scripting: Python, Bash, Go or Java
What We’re Looking For
- Proven experience running AWS cloud infrastructure in a production or regulated (financial) environment.
…
- Hands-on experience managing Kubernetes clusters (preferably EKS).
- Strong understanding of Infrastructure as Code using Terraform.
- Familiarity with monitoring and observability stacks such as Prometheus and Grafana.
- Experience building and maintaining CI/CD pipelines (GitHub Actions or similar).
- Strong scripting or automation skills using Python, Bash, Go or Java.
- A collaborative mindset: comfortable working alongside developers …
/EKS knowledge to help the team overcome technical barriers.
What They’re Looking For
- 5–10 years’ hands-on Kubernetes (EKS on AWS) experience.
- Strong skills with Terraform, Prometheus, and scaling infra.
- Collaborative and adaptable in a fast-paced environment where priorities shift quickly.
- Ability to solve technical challenges and mentor others through example.
If you're interested and …
City of London, London, United Kingdom Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …
City of London, London, United Kingdom Hybrid / WFH Options
Alexander Ash Consulting
with research and infrastructure teams to deliver scalable, reliable solutions. Drive automation using Terraform, Ansible, GitLab, Jenkins, and support SDLC best practices. Provide visibility and performance monitoring using Splunk, Prometheus, Grafana. Contribute to containerisation and orchestration strategy with Docker and Kubernetes. Stay ahead of industry trends, conduct POCs, and deliver technical recommendations. What We’re Looking For 10+ … experience with DevOps and CI/CD tooling (Terraform, Ansible, GitLab, Jenkins). Programming/scripting knowledge in Python, Golang, or similar. Experience with metrics visualisation tools (Splunk, Prometheus, Grafana). Knowledge of containerisation and orchestration (Docker, Kubernetes). Experience in hedge funds, trading firms, or other low-latency/HPC environments is highly desirable.
Create the future of travel with us ✈️ Whether it’s to visit the people closest to us, starting an exciting adventure, or a career-defining business trip, travel is an essential part of our lives. Yet we've all experienced …