London, England, United Kingdom Hybrid / WFH Options
Derisk360
CI/CD pipelines using Cloud Build, GitLab, Jenkins, or ArgoCD. Implement Istio, ingress controllers, network policies, and GCP IAM to secure microservice communications. Monitor and optimize systems using Prometheus, Grafana, and Cloud Operations Suite. Collaborate with platform, DevOps, and security teams in a multi-tenant Kubernetes ecosystem. Troubleshoot container performance, networking, and scaling issues. Provide architectural documentation and technical
Working knowledge of microservice infrastructure components. Excellent debugging and troubleshooting skills. Experience with Kubernetes. Experience in cloud computing (preferably AWS). Experience with common SRE toolchains, such as Grafana, Prometheus, Elasticsearch, Kibana, and Jaeger, is a plus.
infrastructure provisioning and automation. Deep understanding of CI/CD principles, GitOps, and automation. Experience with containerization and orchestration tools (Docker, etc.). Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, Azure Monitor). Demonstrated experience in managing multiple Azure landing zones with enterprise-scale governance and policies. Strong knowledge of Azure security services (e.g., Azure Security Center, Defender for Cloud
products under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, DataDog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such as Jenkins
configuration, and firewalls. Familiarity with CI/CD pipelines and tools (GitOps, ArgoCD, Helm). Understanding of WebRTC, STUN/TURN, and other real-time communication protocols. Monitoring & logging experience – Prometheus, Grafana, ELK, or Loki. Experience with MongoDB, PostgreSQL, Redis and secure back-end infrastructure. Exposure to Zero Trust environments, secure enclaves, and hybrid deployment models. Why Apply? Collaborate on secure
sleeves and writing that wonderful code! Subject-matter Qualifications (subject to assignment & your personal engineering focus): Infrastructure as code, and DevOps practices. Experience with monitoring tools such as Grafana, Prometheus or ELK stack. Experience with configuration management tools such as Puppet, Ansible and Chef. Experience with building Event-Driven applications (Kafka) and API solutions. Experience with data engineering, including SQL
Liverpool, England, United Kingdom Hybrid / WFH Options
Bellrock Group
tools: GitHub Actions and Octopus Deploy. Proficient in writing and managing Infrastructure as Code (Terraform, ARM templates). Experienced in setting up and maintaining observability stacks (e.g. Application Insights, Prometheus, Grafana). Familiar with container orchestration concepts; Kubernetes experience is a plus. Scripting or programming experience in PowerShell, Python, or similar languages. Comfortable balancing speed of delivery with system stability
London, England, United Kingdom Hybrid / WFH Options
ZILO™
problem-solving skills and attention to detail. Strong communication and teamwork abilities. Knowledgeable in Java profiling and the JVM memory model. Preferred Qualifications: Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack). Familiarity with Agile methodologies and DevOps practices. Benefits: Enhanced leave - 38 days inclusive of 8 UK Public Holidays; Private Health Care including family cover; Life Assurance
applications and infrastructure. Other duties as needed. About You: 5+ years' experience in Site Reliability Engineer roles. Expert+ level Linux administration, scripting, and troubleshooting. Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog). Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc.). Extensive experience with
operational run books. Scripting skills in Bash, Python, or similar. Nice to Have: Familiarity with CI/CD pipelines and deployment automation. Knowledge of monitoring/logging tools like Prometheus, Grafana and ELK. Exposure to security and compliance practices in cloud environments. Skills: Strong communication and collaboration skills. Calm under pressure, particularly during incident response. Eagerness to learn and continuously
London, England, United Kingdom Hybrid / WFH Options
Inizio
registry tools. Experience fine-tuning LLMs. Background in agentic AI frameworks (e.g., LangGraph, AutoGen, CrewAI). Contributions to relevant open-source projects. Familiarity with monitoring/observability tools like Prometheus, Grafana, Sentry. What We Offer: Competitive salary and benefits package, including private medical insurance and company pension. Flexible, remote-first working environment with occasional travel. The opportunity to work on
REST APIs Microservices in the cloud using PHP and Node. AWS: EC2, S3. DevOps tools: Terraform (preferred), Ansible (desirable). CI/CD via GitHub Actions. Message queues: RabbitMQ. Monitoring: Prometheus, Grafana. No current PHP or JS automated testing. Able to communicate technical updates to the C-suite. Additional/Desirable experience: Proven experience in mobile development, particularly with Flutter and
Azure, AWS or GCP. Experience with Kubernetes is desirable. You have a high degree of experience in observing the performance and health of applications via tools such as Grafana, Prometheus, Datadog, Sentry, etc. You have a strong desire for, and are an advocate of, performant applications. You have a flair for simplicity when problem solving. Excellent communication skills, with the
London, England, United Kingdom Hybrid / WFH Options
Deutsche Bank
/Unix systems, SQL, and programming languages such as C++, Java or Python. Strong understanding of distributed systems and low-latency architectures. Hands-on experience with observability stacks (e.g., Prometheus, Grafana, Splunk, Geneos, OpenTelemetry) and infrastructure automation (e.g., Ansible, Terraform, CI/CD pipelines). Strong understanding of the trade lifecycle, market data, and fixed income products, FX or algorithmic trading
Working knowledge of microservice infrastructure components. Excellent debugging and troubleshooting skills. Experience with Kubernetes. Experience in cloud computing (preferably AWS). Experience with common SRE toolchains, such as Grafana, Prometheus, Elasticsearch, Kibana, and Jaeger, is a plus. #ICBCareer #ICBEngineering About Us: J.P. Morgan is a global leader in financial services, providing strategic advice and products to the world's most
extended experience). A willingness to travel to visit customer sites. Technical Skills: Terraform; AWS/Azure/GCP; containerisation technologies such as Kubernetes and Docker; Nagios, Grafana and/or Prometheus; configuration automation with Ansible; Linux, preferably Red Hat/CentOS; VMware; good network skills (firewalls & switches); a driving license. You can have: Windows Server knowledge; a knowledge of ITIL processes, and
Liverpool, England, United Kingdom Hybrid / WFH Options
Concerto
tools: GitHub Actions and Octopus Deploy. Proficient in writing and managing Infrastructure as Code (Terraform, ARM templates). Experienced in setting up and maintaining observability stacks (e.g. Application Insights, Prometheus, Grafana). Familiar with container orchestration concepts; Kubernetes experience is a plus. Scripting or programming experience in PowerShell, Python, or similar languages. Comfortable balancing speed of delivery with system stability
pragmatism, empathy and understanding when interacting with team, stakeholders and customers. Technologies we use: AWS, Kubernetes (EKS), Terraform, Istio, Flux, Crossplane, Helm, ELK/EFK, HashiCorp Vault, GitHub Actions, Prometheus/Thanos/Grafana, New Relic, Kafka, OPA Gatekeeper. As well as lots of on-the-job training and endless opportunities, you’ll get: Colleague discount across our multi-brands
with automation, IaC, and CI/CD principles. Understand network concepts, infrastructure, and common protocols. Able to write basic scripts for automation. Able to build dashboards in Grafana, with an understanding of Prometheus and PromQL. Knowledge of SDLC and experience integrating solutions into CI pipelines. Experience with cloud (AWS, GCP) is beneficial, but not essential. Able to self-manage Jira tickets and provide
control. Experience with testing methodology and frameworks. Experience with continuous integration systems and pipelines. Experience of video and audio encoding and streaming. Experience of observability tooling such as Grafana, Prometheus, OpenSearch and creating observable systems. Knowledge of Rust, Java, C#. Knowledge of PlayStation hardware and SDK. Benefits of working in Gaming, Developer and Future Technology: Technology development is not constrained
leading and developing high-performing engineering teams, providing mentorship, support, and opportunities for growth. Strong knowledge of software engineering best practices, system design, observability, resilience, and expertise in Telemetry, Prometheus and Grafana. Experience with cloud platforms like GCP, Azure, AWS, etc. Drive the implementation of observability pipelines for different systems and applications across dojo. Strong understanding of containerisation and orchestration
London, England, United Kingdom Hybrid / WFH Options
V7 Labs
with a focus on streamlining code deployment. Solid knowledge of Kubernetes and Docker for container management and orchestration in cloud environments. Experience working with monitoring and observability tools like Prometheus and Grafana to ensure system reliability and visibility. Advanced skills in either Python or Elixir, or a strong willingness to learn new programming languages as needed. Proven track record of