service level goals You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck Experience using configuration … such as Jenkins, GitLab CI/CD, or CircleCI Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch Excellent analytical and troubleshooting abilities, especially within complex distributed systems Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through
Southampton, Hampshire, South East, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment Limited
service level goals You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck Experience using configuration … such as Jenkins, GitLab CI/CD, or CircleCI Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch Excellent analytical and troubleshooting abilities, especially within complex distributed systems Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through
Grays, England, United Kingdom Hybrid / WFH Options
TES
environment. Security Best Practices: Strong understanding of security frameworks and compliance standards for cloud infrastructure and DevOps processes. Monitoring & Observability: Understanding of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK) to ensure system performance and issue tracking. Skills CI/CD Tools: Hands-on experience with Jenkins, GitLab CI/CD, Travis CI, or similar tools for building CI
Actions) Work with cloud platforms such as AWS, Azure, or GCP Manage infrastructure as code using tools like Terraform Monitor and maintain production systems using tools such as Prometheus, Grafana, or Datadog Collaborate with development and QA teams to improve deployment processes and system reliability Contribute to incident response, troubleshooting, and root cause analysis Requirements Approximately 18 months of experience
Sheffield, South Yorkshire, United Kingdom Hybrid / WFH Options
Experis
Integration services such as messaging and streams. Building RESTful API Services. Containerisation, Kubernetes, serverless functions. Microservices, and distributed tracing. Enterprise logging, monitoring, and alerting frameworks (e.g., ELK, Splunk, Prometheus, Grafana). Automation scripting (using tools such as Terraform, Ansible, etc.). Experience of working with Continuous Integration (CI), Continuous Delivery (CD) and continuous testing tools. Experience working within an
Lisburn, Northern Ireland, United Kingdom Hybrid / WFH Options
Camlin Energy
Science, Management Information Systems, or related is desirable but not essential. Nice to have but not essential: Container Orchestration (Kubernetes, Docker Swarm) Service monitoring and graphing tools (Prometheus + Grafana, Nagios + Munin) Elastic stack Infrastructure as Code (Terraform) Repository solutions (JFrog Artifactory, JFrog Bintray, Reprepro) Let's Encrypt/ACME OpenVPN Apache Tomcat Messaging streams or communication platforms (RabbitMQ, Postfix
as: Docker, OpenShift, Kubernetes etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or
DevOps to optimize build times, parallelize tests, and reduce pipeline flakiness. Result Analysis & Root Cause • Analyze test outputs, system logs, and metrics (e.g., via ELK Stack or Prometheus/Grafana) to pinpoint failures and performance regressions. • Lead root-cause investigations for infrastructure incidents, producing clear post-mortem reports and remediation recommendations. Defect Management • Log, triage, and track defects in Jira
of building and maintaining CI/CD pipelines using the likes of GitLab, Jenkins, CircleCI, CodeBuild etc. Familiarity with scripting (Bash or Python). Monitoring and alerting tools - Prometheus, Grafana or Splunk, ELK. We're looking for someone who wants to progress their career into the DevOps arena. Submit your CV now to be considered. Carbon60, Lorien & SRG
and containerization, Linux, Relational and NoSQL databases, building RESTful API Services, Containerisation, Kubernetes, serverless functions, Microservices, and distributed tracing. Enterprise logging, monitoring, and alerting frameworks (e.g., ELK, Splunk, Prometheus, Grafana). Automation scripting (using tools such as Terraform, Ansible, etc.). Strong understanding of security principles in cloud and enterprise systems. Familiarity with audit and compliance considerations in regulated
development teams to improve time-to-market and promote cloud adoption Create CI/CD pipelines using Jenkins, Bitbucket/Git Create monitoring dashboards and metrics using Splunk, Prometheus, Grafana Write unit tests in Go/Python/Java Work on a globally distributed team CME Group: Where Futures Are Made CME Group (www.cmegroup.com) is the world's leading derivatives
South East London, England, United Kingdom Hybrid / WFH Options
Amber Labs
Agile teams using tools like Git, Jira, and Confluence Eligible for SC and NPPV3 clearance Desirable: Container orchestration with Kubernetes HashiCorp tools: Vault, Consul, Packer Monitoring and observability with Grafana, Prometheus, or similar Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology Strong problem-solving mindset Collaborative team
Leeds, England, United Kingdom Hybrid / WFH Options
Sportserve
GCP) and Kubernetes ecosystems Deep expertise in CI/CD tools (GitLab CI/CD, FluxCD), Infrastructure as Code (Terraform, Helm, Ansible), container orchestration (Kubernetes, Docker), monitoring tools (Prometheus, Grafana, ELK) Proven track record of platform modernization and process improvement Strong scripting and programming skills (Bash, Golang) Understanding of security, SRE principles, and cost optimization in cloud environments Effective troubleshooting
agile development methodologies such as Scrum or Kanban Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation Familiarity with monitoring and logging tools such as Prometheus, Grafana, or ELK Stack Experience with machine learning and artificial intelligence technologies Desirable Certifications Strong proficiency in at least one of the following AWS certifications: AWS Certified Solutions Architect - Associate AWS
Monitoring and Alerting: Set up monitoring and logging systems to proactively detect and address potential issues, ensuring optimal performance and reliability in environments like on-prem Prometheus/Thanos, Grafana Cloud, and Grafana Cloud Loki. Database Management: Manage hundreds of on-prem PostgreSQL databases, including performance tuning, backups, and disaster recovery strategies. Collaboration: Work closely with cross-functional teams, including
Experience with Terraform, Kubernetes, Kafka, Docker, Redis, MongoDB. Experience with application clustering, load balancing, high availability, and reliability concepts and supporting technologies. Experience with monitoring systems such as Prometheus, Grafana, Splunk, or the ELK Stack. Clear written and verbal communication skills. Some level of participation in an on-call escalation path. A passion for providing excellent service to all internal
of CI/CD pipelines (GitHub Actions, GitLab CI/CD, Jenkins, etc.). Familiarity with cloud platforms (AWS, GCP, Azure). Experience with monitoring tools such as Prometheus, Grafana, or the ELK stack. Basic scripting abilities in Python, Bash, or PowerShell. Seniority Level Mid-Senior level Employment Type Full-time Job Function Information Technology Industries Software Development and Technology
green deployments, canary releases). Implement comprehensive monitoring, logging, and alerting systems to proactively identify and address performance issues, errors, and security threats. Use tools like Azure Monitor, Prometheus, Grafana, or similar to collect and analyse metrics, logs, and traces. Configure alerts and notifications to ensure timely responses to critical events. Security & Compliance: Implement security best practices and controls within
Southampton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
GitHub Actions, CircleCI, or similar. Experience with messaging systems (RabbitMQ, Kafka, etc.) and event-driven architecture. Proficiency in infrastructure as code (Terraform preferred). Familiarity with monitoring stacks (Prometheus, Grafana, ELK, etc.) and system tuning. Security-conscious mindset; experience implementing controls in regulated or financial environments is a plus. Excellent problem-solving skills and a proactive attitude. Strong communication and
other CI tools; Maven, Gradle or other build tools; Ansible or other IT Automation/software provisioning tools; JIRA, Confluence; * Experience in monitoring/reporting tools such as Splunk, Grafana/Prometheus etc * Experience in Agile practices * Working knowledge of environment monitoring tools such as GCO, NewRelic, Prometheus, Grafana. * Collaboration Skills: Proactive can-do attitude; A creative approach towards solving
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Arm Limited
Skills and Experience: Experience in a GitOps solution such as ArgoCD, Flux or Fleet Implementation of the Security Development Lifecycle (SDL) in infrastructure Monitoring and observability using Prometheus and Grafana, ELK stack or equivalent Use of Kubernetes management systems such as Rancher Familiarity with open source project development cycles and contribution processes, particularly around CI/CD infrastructure Accommodations at
Perl, Java) and automation skills. Thorough knowledge of Jenkins and pipeline using Groovy script. Experience with Docker containers and Amazon Linux 2023 AMI. Experience with system monitoring tools (e.g., Grafana, Alertmanager, Prometheus, and Node Exporter). Ability to analyse and resolve complex infrastructure resource and application deployment issues. Experience with Git, Jira, Confluence, and ServiceNow for incident and change