Southampton, Hampshire, South East, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment Limited
service level goals. You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus. Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo. Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck. Experience using configuration … such as Jenkins, GitLab CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through …
Hampshire, England, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
service level goals. You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus. Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo. Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck. Experience using configuration … such as Jenkins, GitLab CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through …
Portsmouth, England, United Kingdom Hybrid / WFH Options
Spectrum IT Recruitment
service level goals. You'll Stand Out If You Have: Practical experience managing large-scale Kubernetes clusters; certifications in Kubernetes are a strong bonus. Hands-on familiarity with the Grafana Observability Suite, including tools like Loki, Mimir, and Tempo. Background in administering or developing with popular monitoring and automation tools such as Splunk, Datadog, PagerDuty, or Rundeck. Experience using configuration … such as Jenkins, GitLab CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless postmortems, including leading cross-functional teams through …
Grays, England, United Kingdom Hybrid / WFH Options
TES
environment. Security Best Practices: Strong understanding of security frameworks and compliance standards for cloud infrastructure and DevOps processes. Monitoring & Observability: Understanding of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK) to ensure system performance and issue tracking. Skills CI/CD Tools: Hands-on experience with Jenkins, GitLab CI/CD, Travis CI, or similar tools for building CI …
Actions) Work with cloud platforms such as AWS, Azure, or GCP. Manage infrastructure as code using tools like Terraform. Monitor and maintain production systems using tools such as Prometheus, Grafana, or Datadog. Collaborate with development and QA teams to improve deployment processes and system reliability. Contribute to incident response, troubleshooting, and root cause analysis. Requirements: Approximately 18 months of experience …
Lisburn, Northern Ireland, United Kingdom Hybrid / WFH Options
Camlin Energy
Science, Management Information Systems, or related is desirable but not essential. Nice to have but not essential: Container Orchestration (Kubernetes, Docker Swarm); Service monitoring and graphing tools (Prometheus + Grafana, Nagios + Munin); Elastic stack; Infrastructure as Code (Terraform); Repository solutions (JFrog Artifactory, JFrog Bintray, Reprepro); Let's Encrypt/ACME; OpenVPN; Apache Tomcat; Messaging streams or communication platforms (RabbitMQ, Postfix …
as: Docker, OpenShift, Kubernetes etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or …
DevOps to optimize build times, parallelize tests, and reduce pipeline flakiness. Result Analysis & Root Cause • Analyze test outputs, system logs, and metrics (e.g., via ELK Stack or Prometheus/Grafana) to pinpoint failures and performance regressions. • Lead root-cause investigations for infrastructure incidents, producing clear post-mortem reports and remediation recommendations. Defect Management • Log, triage, and track defects in Jira …
of building and maintaining CI/CD pipelines using the likes of GitLab, Jenkins, CircleCI, CodeBuild etc. Familiarity with scripting (Bash or Python). Monitoring and alerting tools - Prometheus, Grafana or Splunk, ELK. We're looking for someone who wants to progress their career into the DevOps arena. Submit your CV now to be considered. IND_PC1 Carbon60, Lorien & SRG …
South East London, England, United Kingdom Hybrid / WFH Options
SoTalent
teams to ensure robust and scalable integrations. Drive continuous improvement, automation, and cost-optimization across engineering platforms. Provide advanced troubleshooting and 3rd-line production support using tools like Prometheus, Grafana, and ELK Stack. Maintain detailed technical documentation, system diagrams, and operational runbooks. Ensure compliance with data security and regulatory standards (e.g., GDPR, ISO 27001). Contribute to disaster recovery …
development teams to improve time-to-market and promote cloud adoption. Create CI/CD pipelines using Jenkins, Bitbucket/Git. Create monitoring dashboards and metrics using Splunk, Prometheus, Grafana. Write unit tests in Go/Python/Java. Work on a globally distributed team. CME Group: Where Futures Are Made. CME Group (www.cmegroup.com) is the world's leading derivatives …
South East London, England, United Kingdom Hybrid / WFH Options
Amber Labs
Agile teams using tools like Git, Jira, and Confluence. Eligible for SC and NPPV3 clearance. Desirable: Container orchestration with Kubernetes. HashiCorp tools: Vault, Consul, Packer. Monitoring and observability with Grafana, Prometheus, or similar. Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology. Strong problem-solving mindset. Collaborative team …
Monitoring and Alerting: Set up monitoring and logging systems to proactively detect and address potential issues, ensuring optimal performance and reliability in environments like on-prem Prometheus/Thanos, Grafana Cloud, and Grafana Cloud Loki. Database Management: Manage hundreds of on-prem PostgreSQL databases, including performance tuning, backups, and disaster recovery strategies. Collaboration: Work closely with cross-functional teams, including …
Experience with Terraform, Kubernetes, Kafka, Docker, Redis, MongoDB. Experience with application clustering, load balancing, high availability, and reliability concepts and supporting technologies. Experience with monitoring systems such as Prometheus, Grafana, Splunk, or the ELK Stack. Clear written and verbal communication skills. Some level of participation in an on-call escalation path. A passion for providing excellent service to all internal …
of CI/CD pipelines (GitHub Actions, GitLab CI/CD, Jenkins, etc.). Familiarity with cloud platforms (AWS, GCP, Azure). Experience with monitoring tools such as Prometheus, Grafana, or the ELK stack. Basic scripting abilities in Python, Bash, or PowerShell. Seniority Level: Mid-Senior level. Employment Type: Full-time. Job Function: Information Technology. Industries: Software Development and Technology …
green deployments, canary releases). Implement comprehensive monitoring, logging, and alerting systems to proactively identify and address performance issues, errors, and security threats. Use tools like Azure Monitor, Prometheus, Grafana, or similar to collect and analyse metrics, logs, and traces. Configure alerts and notifications to ensure timely responses to critical events. Security & Compliance: Implement security best practices and controls within …
Southampton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
GitHub Actions, CircleCI, or similar. Experience with messaging systems (RabbitMQ, Kafka, etc.) and event-driven architecture. Proficiency in infrastructure as code (Terraform preferred). Familiarity with monitoring stacks (Prometheus, Grafana, ELK, etc.) and system tuning. Security-conscious mindset; experience implementing controls in regulated or financial environments is a plus. Excellent problem-solving skills and a proactive attitude. Strong communication and …
other CI tools; Maven, Gradle or other build tools; Ansible or other IT Automation/software provisioning tools; JIRA, Confluence; * Experience in monitoring/reporting tools such as Splunk, Grafana/Prometheus etc * Experience in Agile practices * Working knowledge of environment monitoring tools such as GCO, New Relic, Prometheus, Grafana. * Collaboration Skills: Proactive can-do attitude; A creative approach towards solving …
Excellent Shell Scripting and Python skills. Experience with monitoring/metrics platforms (Datadog/Prometheus). Build/Deployment - CI/CD, Jenkins, Bitbucket, Git, Maven, Helm. Monitoring - Splunk, Prometheus, Grafana, ELK Stack. Knowledge of security best practices in cloud environments. Able to assess security of existing applications and define standards for new projects. Experience with Nix desirable. The desired profile …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Arm Limited
Skills and Experience: Experience in a GitOps solution such as ArgoCD, Flux or Fleet. Implementation of the Security Development Lifecycle (SDL) in infrastructure. Monitoring and observability using Prometheus and Grafana, ELK stack or equivalent. Use of Kubernetes management systems such as Rancher. Familiarity with open source project development cycles and contribution processes, particularly around CI/CD infrastructure. Accommodations at …
Perl, Java) and automation skills. Thorough knowledge of Jenkins and pipeline using Groovy script. Experience with Docker containers and Amazon Linux 2023 AMI. Experience with system monitoring tools (e.g., Grafana, Alertmanager, Prometheus, and Node Exporter). Ability to analyse and resolve complex infrastructure resource and application deployment issues. Experience with Git, Jira, Confluence, and ServiceNow for incident and change …
CI/CD pipelines. Containerization and Orchestration: Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes). Monitoring and Logging: Experience with monitoring and logging tools like Datadog, Prometheus, or Grafana. Data Engineering Skills: Knowledge of event streaming platforms (e.g., Apache Kafka) and SQL database management. Strong Communication and Collaboration: Excellent communication skills and the ability to work effectively in a …
/AWX), automation pipelines (ArgoCD/Azure DevOps or similar), Infrastructure as Code (Terraform) and Version Control (Git). Containerization and Orchestration: Docker/Kubernetes. Monitoring and Logging: (Prometheus/Grafana/Elastic stack). Networking and Security: Virtual Private Cloud/Security Groups and Network ACLs/Identity and Access management. Storage technologies (Object/File) - Implementing and managing storage accounts …