London (City of London), South East England, United Kingdom Hybrid / WFH Options
Searchability NS&D
Kubernetes, Docker, Helm Proficient in Terraform, CI/CD Pipelines (Drone/GitLab) Excellent understanding of Kafka internals, stream processing, and secure Kafka deployments Strong experience across monitoring (Prometheus, Grafana, CloudWatch) Knowledge of security hardening, IAM, WAF, Shield, Vault Working knowledge of Agile, Infrastructure-as-Code, and DevSecOps practices UK*C or Enhanced DV (eDV) Clearance is a must To More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Computer Futures
Architecting and maintaining AWS cloud environments Managing Kubernetes clusters (plus Docker & Helm) Building CI/CD pipelines and automated deployment tools Driving observability with tools like CloudWatch, ELK, and Grafana Mentoring junior engineers and shaping DevOps best practices Ensuring security, compliance, and disaster recovery readiness What You Bring You're a tech-savvy problem solver with a passion for DevOps More ❯
as log ingestion and communication issues. Design and develop scalable, robust, and high-performance data pipelines and data storage solutions. Develop and maintain observability frameworks using tools like Kibana, Grafana, or similar Work with cross-functional teams to define observability and search requirements. Scale, script and maintain our development and production platform foundation with AWS and GCP Stay updated on More ❯
things work as needed. Experience with relational and non-relational databases. Experience delivering high levels of observability and proficiency in improving early warning systems, for example: has worked with Grafana/DataDog/Prometheus. Collaborating with internal/external teams/engineers and fostering an inclusive environment, where all points of view are welcomed and encouraged. Own and lead multiple More ❯
Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Commify Group
requirements for cloud-based solutions Comprehensive knowledge of Microsoft Azure Cloud offerings (especially in PaaS) Experience with tools such as Terraform, Ansible, VSTS, ARM, Puppet, Chef, Jenkins, ELK, and Grafana Understanding of DNS, Load Balancer configuration, Active Directory, and network infrastructure in the cloud Experience in agile environments and methodologies including TDD, Scrum, or Kanban Knowledge of monitoring and alerting More ❯
large datasets. Git - Version control is second nature. You know how to branch, commit, and collaborate cleanly. Bonus Skills (nice to have): Apache Hadoop, Spark/Docker, Kubernetes/Grafana, Prometheus, Graylog/Jenkins/Java, Scala/Shell scripting Team Our Tech Stack We build with the tools we love (and we love good tools): TypeScript, Node.js, React, Python More ❯
pipelines Familiarity with regulated workflows: ISO27001, SOC2, GDPR aren't just abbreviations, and don't fill you with dread Observability skills: Very familiar with OpenTelemetry, Prometheus, Loki and Grafana CI/CD pipeline skills: You know what it takes to build templates and guardrails to allow the most junior developers to confidently push code, safely knowing that the computer More ❯
cloud architecture IoT 'smart' edge devices (using nVidia AI chips) Linux-based embedded OS on our Edge devices Continuous Integration and Delivery using Jenkins, SonarQube Terraform for infrastructure management Grafana, Elasticsearch, Kibana & New Relic for metrics, logs and monitoring In the company we also use: VueJS, MySQL, Spring Boot, Apache Camel, AWS Redshift, AWS SageMaker, Pentaho, Balena, Serverless functions Winnow More ❯
Ontario, California, United States Hybrid / WFH Options
Annex IT Solutions
/GCP) with a focus on scalability, security, and cost optimization. Automate configuration management and deployments using Terraform, Ansible, or Chef. Implement monitoring, alerting, and logging solutions using Prometheus, Grafana, ELK, or equivalent. Troubleshoot complex production issues and ensure high system availability. Collaborate with development, QA, and security teams to improve software reliability and performance. Provide technical guidance and mentorship … ARM templates, Ansible. Strong scripting skills in Python, Bash, or PowerShell. Solid understanding of networking, security, and Linux/Windows system administration. Experience with monitoring/logging solutions: Prometheus, Grafana, ELK Stack. Excellent problem-solving, communication, and collaboration skills. Preferred Qualification Azure DevOps Engineer (AZ-400) or AWS DevOps certification. Experience with microservices architecture and serverless technologies Familiarity with Agile More ❯
Sheffield, South Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
VANLOQ LIMITED
Required Skills: Proven experience in Python development & FastAPI Strong knowledge of PostgreSQL database administration Excellent problem-solving, debugging, and analytical skills Nice to Have: Exposure to observability tools (Prometheus, Grafana, OpenTelemetry) Experience with enterprise tools (Control-M, TrueSight, Guardium, Tenable Nessus, Delinea) Understanding of security and software development in highly regulated environments End-to-end experience with CI/ More ❯
Newcastle Upon Tyne, Tyne and Wear, England, United Kingdom Hybrid / WFH Options
Lorien
experience with Azure or AWS. Solid background with Terraform and IaC. Proven use of CI/CD tools (Jenkins, GitHub Actions, GitLab CI, etc.). Knowledge of Prometheus and Grafana for monitoring. Familiarity with collaboration tools like Slack. Either: Prior management/team lead experience, or A Senior DevOps engineer ready to progress into a managerial role. (Bonus) Background in More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
up Experience Strong cloud skills (AWS, GCP, Azure) and containerisation (Docker, Kubernetes) Experience in automating deployments and orchestrating cloud environments Nice to have: Python (Jupyter, PyTorch), monitoring tools (Prometheus, Grafana), cloud databases (RDS, Aurora, Spanner), CI/CD tools (CircleCI), and data visualisation experience. This is a unique opportunity to join a visionary team redefining AI in 3D , with the More ❯
Employment Type: Full-Time
Salary: £140,000 - £160,000 per annum, Inc benefits
of Kubernetes and GPU scheduling, including setup of GPU-enabled clusters and deployment of GPU workloads in Kubernetes. Familiarity with GPU monitoring and observability, using tools such as Prometheus, Grafana, NVIDIA Data Center GPU Manager (DCGM), or custom scripts. Proven ability to analyze deployment approaches for GPU-accelerated serving frameworks and deliver reference implementations. Experience implementing software quality engineering practices More ❯
Sheffield, South Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
DWP Digital
and skill at identifying performance bottlenecks across application and infrastructure layers. Strong working knowledge and practical experience of performance testing tools (JMeter and K6) and performance monitoring tools like Grafana, Kibana, Kiali, Prometheus and AWS CloudWatch. Detailed working knowledge and familiarity with IaC and modern DevOps practices You and your role If you're passionate about making sure systems run More ❯
Birmingham, West Midlands, United Kingdom Hybrid / WFH Options
DWP Digital
and skill at identifying performance bottlenecks across application and infrastructure layers. Strong working knowledge and practical experience of performance testing tools (JMeter and K6) and performance monitoring tools like Grafana, Kibana, Kiali, Prometheus and AWS CloudWatch. Detailed working knowledge and familiarity with IaC and modern DevOps practices You and your role If you're passionate about making sure systems run More ❯
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
DWP Digital
and skill at identifying performance bottlenecks across application and infrastructure layers. Strong working knowledge and practical experience of performance testing tools (JMeter and K6) and performance monitoring tools like Grafana, Kibana, Kiali, Prometheus and AWS CloudWatch. Detailed working knowledge and familiarity with IaC and modern DevOps practices You and your role If you're passionate about making sure systems run More ❯
Newcastle Upon Tyne, Tyne and Wear, North East, United Kingdom Hybrid / WFH Options
DWP Digital
and skill at identifying performance bottlenecks across application and infrastructure layers. Strong working knowledge and practical experience of performance testing tools (JMeter and K6) and performance monitoring tools like Grafana, Kibana, Kiali, Prometheus and AWS CloudWatch. Detailed working knowledge and familiarity with IaC and modern DevOps practices You and your role If you're passionate about making sure systems run More ❯
Annapolis Junction, Maryland, United States Hybrid / WFH Options
Codescratch LLC
Experience with asynchronous messaging systems (RabbitMQ, Apache Kafka, etc.) Experience creating and integrating with remote services via HTTP, Thrift, or gRPC Experience monitoring application performance with metrics (Prometheus, InfluxDB, Grafana) and logs with ELK Stack (Elasticsearch, Logstash, Kibana) Salary Range Pay range $165,000 - $205,000. (Plus Benefits) The pay range for this job level is a general estimated More ❯
Lexington, Massachusetts, United States Hybrid / WFH Options
Raft
DoD/Air Force AOC Weapon System and operating standards within cleared facilities (SIPR, IL6) - Familiarity with AWS and cloud technologies - Skill in operating observability tooling and alerting (Prometheus, Grafana, etc.) - Knowledge of Platform One Big Bang Clearance Requirements: Active Secret security clearance Work Type: Hybrid - Hanscom AFB, MA highly preferred (or local to Reston, VA or Hampton, VA or More ❯
practise in those environments Terraform Azure DevOps pipelines and deployment automation Container build and hardening Automation for certificate renewals, cost reporting, BC/DR planning Observability tooling and alerting (Grafana, Elastic, Prometheus) Linux system administration Networking, routing, load balancing, IPv4/IPv6, VPC, VPNs, fwd/rev proxies Skills and experience 3+ years in senior DevOps role owning business-critical More ❯
management experience Tech Stack AWS, Terragrunt, Terraform, EKS, Helm, ArgoCD, Docker, Gitlab SaaS, Gitlab CI, Victoria Metrics (Prometheus compatible), Vault, Clickhouse, PostgreSQL, MariaDB/MySQL, MongoDB, KeyCloak, ELK logging, Grafana Legacy (migrating from): Kubernetes (Bare Metal), Ceph, Jenkins, Gitlab On-Premises AI Usage Disclaimer At FXC Intelligence, we are enthusiastic about the use of AI tools and value candidates with More ❯
and deploying services with Java and Spring Boot. Comfort working in a cloud-native environment - Kubernetes (EKS), containers, scaling etc. An interest in observability, using tools like Prometheus and Grafana to keep services healthy and understand usage patterns. Familiarity with AWS services and how to integrate them into modern applications. A keen focus on quality and security, baking testing and More ❯
Sheffield, England, United Kingdom Hybrid / WFH Options
Vallum Associates
Management Operations and automation workflows Troubleshoot automation issues across scripting, APIs and containerized environments. Nice to have Exposure to enhancing observability with knowledge of tools such as Prometheus, Grafana, and OpenTelemetry. Advantageous to have enterprise tools knowledge (i.e., Control-M, TrueSight, Guardium, Tenable Nessus, Delinea) Knowledge of Security and Software Development in a highly regulated environment End-to More ❯
Handsworth, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Vallum Associates
Management Operations and automation workflows Troubleshoot automation issues across scripting, APIs and containerized environments. Nice to have Exposure to enhancing observability with knowledge of tools such as Prometheus, Grafana, and OpenTelemetry. Advantageous to have enterprise tools knowledge (i.e., Control-M, TrueSight, Guardium, Tenable Nessus, Delinea) Knowledge of Security and Software Development in a highly regulated environment End-to More ❯