and release gating into the SDLC. Ensure pipeline scalability and governance while maintaining developer velocity. Observability & Troubleshooting Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog). Establish SLOs, SLIs, and error budgets with product and engineering teams. Drive root cause identification using distributed tracing, advanced log analysis, and anomaly detection. Security, Audit & Compliance
Node, RabbitMQ Databases - Postgres, MariaDB, MongoDB, ClickHouse, Redis, JupyterLab, Metabase Data Engineering & Orchestration - Python, Airflow, Kafka, DataHub Cloud & Infrastructure - AWS, K8s DevOps & CI/CD - Git, GitLab CI, DBS, Grafana, ELK, Prometheus, Docker, Docker Compose Why join us? Shape the future of a data business at the forefront of global payments insights A chance to work with a vibrant, friendly
applications and optimizing fleet utilization - Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding) and experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar) - Experience scripting operating system tasks in Bash, Python, etc. and with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar) - Experience operating services in
Contribute to codebases as needed to drive projects forward Requirements Technical Expertise Proven experience managing Kubernetes clusters and expertise in container orchestration. Experience with observability tools (e.g., Datadog, Prometheus, Grafana) Experience with Infrastructure as Code (IaC) tools like Terraform or CloudFormation Experience in database optimization and management (especially for multi-tenant architectures) Extensive knowledge of AWS services, including EKS, Lambda
variety of CI/CD tools and technologies (e.g., Git, GitLab, Jenkins, GCP, AWS) Knowledge of containerisation and microservice architecture Ability to develop dashboard UIs for publishing performance (e.g., Grafana, Apache Superset, etc.) Exposure to safety certification standards and processes We provide: Competitive salary, benchmarked against the market and reviewed annually Company share programme Hybrid and/or flexible work
Job role: IBM Power/Midrange Systems Engineer Location: South West of England Salary: £50,000 - £65,000 Employment Type: Permanent Chapman Tate Associates are proud to be working with a highly respected and innovative manufacturing business currently looking to
Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
DV Application Support Engineer - Contract (outside of IR35) Who are we: In 2019, our founders were working as engineers solving complex cross-domain problems within government organisations. TwinStream was formed to consolidate their collective expertise and experience into one business
CD pipelines (Jenkins, GitLab CI/CD or similar) * Configuring Kubernetes clusters for secure, scalable deployments * Building automation across infrastructure provisioning and testing * Implementing monitoring and alerting (e.g., Prometheus, Grafana) * Managing repositories and version control (Git) * Driving SRE practices around performance, resilience, and supportability * Working closely with dev teams to integrate platform tooling into workflows * Supporting infrastructure security, maintainability and …/SRE/DevOps roles * Strong Kubernetes experience (config and deployment) * Deep CI/CD experience - Jenkins, GitLab CI/CD or similar * Skilled with infra observability tooling (Prometheus, Grafana, etc.) * Confident with Git and repo management workflows * Strong automation mindset - reducing manual intervention wherever possible * Cloud experience (AWS, Azure or GCP) * Must be a sole UK national and eligible
DevOps Engineer level Incident, change & problem management experience. This role is heavily operations-oriented, including on-call requirements Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL Proficient in one or more of Python, Go, Bash, SQL Familiar with GitHub/GitOps/container orchestration/Kubernetes operations Working configuration … and deployment management experience with CI/CD Skills AWS Prometheus Grafana Splunk Go SQL Job Title: SRE Location: London, UK Job Type: Contract Trading as TEKsystems. Allegis Group Limited, Maxis 2, Western Road, Bracknell, RG12 1RT, United Kingdom. No. (phone number removed). Allegis Group Limited operates as an Employment Business and Employment Agency as set out in the
stack, cross-functional teams, working closely with people of different specialisms within your team and across the business. AWS, Serverless, Terraform, C#, .NET Core, TypeScript, Node.js, GraphQL, React, Snowflake, Docker, Grafana GitHub for source control and continuous integration Developing solutions using Generative AI models Robust and performant cloud/serverless applications, with a focus on user experience and business growth. Backstage … of the technologies above, so if your experience doesn't cover some areas but you have cloud/serverless experience, please apply. How we get there Tools and Practices: Grafana, AWS CloudWatch, CI/CD pipelines. Methodologies: Test-Driven Development (TDD), Pair Programming, and Experimentation. Engineering Principles: We apply core engineering principles, including SOLID, KISS, Conway's Law, and the
internal tooling and services Hands-on experience with AWS, Kubernetes, Docker, and modern CI/CD pipelines Familiarity with infrastructure-as-code (e.g., Terraform) and observability tooling (e.g., Prometheus, Grafana) Comfortable working on distributed systems and improving developer workflows A product mindset and a collaborative approach to problem-solving Experience with Kafka, gRPC, or open-source contributions is a bonus
IP, VLANs, routing). You will bring some of these skills, but more importantly you're interested in learning these things: • Hardware & physical infrastructure. • Data-driven monitoring and observability (Grafana, InfluxDB, Prometheus, Elastic). • Exposure to configuration management (Puppet, Ansible, Terraform). • Some exposure to scripting (Bash, Python). • Supporting CI/CD delivery pipelines (GitLab, GitHub). 25 days
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Suits Me Limited
to enable rapid and reliable delivery of services Contributing to the design of scalable and secure platform components that enable developer productivity Building and improving observability tooling (e.g. CloudWatch, Grafana) to support rapid detection and resolution of issues Collaborating with developers and stakeholders across squads to understand infrastructure needs and ensure best practices are applied Writing technical documentation and contributing
Jest, or similar; • Experience in server technologies, specifically AWS; • Experience in Docker applications will be a plus; • Familiarity with open-source software is a plus; • Familiarity with JIRA, Jenkins, Elasticsearch, Grafana, Kibana is a plus; • Experience in Kubernetes will be a plus; Additional Information • Communication is extremely important; our engineers work across every team in the organization. Candidates need to be
as Azure, AWS or GCP. Experience with Kubernetes is desirable. You have a high degree of experience in observing the performance and health of applications via tools such as Grafana, Prometheus, Datadog, Sentry, etc. You have a strong desire for, and are an advocate of, performant applications. You have a flair for simplicity when problem solving. Excellent communication skills, with
Gloucester, Gloucestershire, South West Hybrid / WFH Options
CGI
Automation Tester (DV Security Clearance) Position Description CGI was recognised in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent
using programming languages. Python or Java is preferred. Full understanding of the end-to-end trade lifecycle (FX knowledge preferred) Experience using monitoring tools such as Splunk, Prometheus or Grafana. Expertise in containerization and related tools like Docker, Kubernetes, and CI/CD. Exposure to Linux/Unix and SQL This is a great opportunity for a Production Engineer to
Farnborough, Hampshire, England, United Kingdom Hybrid / WFH Options
Randstad Technologies
or private cloud platforms Proficient in Infrastructure as Code - Ansible, Terraform Skilled in CI/CD tools Solid scripting skills - PowerShell, Python, or equivalent Experience with monitoring tools - Prometheus, Grafana, Kibana Please note: Active SC Clearance is essential Hybrid working - Farnborough-based Day Rate: £450-£550/day Duration: 6 months | Inside IR35 If this seems of interest to you
on AWS and other providers Operating MongoDB (or other document database) clusters Operating Redis (or other key-value storage) clusters Administering Linux servers Maintaining distributed software Operating Prometheus and Grafana Operating logging collection and analysis systems Participating in the on-call rotation (04:00 - 16:00 UTC) Skills: Kubernetes & containers (advanced) AWS/EKS (advanced) Linux (advanced) Terraform and IaC … in general (proficient) Helm (proficient) Go and/or Python (familiar) MongoDB (or similar) Redis (or similar) Monitoring - Prometheus, Grafana, Thanos (familiar) Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.) Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP) Proactive, energetic, innovative and change oriented Nice to have: GCP or Azure Bare metal infrastructure engineering API
on AWS and other providers Operating MongoDB (or other document database) clusters Operating Redis (or other key-value storage) clusters Administering Linux servers Maintaining distributed software Operating Prometheus and Grafana Operating logging collection and analysis systems Working hours within 16:00 - 04:00 UTC Skills: Kubernetes & containers (advanced) AWS/EKS (advanced) Linux (advanced) Terraform and IaC in general (proficient … Helm (proficient) Go and/or Python (familiar) MongoDB (or similar) Redis (or similar) Monitoring - Prometheus, Grafana, Thanos (familiar) Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.) Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP) Proactive, energetic, innovative and change oriented Nice to have: GCP or Azure Bare metal infrastructure engineering API management experience Large
NPM for application builds Cypress for automated testing Git for source control Terraform and Ansible for infrastructure configuration OpenShift, RHEL/CentOS, and Docker for deployment targets InfluxDB and Grafana for monitoring and observability Oracle (or equivalent RDBMS), AMQP, and S3-compatible object storage systems Please note this role requires active UK C DV Clearance. Hold the necessary clearance but
Hounslow, London, United Kingdom Hybrid / WFH Options
Deerfoot Recruitment Solutions
and work independently across technical tasks What You'll Need Languages & Tools: Python, Ansible (C++, Go a plus), Git, Jira, Confluence Cloud & Infrastructure: Azure, Kubernetes, OpenShift Monitoring: Splunk, Prometheus, Grafana Databases: Oracle (OCA/OCP a plus) Environments: Linux/Unix Strong debugging, problem-solving, and collaboration skills Proven experience in DevOps and service reliability roles Interested? Apply now and
Nginx • Azure Load Testing • Azure Application Insights • Azure Kubernetes Service • Platform tuning experience Beneficial skills • Bicep • CloudFlare • ARM Templates • Familiar with Octopus Deploy • Knowledge of C# .NET • Prometheus/Grafana dashboards • Seq, Loki or other application logging software • VMs Company benefits • Full private health insurance through our healthcare partner, Vitality Health • Group Life Insurance and Income Protection • BUPA Dental
deep understanding of UNIX, Linux, networking (TCP/IP), and databases (both relational and NoSQL). Experience in Android and iOS application debugging. Experience with observability tools such as Grafana and Prometheus, and skills in documenting procedures for knowledge management. Strong interpersonal and communication skills to thrive in fast-paced, dynamic environments. NOTE: As part of the operation staff members
Chester, Cheshire, United Kingdom Hybrid / WFH Options
Lloyds Banking Group
such as Jest, Enzyme, React Testing Library, Pact, Cypress and Playwright. DevOps: Familiarity with CI/CD and build pipelines, using tools such as GitHub, Harness, Jenkins, Docker, ELK, Grafana and Dynatrace. Take ownership and responsibility for the lifespan of the things you contribute to. A "you build it, you run it" attitude. ABOUT WORKING FOR US Our ambition is