and automation. Strong knowledge of CI/CD tooling, IaC, and cloud-native technologies. Advanced scripting (Bash, Python) and automation experience. Skilled in monitoring and observability tools (e.g., Prometheus, Grafana, ELK). Strong problem-solving, communication, and leadership skills. Familiarity with and experience of: CI/CD Tools: Jenkins, GitLab CI; Infrastructure as Code: Terraform, Ansible, Helm; Cloud Platforms: AWS, Azure More ❯
Hemel Hempstead, Hertfordshire, South East, United Kingdom Hybrid / WFH Options
Eckoh PLC
and automation tooling (GitLab experience preferable). Experience with 'infrastructure as code' (Terraform, CloudFormation), containerisation (Docker), and orchestration (Kubernetes). Proficiency with observability and monitoring solutions (e.g., CloudWatch, Prometheus, Grafana, Splunk). Strong understanding of cloud-native development practices and agile ways of working. Confident conducting peer code reviews and providing constructive technical feedback. Desirables: Experience designing solutions in multi More ❯
of networking concepts, protocols, and security principles. Familiarity with infrastructure as code (IaC) tools and configuration management frameworks (e.g. Terraform). Knowledge of monitoring and logging tools (e.g. Prometheus, Grafana, ELK Stack, AWS CloudWatch) for infrastructure and application monitoring. Excellent problem-solving skills, attention to detail, and ability to work independently and collaboratively in a fast-paced environment. Effective communication More ❯
for storage infrastructure configuration and deployment. Develop Infrastructure-as-Code (IaC) solutions (e.g., using Terraform, Ansible) for scalable and repeatable storage provisioning. Integrate monitoring dashboards and alerting systems (e.g., Grafana, Prometheus, ELK) to ensure visibility into storage health and performance. Collaborate with infrastructure, platform, and cloud teams to align automation with operational goals. Ensure solutions meet enterprise standards for security More ❯
Wokingham, Berkshire, United Kingdom Hybrid / WFH Options
Nordcloud
stakeholders Your key skills: L1 to L3 networking CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, Bash Observability/Monitoring: Prometheus, Grafana, Splunk Must have experience with either Kubernetes or OpenShift Hosting technologies such as IIS, nginx, Apache, App Service, Lightsail Analytical and creative approach to problem solving We encourage you to More ❯
business requirements Essential Requirements Specialist Knowledge: Demonstrable experience in observability engineering, infrastructure monitoring, or event management roles Experience with traditional and modern observability stacks such as SCOM, SolarWinds, Prometheus, Grafana and Elastic Stack (ELK) Hands-on experience with BMC Helix Operations Manager, TrueSight, or similar enterprise monitoring platforms Solid understanding of AIOps concepts, including event correlation, noise reduction, anomaly detection More ❯
on Git-based commercial source control or similar (e.g., Azure DevOps, GitHub including Actions, GitLab, Bitbucket etc.). Good to have Ideally, developing/configuring and publishing dashboards (ideally via Grafana or Power BI). Ideally, Infrastructure as Code with CloudFormation/ARM templates, Terraform and Ansible. Ideally, Linux Server Administration including container technology & ecosystem (Docker, Kubernetes, Prometheus) linked to More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Lorien
to work independently or lead a small team Nice to Have: Experience with TYK API Gateway Exposure to microservices and event-driven architectures Familiarity with observability tools (e.g., Prometheus, Grafana) Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy. More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
and cost optimisation Nice to Have Experience with ML tooling (MLflow, Kubeflow) Knowledge of FastAPI, Databricks, or Snowflake Exposure to SRE practices or cloud security certifications Familiarity with Prometheus, Grafana, or Datadog Interested? If you want to be part of a world-class AI team at an early stage, where your infrastructure decisions will directly shape the company's success More ❯
Guildford, Surrey, United Kingdom Hybrid / WFH Options
Electronic Arts
e.g. Perforce, Git) Configuration management tools (e.g. Chef, Ansible, Terraform, Packer) Secrets management tools (e.g. Vault) Virtualization environments and tools (e.g. VMs, vSphere) Data and Observability tools (e.g. Splunk, Grafana, New Relic, OpenTelemetry) Growth-oriented mindset About Electronic Arts We're proud to have an extensive portfolio of games and experiences, locations around the world, and opportunities across EA. More ❯
GitLab, GitHub Actions, or CircleCI Strong testing capabilities using JUnit, RestAssured, or similar frameworks Proactive with monitoring, observability, and system health Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty Familiarity with Python scripting Experience with Kubernetes and deployment tools such as Helm Why Join H&B Tech? Help define the future of digital health & wellness in More ❯
welcome Proficiency in testing frameworks like JUnit and RestAssured A passion for monitoring, observability, and maintaining resilient systems Desirable Skills: Experience with monitoring and alerting tools like Datadog, Prometheus, Grafana, or PagerDuty Exposure to Python scripting Familiarity with deployment platforms such as Kubernetes and tools like Helm Why Join H&B Tech? Be part of a fast-moving, forward-thinking More ❯
Strong knowledge of containerisation (e.g., Docker) and orchestration (e.g., Kubernetes). Deep understanding of cloud security principles: IAM, network security, encryption. Experience with monitoring/alerting tools (e.g., Prometheus, Grafana, ELK stack). Proficient in Git or other version control systems. Desirable Knowledge, Skills and Experience: Certifications in OCI or other cloud platforms (AWS, GCP). Experience with security tools More ❯
infrastructure as code. Good understanding of networking and network protocols. Desirable: Experience with scripting or programming in Python. Experience working with Amazon Web Services (AWS). Experience working with Grafana (LGTM) and Prometheus. Experience working with highly available and distributed infrastructure. More ❯
live data visualisation Collaborate with QA and DevOps to enhance automated testing and deployment pipelines Lead efforts in securing, scaling, and monitoring the frontend environment Use observability tools (Prometheus, Grafana, Loki) to monitor UI health and performance Drive UI architectural decisions, performance benchmarking, and best practice implementation Skills and Experience Required Degree in Computer Science, Engineering, or a related field More ❯
variety of CI/CD tools and technologies (e.g., Git, GitLab, Jenkins, GCP, AWS) Knowledge of containerisation and microservice architecture Ability to develop dashboard UIs for publishing performance (e.g., Grafana, Apache Superset, etc.) Exposure to safety certification standards and processes We provide: Competitive salary, benchmarked against the market and reviewed annually Company share programme Hybrid and/or flexible work More ❯
Support Kubernetes/OpenShift environments and application deployments Enable developers through onboarding and technical support Maintain and improve CI/CD pipelines (Tekton, Argo CD) Monitor systems using Prometheus, Grafana, Splunk, Loki, and EFK Automate infrastructure provisioning using scripting and IaC tools Collaborate with vendors and internal teams for issue resolution What You'll Bring Strong Linux (Red Hat) and More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
INTEC SELECT LIMITED
Conduct architecture reviews, technical audits, and drive adoption of best practices Partner with infrastructure teams to ensure system reliability and operational efficiency Integrate monitoring and logging solutions (e.g., Prometheus, Grafana, ELK) Define strategies for disaster recovery, scaling, and infrastructure resilience Improve observability by enhancing visibility into performance and error metrics Skills and Experience Required 10+ years of backend development experience More ❯
cloud/Linux fundamentals. Curiosity and the confidence to ask questions in a fast-moving team. Nice-to-haves Exposure to Kubernetes, Docker or Terraform. Experience with observability stacks (Grafana, Prometheus, OpenTelemetry). Familiarity with Postgres. Interest in data-privacy, AdTech/MarTech or large-scale data processing. Familiarity with Kafka, gRPC or Apache Spark. As well as working as More ❯
Fleet, Hampshire, United Kingdom Hybrid / WFH Options
Minutes To Seconds
Expert in Linux systems (systemd, networking, kernel tuning), Kubernetes internals, and container runtimes Real-world application of SRE principles in high-stakes, always-on environments Strong background operating Prometheus, Grafana, and Elasticsearch/Fluentd/Kibana (ELK/EFK) stacks Preferred Qualifications Experience integrating Kubernetes with OpenStack and Magnum Knowledge of Rancher add-ons: Fleet, Longhorn, CIS Scanning Familiarity with More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
Comfortable managing deployments using CI/CD pipelines (GitHub Actions, Jenkins, etc.) * Solid understanding of cloud infrastructure including AWS, Kubernetes, and content delivery * Exposure to observability tooling (Datadog, Sentry, Grafana) and performance tuning best practice Reference Number: BBBH259301 To apply for this role or to be considered for further roles, please click "Apply Now" or contact Tommy Williams at More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
incidents The Person: * 5+ years in SRE, DevOps, or infrastructure engineering * Strong experience with AWS, EKS/Kubernetes, and Terraform * Familiar with Kafka and observability tools like Datadog or Grafana * Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles, please click "Apply Now" or More ❯
in Computer Science or a related field (or equivalent experience). Preferred Qualifications: Full-stack data platform knowledge. Experience working with OAuth/OIDC and IAM technologies. Familiarity with Grafana, Datadog, or similar monitoring tools. Prior experience developing pipelines in Prophecy IDE. *Rates depend on experience and client requirements More ❯
Bracknell, Berkshire, United Kingdom Hybrid / WFH Options
Techex
ST 2022, ST 2110) Ability to use test equipment/software to analyse MPEG streams Experience of public cloud platform architecture/design Experience with any of Influx, Redis, Kafka, Grafana, or Kibana Our Values and Benefits Techex has an impressive history with extremely high customer engagement and satisfaction. As a business we have developed consistently through our stellar reputation in the More ❯