problem-solving and analytical abilities. Excellent communication and teamwork skills. Eagerness to learn and adapt in a fast-paced trading environment. Desirable: Experience with metrics & monitoring, OpenTelemetry, Splunk, Prometheus, Grafana, etc. Experience and knowledge of working with distributed systems. Experience with Kubernetes. Knowledge of networking (HTTP/TCP/UDP/IP). Experience in financial markets. Experience working in
Docker and container orchestration (ECS, EKS, or Kubernetes) Experience setting up CI/CD pipelines using GitHub Actions or similar tools Familiarity with monitoring and alerting tools (e.g. Prometheus, Grafana, CloudWatch, Sentry, DataDog) A security-first mindset when designing and managing infrastructure Nice to Haves Experience working in regulated or high-trust environments Knowledge of zero-downtime deployment patterns and
charts and managing container infrastructure. Knowledge of GitOps tools (e.g. ArgoCD). Knowledge of Service Mesh technologies (e.g. Anthos). Exposure to monitoring, logging, and observability tooling (e.g. Prometheus, Grafana, GCP Operations Suite). Behavioural Competencies Cross-Team Collaboration: Works effectively with engineering, security, support, and governance to improve platform maturity. Problem Solving: Identifies platform bottlenecks and works to resolve
on git-based commercial source control or similar (e.g., Azure DevOps, GitHub including Actions, GitLab, Bitbucket, etc.). Good to have: Ideally, developing/configuring and publishing dashboards (ideally via Grafana or Power BI). Ideally, infrastructure as code with CloudFormation/ARM templates, Terraform and Ansible. Ideally, Linux server administration including container technology & ecosystem (Docker, Kubernetes, Prometheus) linked to
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Netcompany UK Limited
Kubernetes Service (AKS), Azure Synapse Analytics, or Azure Cognitive Services Azure certifications, such as Azure Solutions Architect Expert or DevOps Engineer Expert Experience with infrastructure monitoring tools like Prometheus, Grafana, or Azure Monitor at scale Background in implementing disaster recovery and high-availability solutions for critical systems Qualifications Bachelor's or Master's degree in Computer Science, Information Technology, or
Salford, Manchester, United Kingdom Hybrid / WFH Options
Lloyds Bank plc
error budgets, and incident response. Experience with infrastructure as code (e.g., Terraform, Deployment Manager) and CI/CD pipelines. Proficiency in monitoring, logging, and observability tools (e.g., Stackdriver, Prometheus, Grafana). Knowledge of Linux systems, networking, and cloud security best practices. It would be great if you also had Experience working in DevOps environments, with a focus on automation, scalability
London, South East, England, United Kingdom Hybrid / WFH Options
Lorien
to work independently or lead a small team Nice to Have: Experience with TYK API Gateway Exposure to microservices and event-driven architectures Familiarity with observability tools (e.g., Prometheus, Grafana) Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
and cost optimisation Nice to Have Experience with ML tooling (MLflow, Kubeflow) Knowledge of FastAPI, Databricks, or Snowflake Exposure to SRE practices or cloud security certifications Familiarity with Prometheus, Grafana, or Datadog Interested? If you want to be part of a world-class AI team at an early stage, where your infrastructure decisions will directly shape the company's success
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
and maintaining highly reliable infrastructure and services. Expertise in incident management, including incident response, resolution, and post-mortem analysis. Proficiency in monitoring, alerting, and observability tools such as Prometheus, Grafana, ELK stack or Datadog. Experience with cloud platforms such as AWS, Azure, or GCP, including infrastructure as code tools like Terraform or CloudFormation. Strong scripting and automation skills, with proficiency
Micro Frontends and BFFs Hands-on expertise in React and TypeScript development with an eye for performance and resilience Proven ability to implement observability practices using tools like Prometheus, Grafana, or Azure Monitor Proficiency in containerisation and orchestration (Docker, Kubernetes - ideally AKS or GKE) Experience building and maintaining CI/CD pipelines for frontend applications (e.g. Azure DevOps, GitHub Actions
driven development, using tools such as Jest, React Testing Library, Cypress, Playwright or Pact. Practical knowledge of CI/CD pipelines and observability tooling - including GitHub, Jenkins, Docker, ELK, Grafana, and Dynatrace. Demonstrated ability to manage, mentor and develop engineers within a delivery-focused team. A collaborative and delivery-focused mindset, comfortable leading from both a technical and team development
Guildford, Surrey, United Kingdom Hybrid / WFH Options
Electronic Arts
e.g. Perforce, Git) Configuration management tools (e.g. Chef, Ansible, Terraform, Packer) Secrets management tools (e.g. Vault) Virtualization environments and tools (e.g. VMs, vSphere) Data and Observability tools (e.g. Splunk, Grafana, New Relic, OpenTelemetry) Growth-oriented mindset About Electronic Arts We're proud to have an extensive portfolio of games and experiences, locations around the world, and opportunities across EA.
Strong knowledge of containerisation (e.g., Docker) and orchestration (e.g., Kubernetes). Deep understanding of cloud security principles: IAM, network security, encryption. Experience with monitoring/alerting tools (e.g., Prometheus, Grafana, ELK stack). Proficient in Git or other version control systems. Desirable Knowledge, Skills and Experience: Certifications in OCI or other cloud platforms (AWS, GCP). Experience with security tools
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Gamma Communications plc
you'll need: Comfortable with automation, IaC, and CI/CD principles. Understand network concepts, infrastructure, and common protocols. Able to write basic scripts for automation. Able to build dashboards in Grafana, with an understanding of Prometheus and PromQL. Knowledge of SDLC and experience integrating solutions into CI pipelines. Experience with cloud (AWS, GCP) is beneficial, but not essential. Able to self-manage
infrastructure as code. Good understanding of networking and network protocols. Desirable: Experience with scripting or programming in Python. Experience working with Amazon Web Services (AWS). Experience working with Grafana (LGTM) and Prometheus. Experience working with highly available and distributed infrastructure.
CD tools (e.g., Jenkins, GitLab CI, GitHub Actions). Solid knowledge of Linux systems, networking, and containerisation (Docker). Understanding of monitoring and tracing tools such as Prometheus, Jaeger, Grafana. Strong analytical, troubleshooting, and communication skills. If you are interested in this opportunity, please apply now with your updated CV in Microsoft Word/PDF format. Disclaimer Notwithstanding any
delivery Experience building and maintaining CI/CD pipelines, preferably with Azure DevOps Solid grasp of Git version control and GitOps principles Familiarity with observability tooling such as Prometheus, Grafana, or GCP Operations Suite Scripting ability with tools like Bash or Python Understanding of shared service models, access control, and platform support processes Desirable: experience with ArgoCD and service mesh
live data visualisation Collaborate with QA and DevOps to enhance automated testing and deployment pipelines Lead efforts in securing, scaling, and monitoring the frontend environment Use observability tools (Prometheus, Grafana, Loki) to monitor UI health and performance Drive UI architectural decisions, performance benchmarking, and best practice implementation Skills and Experience Required Degree in Computer Science, Engineering, or a related field
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
CACI Limited
Terraform Background in configuration management and automation Proficiency with containers and orchestration tools such as Helm, Docker, and Kubernetes Experience designing monitoring and logging solutions like CloudWatch, ELK, and Grafana Basic programming skills in at least one language Experience developing and managing CI/CD pipelines Security Clearance: Due to industry requirements, the candidate must obtain high-level security clearance
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Suits Me Limited
to enable rapid and reliable delivery of services Contributing to the design of scalable and secure platform components that enable developer productivity Building and improving observability tooling (e.g. CloudWatch, Grafana) to support rapid detection and resolution of issues Collaborating with developers and stakeholders across squads to understand infrastructure needs and ensure best practices are applied Writing technical documentation and contributing
up and managing monitoring, metrics, and alerting systems Experience operating production-grade services at scale Great to have: Experience with tools such as: Terraform, SaltStack, MongoDB, Elasticsearch, Kafka, Prometheus, Grafana or HashiCorp Vault Experience with securing applications, services, and data, including authentication, authorization, TLS, and encryption Exposure to Kubernetes (administering, deploying, or developing apps on K8s clusters) Understanding of compliance