products under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, Datadog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such as Jenkins
/Unix systems, SQL, and programming languages such as C++, Java or Python. Strong understanding of distributed systems and low-latency architectures. Hands-on experience with observability stacks (e.g., Prometheus, Grafana, Splunk, Geneos, OpenTelemetry) and infrastructure automation (e.g., Ansible, Terraform, CI/CD pipelines). Strong understanding of the trade lifecycle, market data, and fixed income products, FX or algorithmic trading
languages: Java, Python, Golang. Container orchestration/Cloud platform: Red Hat OpenShift/AWS/Azure. DevOps tools - Ansible, Chef, Kubernetes, GitLab. SRE logging & Monitoring Tools - ELK stack, Grafana, Prometheus, OpenTelemetry. Other highly valued skills include: Strong understanding of Agile application development methodology. Strong knowledge of API development/principles. Collaborating with the development teams to build scalable and
GitHub Actions, or CircleCI. Strong testing capabilities using JUnit, RestAssured, or similar frameworks. Proactive with monitoring, observability, and system health. Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty. Familiarity with Python scripting. Experience with Kubernetes and deployment tools such as Helm. Why Join H&B Tech? Help define the future of digital health & wellness in a
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Gamma Communications plc
with automation, IaC, and CI/CD principles. Understand Network concepts, Infrastructure, and common protocols. Able to write basic scripts for automation. Able to build dashboards in Grafana, with an understanding of Prometheus and PromQL. Knowledge of SDLC and experience integrating solutions into CI pipelines. Experience with cloud (AWS, GCP) is beneficial, but not essential. Able to self-manage Jira tickets and provide
Databases - Postgres, MariaDB, MongoDB, ClickHouse, Redis, JupyterLab, Metabase. Data Engineering & Orchestration - Python, Airflow, Kafka, DataHub. Cloud & Infrastructure - AWS, K8s. DevOps & CI/CD - Git, GitLab CI, DBS, Grafana, ELK, Prometheus, Docker, Docker Compose. Why join us? Shape the future of a data business at the forefront of global payments insights. A chance to work with a vibrant, friendly team in
that we continuously improve. Monitoring and Optimization: Monitor AWS infrastructure performance, troubleshoot issues, and implement optimizations for cost and performance. Implement logging, monitoring, and alerting solutions using AWS CloudWatch, Prometheus, Grafana, and other monitoring tools. Conduct periodic reviews of infrastructure to identify opportunities for optimization and cost reduction. Security and Compliance: Implement security best practices and compliance standards in AWS
Hands-on experience with classic hosting technologies (e.g. Kubernetes, AWS) • Familiarity with telephony technologies such as SIP, session border controllers, and related components. • Familiarity with observability tools such as Prometheus, Grafana, and Loki. • Strong experience in the Microsoft technology stack • Proficiency in tools such as GitLab, Docker, Terraform, CI/CD, and various deployment architectures. • Strong understanding of cloud cost governance
London, South East, England, United Kingdom Hybrid / WFH Options
Lorien
ability to work independently or lead a small team. Nice to Have: Experience with TYK API Gateway. Exposure to microservices and event-driven architectures. Familiarity with observability tools (e.g., Prometheus, Grafana). Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
City of London, London, United Kingdom Hybrid / WFH Options
Searchability NS&D
with Kubernetes, Docker, Helm. Proficient in Terraform, CI/CD Pipelines (Drone/GitLab). Excellent understanding of Kafka internals, stream processing, and secure Kafka deployments. Strong experience across monitoring (Prometheus, Grafana, CloudWatch). Knowledge of security hardening, IAM, WAF, Shield, Vault. Working knowledge of Agile, Infrastructure-as-Code, and DevSecOps practices. UK*C or Enhanced DV (eDV) Clearance is a must
Hands-on experience in technical integrations and POCs. Comfortable coding in any high-level programming language (Java, Go, Python). Strong hands-on knowledge of Kubernetes, AWS, Azure, GCP, Docker, Prometheus, and OpenTelemetry. Industry knowledge and opinions on Monitoring, Observability, Log Management, SIEM. Engineering/DevOps Background - advantage. Experience in Technical Sales of Log Analytics/Monitoring/APM/SIEM
setting up and managing monitoring, metrics, and alerting systems. Experience operating production-grade services at scale. Great to have: Experience with tools such as: Terraform, SaltStack, MongoDB, Elasticsearch, Kafka, Prometheus, Grafana or HashiCorp Vault. Experience with securing applications, services, and data, including authentication, authorization, TLS, and encryption. Exposure to Kubernetes (administering, deploying, or developing apps on K8s clusters). Understanding of
AWS Control Tower, GCP Resource Manager, etc. Network - AWS Transit Gateway, GCP Shared VPC, AWS Route53, GCP Cloud DNS, etc. Observability - AWS OpenSearch, GCP Monitoring/Traces, OpenTelemetry, Grafana, Prometheus, etc. Automation Prowess: Hands-on experience with modern Infrastructure as Code (IaC) automation tools and frameworks (Terraform, Jenkins, Ansible, etc.). Software Development Acumen: A software development background is highly
Azure, AWS or GCP. Experience with Kubernetes is desirable. You have a high degree of experience in observing the performance and health of applications via tools such as Grafana, Prometheus, Datadog, Sentry, etc. You have a strong desire for, and are an advocate of, performant applications. You have a flair for simplicity when problem solving. Excellent communication skills, with the
via Grafana or Power BI). Ideally, Infrastructure as Code with CloudFormation/ARM templates, Terraform and Ansible. Ideally, Linux Server Administration including container technology & ecosystem (Docker, Kubernetes, Prometheus) linked to AAD. Ideally, experience in telecommunications and similar regulated verticals and environments. Ideally, working knowledge of ISO 27000, ITIL, or similar regulated environment. Ideally, exposure to CRM & ERP systems
testing. Strong knowledge of containerisation (e.g., Docker) and orchestration (e.g., Kubernetes). Deep understanding of cloud security principles: IAM, network security, encryption. Experience with monitoring/alerting tools (e.g., Prometheus, Grafana, ELK stack). Proficient in Git or other version control systems. Desirable Knowledge, Skills and Experience: Certifications in OCI or other cloud platforms (AWS, GCP). Experience with security
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
observability, and cost optimisation. Nice to Have: Experience with ML tooling (MLflow, Kubeflow). Knowledge of FastAPI, Databricks, or Snowflake. Exposure to SRE practices or cloud security certifications. Familiarity with Prometheus, Grafana, or Datadog. Interested? If you want to be part of a world-class AI team at an early stage, where your infrastructure decisions will directly shape the company's
and live data visualisation. Collaborate with QA and DevOps to enhance automated testing and deployment pipelines. Lead efforts in securing, scaling, and monitoring the frontend environment. Use observability tools (Prometheus, Grafana, Loki) to monitor UI health and performance. Drive UI architectural decisions, performance benchmarking, and best practice implementation. Skills and Experience Required: Degree in Computer Science, Engineering, or a related
Guildford, Surrey, United Kingdom Hybrid / WFH Options
Person Centred Software Ltd
and communication skills across distributed teams. Bonus points for experience with: Flutter, Blazor, Angular, React, microservices, SaaS platforms, Azure services (Functions, Service Bus), GitLab CI/CD, monitoring tools (Prometheus, Azure App Insights), high availability systems. What We Offer: A base salary of £60,000 - £75,000 and bonus depending on experience. Modern town centre offices in Guildford, with opportunity for ad
and predictive analytics. Understanding of AI frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn) and their application in network automation and monitoring. Experience with telemetry and observability frameworks (e.g., Prometheus, Grafana) for real-time network monitoring and troubleshooting. Experience: Minimum of 7 years of experience in network engineering, operations, and support. Proven ability to work hands-on and take strong