Prometheus Jobs in Cardiff

1 to 25 of 27 Prometheus Jobs in Cardiff

Site Reliability Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Spectrum IT Recruitment
CI/CD, or CircleCI. Strong understanding of containerisation (e.g., Docker, Kubernetes) and microservices architecture. Skilled in using observability and monitoring tools such as Prometheus, Grafana, ELK stack, or AWS CloudWatch. Excellent analytical and troubleshooting abilities, especially within complex distributed systems. Proven experience handling incident management and conducting blameless postmortems …

DevOps Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Inspirec
Orchestrate and manage containerized applications using Docker, supporting streamlined deployment and environment consistency across development and production. Implement comprehensive monitoring and alerting solutions with Prometheus, Grafana, and Alertmanager to proactively identify and resolve system performance issues. Champion DevOps best practices in automation, security, and agile delivery to drive continuous improvement …

Senior Software Engineer – Quant Full Stack & Infrastructure (Team Lead)

Cardiff, United Kingdom
Hybrid / WFH Options
Trireme
cloud-native deployment strategies. Hands-on with AWS, GCP, and Azure for compute, networking, and storage configurations. Familiarity with monitoring/logging tools (e.g., Prometheus, Grafana, ELK stack). Trading Systems & Finance: Solid understanding of trading infrastructure, latency optimization, execution systems, and market data feeds. Experience working in or with …

Site Reliability Engineer

Cardiff, United Kingdom
Ubique Systems
technologies (AWS, GCP, or Azure). Strong understanding of Site Reliability Engineering (SRE) practices and principles. Experience with observability and monitoring tools such as Prometheus, Grafana, ELK, Splunk, or Datadog. Familiarity with containerization (Docker, Kubernetes) and infrastructure as code (Terraform, CloudFormation) is a plus. Excellent problem-solving, debugging, and communication …

Senior Site Reliability Engineer

Cardiff, United Kingdom
Cipher7
Deep understanding of Container Orchestration technologies such as Kubernetes and Docker. Proficiency in monitoring and logging tools including: Datadog, Splunk, Dynatrace, AppDynamics, Prometheus, Grafana, ELK Stack, CloudWatch, Gremlin, ThousandEyes. Experience with Terraform, Jenkins, GitLab CI, PostgreSQL, Redis, and Kong API Gateway. Solid understanding of networking, security …

Senior Software Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Beazley Security
and cloud environments. Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, CircleCI). Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack). Strong problem-solving and analytical skills. Excellent communication and collaboration skills. Experience with version control systems (e.g., Git). Experience working …

Data Engineer (DV Cleared)

Cardiff, United Kingdom
Hybrid / WFH Options
LHH
or highly regulated sectors. Familiarity with Apache Kafka, Spark, or Hadoop. Experience with Docker and Kubernetes. Use of monitoring/alerting tools such as Prometheus, Grafana, or ELK. Understanding of machine learning algorithms and data science workflows. Proven ability to deliver end-to-end data solutions. Knowledge of Terraform, Ansible …

Senior DevOps Engineer - Interest/experience in Trading platforms

Cardiff, United Kingdom
Nixor Resource Consulting Ltd
/CD pipelines, infrastructure as code, and cloud automation (Azure preferred). Expertise in Docker, Kubernetes (desirable), Terraform (desirable), Git, and monitoring tools (ELK, Prometheus, or Application Insights). Proficiency in scripting, ideally with PowerShell and Python. Strong communication skills: able to influence, mentor, and challenge the status quo. Be …

Principal Backend Engineer (Equity only) 2%

Cardiff, United Kingdom
Luupli
e.g., RabbitMQ, Kafka). Deep understanding of API design and best practices (REST, gRPC). Experience with CI/CD pipelines, monitoring tools (e.g., Prometheus, Grafana), and logging systems (e.g., ELK stack). Strong problem-solving, organizational, and communication skills. Preferred: Experience with distributed systems, event-driven architectures, and …

Head of SRE and Production Engineering

Cardiff, United Kingdom
SS&C Technologies
the SDLC. Ensure pipeline scalability and governance while maintaining developer velocity. Observability & Troubleshooting: Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog). Establish SLOs, SLIs, and error budgets with product and engineering teams. Drive root cause identification using distributed tracing, advanced log analysis …

Site Reliability Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Durlston Partners
is a plus), infrastructure-as-code, and CI/CD tooling. Strong scripting and automation experience in Python and Bash. Familiarity with observability stacks (Prometheus, OpenTelemetry, eBPF). Cloud infrastructure experience (AWS/GCP/Azure), with attention to IAM and software supply chain security. Curious, persistent, and comfortable experimenting at …

Site Reliability Engineer

Cardiff, United Kingdom
DNSINFOLTD
in AWS Cloud (e.g., AWS, GCP, Azure) and Container Orchestration (e.g., Kubernetes, Docker). Proficiency in Monitoring and Logging Tools: Datadog, Splunk, Dynatrace, AppDynamics, Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), CloudWatch, Gremlin, ThousandEyes. Terraform, Jenkins, GitLab CI, PostgreSQL, Redis, Kong API. Infrastructure skills, Networking and Security Skills …

Platform Engineer

Cardiff, United Kingdom
Fruition Group
Hands-on experience with AWS, Kubernetes, Docker, and modern CI/CD pipelines. Familiarity with infrastructure-as-code (e.g., Terraform) and observability tooling (e.g., Prometheus, Grafana). Comfortable working on distributed systems and improving developer workflows. A product mindset and a collaborative approach to problem-solving. Experience with Kafka, gRPC, or …

Network Engineer - Low Latency - AWS - Palo Alto - Financial Services

Cardiff, United Kingdom
Rothstein Recruitment
GitLab CI/CD pipelines and GitOps principles. Knowledge of container orchestration platforms like Kubernetes (EKS). Experience with monitoring and observability tools including OpenSearch, Prometheus, and Grafana. Understanding of security best practices and AWS CIS Benchmark standards. Experience with low-latency network design and optimization. Strong verbal communication and documentation …

Cloud Platform Lead

Cardiff, United Kingdom
SoCode Recruitment
API Management and DevOps Pipelines and AWS including EKS, Lambda, and CloudFormation. Infrastructure as Code and GitOps: Terraform, Bicep, Pulumi, ArgoCD, and FluxCD. Observability: Prometheus, Grafana, OpenTelemetry, and Datadog. Security and Compliance: HashiCorp Vault, Azure Key Vault, AWS KMS, OPA Gatekeeper, and Drata or similar. AI Coding Tools: GitHub Copilot …

Senior Software Engineer

Cardiff, United Kingdom
Signify Technology
hybrid team environment (3 days a week onsite in London). Experience with Terraform, Kubernetes, or CI/CD pipelines. Familiarity with observability tooling (e.g. Prometheus, Grafana, Datadog). Experience mentoring or leading other engineers …

Senior Software Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Ocho
data engineering tools such as Airflow, Pandas, or Spark. Exposure to serverless architectures using AWS Lambda. Familiarity with monitoring and logging tools (e.g. CloudWatch, Prometheus). Previous experience working in regulated or high-availability environments. Location & Flexibility: This role can be fully remote, with optional visits to a UK-based office …

AI Tech Lead – Agentic AI, LangGraph, ML, Python, CI/CD, LLMs, Startup, UK Remote

Cardiff, United Kingdom
Hybrid / WFH Options
WMtech
GenAI, LLMs, and multimodal systems. Architecture: Microservices, RESTful APIs, async programming. Infrastructure: Docker, Terraform, GitHub Actions, GCP (preferred). Datastores: MongoDB, Redis. Monitoring/Tooling: Prometheus, Grafana, Sentry. The role is remote with occasional travel. Ready to lead and build with purpose? If you're excited by the idea of applying …

Senior HPC Support Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Nscale
HPC container runtimes (e.g., Singularity, Apptainer). Exposure to provisioning and automation tools (e.g., Ansible, PXE, Terraform). Experience with monitoring tools such as Prometheus, Grafana, and DCGM. Understanding of GPU/accelerator toolchains like CUDA or ROCm. A proactive, customer-first mindset with strong communication skills. Ability to work …

Java Backend Developer

Cardiff, United Kingdom
Qualient Technology Solutions UK Limited
Kubernetes for container orchestration. Strong knowledge of Git and version control practices. Experience with Kafka, RabbitMQ, or similar technologies. Familiarity with monitoring tools (e.g., Prometheus, Grafana) and logging frameworks. Knowledge of DevOps principles and practices. Strong analytical and problem-solving skills. Experience working in Agile/Scrum environments. Ability to …

Senior Backend Engineer (Go) - AI startup

Cardiff, United Kingdom
Hybrid / WFH Options
Few&Far
and observability tools. Bonus Points For: Contributions to open-source projects. Contributions to an AI product. ⚙️ Tech Stack: Golang, GCP, microservices, Kubernetes, Kafka, MongoDB, Prometheus. If scalability, security, databases and performance are your thing and you are looking for high ownership and impact, this role is for you! Please apply with an up …

DevOps Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Prism Digital
version, and manage infrastructure as code across multiple environments. GitHub Actions & OIDC – build and maintain automated CI/CD pipelines with secure authentication. Datadog, Prometheus or similar – implement logging, metrics, and alerting for robust observability. The interim CTO is keen to hear your recommendation(s) on tooling and implementation strategy. …

DevOps Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Ocho
teams to improve cloud architecture, security, and monitoring. Maintain and improve containerised environments using Docker, Kubernetes, and Helm. Implement logging, monitoring, and alerting solutions (Prometheus, Grafana, ELK, or similar). Drive best practices for Infrastructure as Code (IaC) and DevOps culture. Requirements: 3-5 years of DevOps, Software or Cloud Engineering … with Terraform, CloudFormation, or Ansible. Familiarity with cloud security best practices. Solid scripting skills (Python, Bash, or Go). Experience with monitoring/logging tools (Prometheus, ELK, Grafana). Please apply now if you meet the above criteria, or reach out to Andrew Harrison directly for a further conversation. Skills: AWS …

Senior Data Engineer

Cardiff, United Kingdom
Advanced Resource Managers
data flows and integration processes. Familiarity with containerization and orchestration tools such as Docker and Kubernetes. Experience with monitoring and alerting tools such as Prometheus, Grafana, or ELK for data infrastructure. Knowledge of security practices for handling sensitive data, including encryption, anonymization, and access control. Familiarity with data governance, data …

Site Reliability Engineer

Cardiff, United Kingdom
Hybrid / WFH Options
Harrington Starr
infrastructure. They’re looking to bring on a Site Reliability Engineer with deep experience in observability. If you’ve worked with tools like Prometheus in AWS, supported development teams with tracing and performance insights, and thrive in a high-scale, distributed environment, this could be a great next step. … What You’ll Be Doing: Managing and improving observability tools like Prometheus, Grafana, and CloudWatch. Helping product teams with tracing and monitoring to improve performance and reliability. Defining and improving SLIs/SLOs, automating tasks, and reducing operational noise. Working with AWS (EKS, EC2, Lambda, RDS), Terraform, and CI/… CD tools. What They’re Looking For: Experience in SRE or DevOps roles in a production environment. Strong knowledge of observability tools, especially Prometheus in AWS. Experience with tracing, metrics, and logs to support development teams. Skills in Python or Go, and a good understanding of AWS and Kubernetes. What …

Prometheus, Cardiff
10th Percentile: £53,000
25th Percentile: £53,750
Median: £57,500
75th Percentile: £61,250
90th Percentile: £62,000