Prometheus Jobs in the UK

251 to 275 of 301 Prometheus Jobs in the UK

Information Technology Support Engineer

East Sussex, England, United Kingdom
Hybrid / WFH Options
Areti Group | B Corp™
pressure. Desirable Skills: Experience with cloud platforms (Microsoft Azure or AWS). Familiarity with configuration management tools (Ansible, Puppet, SCCM). Experience in monitoring and alerting systems (e.g., Zabbix, Prometheus, SolarWinds). Scripting experience (Bash, PowerShell, or Python). Knowledge of ITIL processes and service management best practices. Previous experience in utilities, energy, or critical infrastructure sectors.

Platform Engineer - HPC, AI and ML

City of London, London, United Kingdom
Cloud People
NeuralMesh distributed AI storage for high-speed data access and resilience • Implementing CI/CD and MLOps pipelines using Argo Workflows, Jenkins and GitHub • Monitoring platform performance using Zabbix, Prometheus and Grafana • Integrating SAN and InfiniBand networking to achieve high throughput and reliability • Creating detailed documentation and performing knowledge transfer to operations teams • Providing ongoing platform support, patching, troubleshooting and …

Platform Engineer - HPC, AI and ML

London Area, United Kingdom
Cloud People
NeuralMesh distributed AI storage for high-speed data access and resilience • Implementing CI/CD and MLOps pipelines using Argo Workflows, Jenkins and GitHub • Monitoring platform performance using Zabbix, Prometheus and Grafana • Integrating SAN and InfiniBand networking to achieve high throughput and reliability • Creating detailed documentation and performing knowledge transfer to operations teams • Providing ongoing platform support, patching, troubleshooting and …

Deep Learning Engineer

Edinburgh, Scotland, United Kingdom
Predictiva
in AI, ML, Computer Science, or a related field. Understanding of Reinforcement Learning algorithms. Experience with cloud services (AWS, Azure, GCP). Familiarity with tools such as Kafka, Kubernetes, Prometheus, and Grafana. Interest or prior experience in financial markets. Why Join Us At Predictiva, you’ll have the chance to work at the intersection of AI research, financial innovation, and …

Information Technology Developer

City of London, London, United Kingdom
W1M Wealth & Investment Management
Services SQL Server (Including T-SQL) Angular (with TypeScript) RabbitMQ/Kafka Various Azure Features (App Services, VMs, Config etc...) Git Snowflake NuGet (Producing and Consuming) Azure DevOps (CI) Prometheus & Grafana (Monitoring & Alerting) ELK Stack/Azure Log Analytics (Logging) We are also in the middle of a transformational migration to Azure. This role sits in the IT Development team …

Information Technology Developer

London Area, United Kingdom
W1M Wealth & Investment Management
Services SQL Server (Including T-SQL) Angular (with TypeScript) RabbitMQ/Kafka Various Azure Features (App Services, VMs, Config etc...) Git Snowflake NuGet (Producing and Consuming) Azure DevOps (CI) Prometheus & Grafana (Monitoring & Alerting) ELK Stack/Azure Log Analytics (Logging) We are also in the middle of a transformational migration to Azure. This role sits in the IT Development team …

Reliability Engineer

London Area, United Kingdom
BGC Group
ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related incidents, including root … cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. Perform capacity planning, scaling, and tuning of Solace infrastructure to meet current and … background in production support, preferably in a 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling. Familiarity with Linux/Unix …

Reliability Engineer

City of London, London, United Kingdom
BGC Group
ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related incidents, including root … cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. Perform capacity planning, scaling, and tuning of Solace infrastructure to meet current and … background in production support, preferably in a 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling. Familiarity with Linux/Unix …

Platform Engineer

City of London, London, England, United Kingdom
Revybe IT Recruitment Ltd
and platform engineering. Tech Stack Cloud: AWS (EC2, RDS, S3, IAM, CloudWatch, Lambda) Infrastructure as Code: Terraform Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm Configuration Management: Ansible Monitoring & Observability: Grafana, Prometheus CI/CD: GitHub Actions Automation & Scripting: Python, Bash, Go or Java What We’re Looking For Proven experience running AWS cloud infrastructure in a production or regulated (financial) environment. … Hands-on experience managing Kubernetes clusters (preferably EKS). Strong understanding of Infrastructure as Code using Terraform. Familiarity with monitoring and observability stacks such as Prometheus and Grafana. Experience building and maintaining CI/CD pipelines (GitHub Actions or similar). Strong scripting or automation skills using Python, Bash, Go or Java. A collaborative mindset: comfortable working alongside developers …
Employment Type: Full-Time
Salary: £65,000 - £80,000 per annum

Senior DevOps Platform Engineer

London, England, United Kingdom
CDW UK
Build and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, DataDog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. Collaborate closely with Salesforce … and Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, DataDog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ML platforms, real-time data …

Cloud Architect

Oxford, England, United Kingdom
Experis UK
Apigee), messaging (SQS/SNS/Service Bus/PubSub), event‐driven design. Operations & Reliability Observability stack (CloudWatch/CloudTrail, Azure Monitor/Log Analytics, Cloud Logging/Monitoring; Prometheus/Grafana). DR/BCP architectures (cross‐region, multi‐region, backups, runbooks; tested failover). Performance testing, capacity planning, SLO/SLIs, error budgets. Governance & Cost Landing zone governance … Data/Integration: Event Hubs/Kafka/PubSub, API Gateway/APIM/Apigee, Data Factory/Glue/Cloud Data Fusion, BigQuery/Synapse/Redshift. Observability: Prometheus/Grafana, OpenTelemetry, CloudWatch, Azure Monitor, Cloud Monitoring, ELK/Elastic. Scripting: Python/Bash/PowerShell; strong Git and code review practices. Certifications (Nice to Have) Azure: AZ …

Senior Data Scientist

England, United Kingdom
Wyatt Partners
React on the Frontend. Tech & Data Science stack: Kubernetes & Docker on Google Cloud Python 3: Pandas, RabbitMQ, Celery, Flask, SciPy, NumPy, Dash, Plotly, Matplotlib JavaScript, React, Redux PostgreSQL, Redis Prometheus, Alert Manager, DataDog If you joined the company in a Data Science role you would be working on sophisticated pricing algorithms which would enable companies in the entertainment industry to …
Employment Type: Permanent
Salary: GBP Annual

Technical Operations Associate, Equities

London Area, United Kingdom
Hybrid / WFH Options
ARC IT Recruitment
to a follow-the-sun model. Key Requirements: TechOps/Production Engineering/SRE experience supporting equities platforms. Tooling exposure: Kubernetes/containers, CI/CD, Terraform, Datadog/Prometheus/Splunk/Geneos. Practical understanding of market microstructure, exchange connectivity, and TCA/controls. Composed, commercially aware communicator with traders and senior leadership. Package & set-up Competitive base + …

Technical Operations Associate, Equities

City of London, London, United Kingdom
Hybrid / WFH Options
ARC IT Recruitment Ltd
and contribution to a follow-the-sun model. Key Requirements: TechOps/Production Engineering/SRE experience supporting equities platforms. Tooling exposure: Kubernetes/containers, CI/CD, Terraform, Datadog/Prometheus/Splunk/Geneos. Practical understanding of market microstructure, exchange connectivity, and TCA/controls. Composed, commercially aware communicator with traders and senior leadership. Package & set-up Competitive base + …
Employment Type: Permanent, Work From Home

Founding Site Reliability Engineer (SRE)

England, United Kingdom
Hybrid / WFH Options
Gizmo
PostgreSQL, sharded MySQL). You have software engineering experience. Strong backend fundamentals around concurrency, caching, indexing and distributed systems trade-offs. Proven track record of setting SLOs, building dashboards (Prometheus/Grafana, OpenTelemetry, etc.) and tuning alerts. Comfort with Kubernetes, IaC and cloud-native patterns; can debug from network to application layer. Self-starter with a maker mindset. We're …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer - Unified Client Experience (UCX)

England, United Kingdom
Hargreaves Lansdown PLC
a big plus. Capable of writing clean, maintainable and well-tested code. Comfortable working in on-prem and cloud-native environments with an interest in observability, using tools like Prometheus and Grafana to keep services healthy and maintainable. Familiarity with AWS services and how to integrate them into modern applications. A keen focus on quality and security, combining testing and …
Employment Type: Permanent
Salary: GBP Annual

Senior Backend Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …

Senior Backend Engineer

London Area, United Kingdom
Hybrid / WFH Options
M-XR
models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track record …

Java developer

Manchester, Lancashire, England, United Kingdom
Opus Recruitment Solutions Ltd
and deployment of these services all the way to production in a controlled and secure way. Tech stack - Java engineer needs experience with Spring Boot framework, TDD, Grafana and Prometheus for monitoring and alerting and understanding of the CI/CD process. All candidates must pass a BPSS. Immediate start. End March 2026. Weekly travel to Leeds/Newcastle/Manchester. £400 - £500 per …
Employment Type: Contractor
Rate: £400 - £500 per day

Java developer

Manchester, United Kingdom
Opus Recruitment Solutions
and deployment of these services all the way to production in a controlled and secure way. Tech stack - Java engineer needs experience with Spring Boot framework, TDD, Grafana and Prometheus for monitoring and alerting and understanding of the CI/CD process. All candidates must pass a BPSS. Immediate start. End March 2026. Weekly travel to Leeds/Newcastle/ …
Employment Type: Contract
Rate: £400 - £500/day

Site Reliability Engineer - Global Hedge Fund

London Area, United Kingdom
Paragon Alpha - Hedge Fund Talent Business
their aggressive growth plans, they are looking for a pragmatic and commercially oriented SRE to design, implement and maintain scalable and reliable systems. Tech Stack: Python/C++, Terraform, Prometheus, Kubernetes, Cloud Computing The core function of the role is to monitor and maintain uptime for trading systems, pricing engines and risk management tools. The client can offer market leading …

Site Reliability Engineer - Global Hedge Fund

City of London, London, United Kingdom
Paragon Alpha - Hedge Fund Talent Business
their aggressive growth plans, they are looking for a pragmatic and commercially oriented SRE to design, implement and maintain scalable and reliable systems. Tech Stack: Python/C++, Terraform, Prometheus, Kubernetes, Cloud Computing The core function of the role is to monitor and maintain uptime for trading systems, pricing engines and risk management tools. The client can offer market leading …

Senior Product Manager (SaaS)

England, United Kingdom
LinuxRecruit
rally teams around a plan. A strong preference for user experience and comfort with technical details. Technical experience with containerised platforms using Kubernetes, databases, and observability tools such as Prometheus and OpenTelemetry. This is a chance to shape the future of observability and security, build products people count on, and do it all with curiosity and creativity.
Employment Type: Permanent
Salary: GBP Annual

Junior Site Reliability Engineer

Manchester, England, United Kingdom
Hybrid / WFH Options
Lorien
modern technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and …

Junior Site Reliability Engineer

Manchester, Lancashire, England, United Kingdom
Hybrid / WFH Options
Lorien
modern technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and …
Employment Type: Full-Time
Salary: £35,000 - £45,000 per annum

Prometheus
10th Percentile: £52,500
25th Percentile: £60,250
Median: £72,500
75th Percentile: £83,750
90th Percentile: £120,000