12 of 12 Prometheus Jobs in the City of London

DevOps / Platform Engineer

Hiring Organisation
Locai Labs
Location
City of London, London, United Kingdom
with relational databases in production environments (e.g., Postgres, MySQL), including basic performance troubleshooting, migrations, backups, and access control. Familiarity with observability tools such as Prometheus, Grafana, ELK stack, or OpenTelemetry. Experience with container orchestration platforms, particularly Kubernetes. Ability to systematically troubleshoot and debug distributed systems. Comfortable reading, modifying, and writing ...

Senior DevOps Engineer - ArgoCD/GitOps

Hiring Organisation
Tec Partners
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£75,000 - £85,000 per annum
tooling (GitHub Actions, GitLab CI, or similar). Solid Linux and scripting skills. Nice to Have: EKS at scale, Helm, multi-account AWS; observability tools (Prometheus, Grafana, CloudWatch); AWS or Kubernetes certifications ...

Site Reliability Engineer

Hiring Organisation
Revybe IT Recruitment Ltd
Location
City of London, London, England, United Kingdom
Employment Type
Full-Time
Salary
£65,000 - £75,000 per annum
platform engineering is done as the team continues to scale. Tech stack: AWS (core services: EC2, RDS, S3, IAM, etc.); Monitoring and Observability: Grafana, Prometheus; Kubernetes (building and managing production clusters); Terraform (IaC provisioning); Python, Bash or Go (scripting, automation); GitHub Actions (CI/CD pipelines). What They’re Looking ...

Principal DevOps Engineer

Hiring Organisation
TEC Partners - Technical Recruitment Specialists
Location
City of London, London, United Kingdom
CodePipeline. Strong scripting skills (e.g., Bash, Python, or PowerShell) for automation and tooling. Familiarity with monitoring and log management tools (e.g., Prometheus, Grafana, ELK stack). Knowledge of networking concepts and security best practices. Excellent problem-solving and troubleshooting skills. Strong communication and collaboration abilities, with a passion for working ...

Software Engineer

Hiring Organisation
Opus Enterprise Ltd T/A Real Recruitment
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£80,000
Server. Understanding of cloud infrastructure (preferably AWS) and containerisation (Docker, Kubernetes). Familiarity with DevOps automation (CI/CD, Helm, Terraform) and monitoring tools (Prometheus, Grafana, AWS CloudWatch). Experience in production support, debugging, and troubleshooting applications. Knowledge of agile methodologies (scrum, Jira, Kanban, Confluence). Effective communication skills ...

Systems/SRE Engineer

Hiring Organisation
Thurn Partners
Location
City of London, London, United Kingdom
more programming languages such as Python, Go, Ruby, or Perl. Strong experience with Linux system administration. Hands-on experience with observability tools like Prometheus, Grafana, Thanos, and the ELK stack. Familiarity with Kubernetes, Docker, AWS, and GCP. ...

Full Stack Software Engineer

Hiring Organisation
Firenze
Location
City of London, London, United Kingdom
Skills Interest in event-driven or distributed systems (Kafka, RabbitMQ, SQS). Exposure to DDD, CQRS, or hexagonal architectures. Experience with observability tools (OpenTelemetry, Prometheus, Grafana). Familiarity with multi-tenant SaaS, RBAC, or performance tuning (JVM, SQL). Why Join Us? Learn from experienced engineers: Work side-by-side ...

Site Reliability Engineer

Hiring Organisation
Revybe IT Recruitment Ltd
Location
City of London, London, England, United Kingdom
Employment Type
Full-Time
Salary
£65,000 - £90,000 per annum
Tech Stack: Cloud: AWS (EC2, RDS, S3, IAM, Lambda, CloudWatch); Containerisation & Orchestration: Docker, Kubernetes (EKS); Infrastructure as Code: Terraform; Configuration Management: Ansible; Monitoring & Observability: Prometheus, Grafana, ELK Stack; CI/CD: GitHub Actions; Scripting & Automation: Python, Bash or Go. What You’ll Be Doing: Designing and maintaining reliable, scalable … cloud infrastructure (AWS preferred) in production. Proven background in Kubernetes operations (EKS, Helm, or similar). Solid knowledge of monitoring, alerting, and logging (Grafana, Prometheus, ELK). Hands-on experience with Terraform and CI/CD tooling. Strong scripting or development background (Python, Go, or similar). Excellent troubleshooting skills ...

Site Reliability Engineer - SRE

Hiring Organisation
Sanderson Recruitment
Location
City of London, London, United Kingdom
Employment Type
Permanent
root cause analysis; programming experience; Kubernetes and Docker; deploy and release services experience; experience with greenfield projects; ideally 6+ years relevant experience; Grafana/Prometheus ideal; strong communication skills with the ability to proactively engage with a wide range of stakeholders. If this sounds of interest to you, please ring ...

Principal Engineer

Hiring Organisation
Motive Group
Location
City of London, London, United Kingdom
orchestration. A strong grasp of Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar). Strong observability experience using tools like Prometheus/Mimir, Loki, Tempo, Grafana, Alertmanager. Experience deploying and operating large-scale GPU clusters or HPC systems (Ideally). Working knowledge of ML infrastructure ...

Senior Backend Engineer

Hiring Organisation
M-XR
Location
City of London, London, United Kingdom
storage, retrieval, and management systems (AWS S3). Build job queue management for async ML workflows (SNS, SQS). Set up application monitoring and logging (CloudWatch, Grafana, Prometheus). Implement CI/CD for application deployment (Bitbucket Pipelines). Create API documentation and developer tools. What we are looking for: 5+ years backend development experience ...

Solace Administrator

Hiring Organisation
BGC Group
Location
City of London, London, United Kingdom
reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging … related incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration ...