26 to 50 of 51 Prometheus Jobs in London

Telemetry and Observability Engineer

Hiring Organisation
Oscar Associates (UK) Limited
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
£400 - £500 per day
alerting, reliability engineering, and embedding observability across complex distributed systems and Kubernetes environments. Key experience needed: * Observability/SRE/Platform Engineering background * OpenTelemetry, Prometheus, Grafana, Splunk, Elastic, Loki, or Jaeger * Kubernetes, microservices, and cloud-native platforms * Python, Go, or Java * Terraform, Helm, and IaC * SLIs, SLOs, alerting, and reliability ...

Go Full Stack Developer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
engineering governance Experience with any of the following would be highly advantageous: Microsoft Azure Python GitOps tooling (Argo CD/Flux) Observability tooling (Prometheus, Grafana, OpenTelemetry) AI/LLM-enabled applications Event-driven architectures and messaging platforms What's on Offer Opportunity to work on cutting-edge AI and cloud ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Kubernetes. Familiarity with Linux operating systems and common commands, with scripting languages, Shell, Python, or Go. Knowledge of mainstream monitoring tools such as Prometheus, Grafana, and Zabbix. Experience in Java microservices architecture development or operations, with expertise in Java memory tuning and performance optimization. Experience with common middleware, including ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
Docker and Kubernetes. Familiarity with Linux operating systems and common commands, with scripting languages, Shell, Python, or Go. Knowledge of mainstream monitoring tools such as Prometheus, Grafana, and Zabbix. Experience in Java microservices architecture development or operations, with expertise in Java memory tuning and performance optimization. Experience with common middleware, including but not limited ...

Senior Software Engineer, Unified Platform - Reference Data

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Python experience Ruby experience Big data technologies: Spark, Trino, Kafka Financial Markets experience SQL: Postgres, Oracle Cloud‐native deployments: AWS, Docker, Kubernetes Observability: Splunk, Prometheus, Grafana ...

Staff Site Reliability Engineer - Site Experience

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Have Experience operating systems at internet scale traffic volumes. Experience with Kubernetes, containers, cloud infrastructure, and modern deployment platforms. Familiarity with technologies such as Prometheus, Grafana, OpenTelemetry, Envoy, Kafka, ClickHouse, Cassandra, Redis, or similar distributed infrastructure technologies. Experience with CDN optimization, edge reliability, traffic engineering, or global infrastructure. Contributions ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Terraform, Puppet, Chef, Ansible. You have a good understanding of one or more of the following areas: Database Administration, Networking, Observability Tools (such as Prometheus, Jaeger) or automation infrastructure. You have solid experience working with either GCP or AWS. Benefits: Highly competitive salary Pension plan (match up to 5%) Life ...

Principal Machine Learning Infrastructure Engineer London, United Kingdom

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
consume data Experience building model serving infrastructure with latency and throughput requirements Familiarity with experiment tracking tools (Weights & Biases, MLflow) and observability stacks (Prometheus, Grafana) What we offer Equity options – share in our success and growth. 10% employer pension contribution – invest in your future. Free office lunches – great food ...

Senior Software Engineer

Hiring Organisation
Permax Recruitment Limited
Location
West London, London, United Kingdom
Employment Type
Permanent, Work From Home
GitHub Actions, GitLab CI, Jenkins, or similar) Solid understanding of containerization and orchestration (Docker, Kubernetes, ECS) Experience with monitoring and observability tools (CloudWatch, Datadog, Prometheus, or similar) Proficiency in bash Experience supporting development teams with infrastructure and deployment needs Knowledge of microservices architecture and serverless patterns Leadership Experience working … GitHub Actions, GitLab CI, Jenkins, or similar) Solid understanding of containerization and orchestration (Docker, Kubernetes, ECS) Experience with monitoring and observability tools (CloudWatch, Datadog, Prometheus, or similar) Hands-on experience managing Snowflake: warehouses, RBAC, cost controls, and query performance Proficiency in bash Experience supporting development teams with infrastructure and deployment ...

Senior Lead Software Engineering - AI/ML Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
incident resolution Experienced in observability, including white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk Strong understanding of SLI/SLO/SLA and Error Budgets Proficient in Python or PySpark for AI/ML modeling Able ...

Site Reliability Engineer, iCloud

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
platforms with Splunk, Grafana, Prometheus. Demonstrable fluency in at least one of the following languages: Java, Python, or Go. Experience with Kubernetes, Nginx, Envoy, Prometheus, and/or Docker. Preferred Qualifications Understanding of standard networking protocols and components such as: HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI Model ...

Senior Lead Software Engineering - AI/ML Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
incident resolution. Experienced in observability, including white and black box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk. Strong understanding of SLI/SLO/SLA and Error Budgets. Proficient in Python or PySpark for AI/ML modeling. Able ...

Site Reliability Engineer - SRE

Hiring Organisation
Sanderson Recruitment
Location
City of London, London, United Kingdom
Employment Type
Permanent
root cause analysis programming experience Kubernetes and Docker Deploy and release services experience Experience with Greenfield projects ideally 6+ years relevant experience Grafana/Prometheus ideal Strong communication skills with the ability to proactively engage with a wide range of stakeholders If this sounds of interest to you, please ring ...

Site Reliability Engineer - SRE

Hiring Organisation
Sanderson
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £105,000 per annum
root cause analysis programming experience Kubernetes and Docker Deploy and release services experience Experience with Greenfield projects ideally 6+ years relevant experience Grafana/Prometheus ideal Strong communication skills with the ability to proactively engage with a wide range of stakeholders If this sounds of interest to you, please ring ...

Database Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Distributed Systems enthusiast: enjoy the challenge of multi‐tenant, multi‐region, multi‐cloud scenarios with rigorous data integrity. Security & Observability mindset: build deep observability (Prometheus/Grafana/OpenTelemetry/Humio) and guardrails for secure operation. Engineering via code: deliver backend services in Java with clean relational modeling and performant ...

Lead Engineer (Routing Squad)

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
languages: Python, Go Tech infrastructure: AWS, CDK TypeScript, Lambda, SQS, EventBridge, RDS, DynamoDB Data tooling: GCP, BigQuery, Looker, Looker Studio Observability: Loki, Tempo, Grafana, Prometheus Event-driven architecture and domain-driven design How we reward our team Dynamic hybrid working environment with a diverse and driven team Huge opportunity ...

Senior Rust Software Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
bindings for Swift and Kotlin, and we plan to add Rust to avoid the need to have two SDK implementations. We use Grafana and Prometheus to instrument our SDK, Sentry to get detailed issues from the SDK usage, GitLab CI for running tests including e2e tests written using Playwright ...

Backend Engineer

Hiring Organisation
Revolut
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
matters: clean, maintainable code, shipped fast with TDD, DDD, and continuous integration and delivery. Our stack includes Java 17/21, GCP, Kubernetes, Grafana, Prometheus, NewRelic, PostgreSQL, Redis, Spock, jOOQ, and Flyway. Up to shape what's next in finance? Let's get in touch. What you'll be doing Building ...

Machine Learning Systems & Infrastructure Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
/CD pipelines, including self‐hosted GPU runners. Observability and reliability: Monitoring, logging, and alerting for job performance, data‐pipeline health, and cost (e.g., Prometheus/Grafana, OpenTelemetry); define SLOs and incident response for the systems you own. Security and access: Manage secrets, IAM, and network boundaries (e.g., Tailscale, cloud … storage with caching layers. Familiarity with ML workflow orchestration and experiment tracking (e.g., Kubeflow Pipelines, MLflow). Experience with monitoring and observability tooling (e.g., Prometheus/Grafana, OpenTelemetry) and CI/CD for infra and ML workflows (e.g., GitHub Actions). At SpAItial, we are committed to creating a diverse ...

Technical Architect - eSC/eDV Clearance

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
practices, CI/CD pipelines and modern automation approaches across delivery teams Define and implement observability strategies including logging, monitoring and alerting (e.g. CloudWatch, Prometheus, Grafana) Provide technical leadership and governance through design reviews, architecture boards and best practice guidance Support incident response, root cause analysis and continuous improvement across … experience Experience with multi‐account AWS environments and landing zone design (e.g. AWS Organizations, Control Tower) Exposure to observability and monitoring tooling (e.g. CloudWatch, Prometheus, Grafana, ELK stack) Experience designing or supporting container‐based application architectures beyond infrastructure (microservices, API‐driven systems) Familiarity with security tooling and identity models ...

Semantic Graph & Ontology Architect

Hiring Organisation
Adecco
Location
London, United Kingdom
Employment Type
Contract
graphs supporting workflows and audit trails. Exposure to vector retrieval and how graph context informs data re-ranking. Knowledge of observability tools like OpenTelemetry, Prometheus, and Grafana. Why Join Us? This is your opportunity to be at the forefront of data innovation in the energy sector! If you are eager ...

Software Engineer (Graduate to Experienced)

Hiring Organisation
ECM Selection (Holdings) Limited
Location
London, United Kingdom
Employment Type
Permanent
Salary
£40,000 - £100,000/annum DoE
company’s tech stack is Rust, Flutter/Dart, and Postgres – experience with these is highly beneficial. Additionally, any exposure to gRPC, Arrow, Prometheus, Grafana, or Docker would be desirable. As the sector is in aviation, any personal interest in this evidenced through flying lessons, flight simulators etc… would ...

Integration Developer FTC

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
developers, data engineers, and stakeholders Technology Stack Kafka/Redpanda Docker & Kubernetes Microsoft Azure REST APIs & webhooks CI/CD & Infrastructure as Code OpenTelemetry, Prometheus & Grafana Required Skills Strong software engineering background Experience building integration or event-driven platforms Kafka, Redpanda, or similar streaming technologies Enterprise system integrations … design Agile development experience Strong communication and collaboration skills Desirable Skills Go and/or Python CDC pipeline development Azure cloud experience Observability tooling (Prometheus, Grafana, OpenTelemetry) Experience within regulated environments What's on Offer Hybrid working - 2 days per week in London Salary up to £60,900 Generous pension ...

Solace Messaging Administrator

Hiring Organisation
Searchability (UK) Ltd
Location
City of London, London, United Kingdom
Employment Type
Permanent
Strong background supporting enterprise production environments Experience with Solace PubSub+ appliances and software brokers Strong understanding of distributed systems and WAN environments Experience with Prometheus and Grafana monitoring tools Linux/Unix administration and scripting experience Strong troubleshooting and analytical problem-solving skills Experience supporting low latency, high throughput messaging … submit (subject to required skills) your application to our client in conjunction with this vacancy only. KEY SKILLS Solace, PubSub+, Messaging Administrator, Linux, Prometheus, Grafana, WAN, Low Latency Systems, Distributed Systems, Python, Bash, Infrastructure, Production Support, Kafka, RabbitMQ, Docker, Kubernetes, AWS, Azure, Messaging Systems ...

OSS/BSS Solution Architect

Hiring Organisation
IBU CONSULTING LTD
Location
London, United Kingdom
Employment Type
Contract
data models) - Proficient in backend programming (Python, Java, C++) - Experienced in data persistence across SQL (Postgres, MySQL), NoSQL (Cassandra, MongoDB) and TSDB (TimescaleDB, InfluxDB, Prometheus TSDB) - Skilled in schema design, optimisation and performance tuning Distributed Systems & State Management - Strong understanding of distributed systems, transaction management and telemetry pipelines - Understands … cloud-native patterns (PaaS, SaaS, serverless, containers, orchestration) - Cloud platforms (AWS or GCP): commonly used IaaS/PaaS services - Proficient with observability frameworks (Prometheus, ElasticSearch, Grafana, OpenTelemetry) for metrics, logs and traces Platform Technologies & BSS Vendor Platforms - Open source workflow engines (Temporal) - Stream/batch processing (Flink, Spark) - Test automation ...