676 to 700 of 1,242 Observability Jobs

Site Reliability Engineer

Hiring Organisation
Computappoint
Location
City Of London, England, United Kingdom
where reliability genuinely isn't optional. The role blends application support, platform engineering and SRE practice. It suits someone who leans toward automation and observability over reactive firefighting. Responsibilities: Managing OpenShift and Kubernetes clusters across physical, virtual, and containerised environments; operating observability stacks (Grafana, Prometheus, Splunk) and driving proactive monitoring … call rotation. Key Requirements: Hands-on Kubernetes and/or OpenShift experience in production; scripting skills in Python, Bash, or PowerShell; familiarity with observability tooling and SRE principles; SQL and database knowledge (MySQL, Oracle, or similar); experience supporting .NET, Java, or microservices applications. It would be great ...

Site Reliability Engineer

Hiring Organisation
VIQU IT
Location
United Kingdom, Whitechapel, Greater London
Employment Type
Permanent
Salary
£40,000 - £50,000/annum
Engineer to help improve the reliability, scalability and automation of their AWS estate. This is a hands-on engineering role working across cloud infrastructure, observability, CI/CD and platform tooling, helping development teams deliver faster and more reliably. You’ll be joining a collaborative engineering environment with the opportunity … scalable AWS infrastructure. Develop and manage Infrastructure as Code using AWS CDK. Support CI/CD pipelines and deployment automation. Improve monitoring, logging and observability across distributed systems. Support incident management, root cause analysis and platform reliability improvements. Work closely with engineering and architecture teams to improve operational performance ...

Senior Backend Engineer

Hiring Organisation
SecurityHQ
Location
London, England, United Kingdom
versioned and user-friendly API contracts. Participate in architecture design, code reviews and technical discussions, contributing to overall engineering quality and standards. Quality, Testing & Observability: Build and maintain comprehensive test suites including unit, integration, contract and end-to-end testing. Ensure services are fully instrumented with logging, metrics and tracing … support observability in production. Treat testing, monitoring and CI signals as essential components of delivery. Agile Delivery & Continuous Improvement: Contribute to agile ceremonies including refinement, estimation and retrospectives. Support continuous improvement across engineering practices, ways of working and use of AI-assisted development tools. Technical Experience & Skills Essential ...

Platform Engineer - Kubernetes / Azure

Hiring Organisation
Keystone Recruitment Partners Ltd
Location
United Kingdom
Employment Type
Permanent
Salary
GBP 450 - 550 Daily
operation of enterprise cloud-native services. The role will focus on Kubernetes-based platforms running on Microsoft Azure, including service mesh, application deployment, observability, security, and production support. Key responsibilities: Build, configure, and support Kubernetes environments on Azure, including AKS. Deploy and manage Spring Boot applications in containerized environments. Work … Ability to work independently and communicate effectively with both technical and non-technical stakeholders. Desirable experience: Terraform, Helm, GitOps, Azure DevOps, or GitHub Actions. Observability tools such as Prometheus, Grafana, OpenTelemetry, or Azure Monitor. Financial services or other regulated enterprise environments. ...

Platform Engineer - Kubernetes / Azure

Hiring Organisation
Keystone Recruitment Partners Ltd
Location
Nationwide, United Kingdom
Employment Type
Permanent, Contract
Salary
£450 - £550/day
operation of enterprise cloud-native services. The role will focus on Kubernetes-based platforms running on Microsoft Azure, including service mesh, application deployment, observability, security, and production support. Key responsibilities: Build, configure, and support Kubernetes environments on Azure, including AKS. Deploy and manage Spring Boot applications in containerized environments. Work … Ability to work independently and communicate effectively with both technical and non-technical stakeholders. Desirable experience: Terraform, Helm, GitOps, Azure DevOps, or GitHub Actions. Observability tools such as Prometheus, Grafana, OpenTelemetry, or Azure Monitor. Financial services or other regulated enterprise environments. ...

Staff Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
find and analyze reliability problems across our stack, then design and implement software and systems to create step-function improvements. You will design robust observability solutions, lead incident response, automate operational tasks, and continuously improve our infrastructure's reliability, all while mentoring and educating the broader engineering team to make … reliability a core value at Replit. You Will Architect and Implement Observability: Design, build, and lead the implementation of comprehensive monitoring, logging, and tracing solutions. Create dashboards and metrics that provide real-time visibility into system health and performance, enabling proactive issue detection. Define and Drive Reliability Standards: Work with ...

Integration Developer FTC

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
Build connectors, event-processing services, and data pipelines. Design scalable integration patterns, schemas, and event flows. Develop CDC pipelines and resilient messaging solutions. Improve observability through logging, metrics, and tracing. Deploy containerised services using Docker and Kubernetes. Contribute to architecture, code reviews, and engineering standards. Collaborate with developers, data engineers … design. Agile development experience. Strong communication and collaboration skills. Desirable Skills: Go and/or Python; CDC pipeline development; Azure cloud experience; observability tooling (Prometheus, Grafana, OpenTelemetry); experience within regulated environments. What's on Offer: Hybrid working - 2 days per week in London; salary up to £60,900; generous pension ...

Senior Azure Platform Engineer

Hiring Organisation
Rebel Recruitment
Location
Salford, Greater Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£90,000
about being part of that journey. You'll be working in an environment where tools like GitHub Copilot, OpenAI models, Claude, Gemini, AI-powered observability platforms, intelligent deployment workflows, and internal AI tooling are actively being explored and introduced to improve how engineering teams work day to day. This … designing and improving Azure infrastructure, evolving Kubernetes platforms within AKS, building reusable Infrastructure-as-Code patterns using Terraform and Crossplane, and helping improve reliability, observability, and security across the wider platform estate. You'll also spend time improving developer tooling and CI/CD processes, helping engineering teams deploy faster ...

Senior DevOps Engineer

Hiring Organisation
Halian Technology Limited
Location
Reading, Berkshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£95,000
reliability, and availability. Implement self-service tooling to empower development teams. Drive DevOps best practices across the digital product lifecycle. Develop and enhance monitoring, observability, and incident response processes. Support global engineering teams delivering high-traffic platforms. Key Requirements: Proven experience supporting digital product delivery in a DevOps or platform … with Infrastructure as Code (Terraform, Ansible, Puppet or similar). Hands-on experience with Kubernetes, Docker, and cloud platforms (AWS preferred). Experience with monitoring/observability tools (Prometheus, Grafana, ELK, APM tools). Solid understanding of system performance, scalability, and resilience. Strong collaboration and communication skills within cross-functional product teams. Desirable ...

Principal Software Development Engineer

Hiring Organisation
Jobleads-UK
Location
Glasgow, Scotland, United Kingdom
Code, automation frameworks and database‐as‐code practices using tools such as Redgate Flyway. Take ownership of critical customer systems, ensuring operational resilience, observability, performance optimisation and rapid incident response. Collaborate closely with Product, Delivery, Operations and Commercial teams to shape technical solutions, delivery plans and strategic outcomes. Promote secure … Connect or Genesys Cloud. Proven ability to design and deliver secure, scalable and resilient cloud‐native solutions within complex enterprise environments. Strong understanding of observability, operational support, reliability engineering and end‐to‐end ownership practices. Knowledge of regulated financial services environments, including UK GDPR and FCA Consumer Duty requirements. Excellent ...

Senior Software Development Engineer in Test

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
from traditional testing approaches towards a modern, engineering‐led quality strategy: unit testing, contract testing, component testing, integration testing, E2E flows, synthetics, and strong observability across our microservices. We’re looking for a Senior SDET who is hands‐on, highly technical, and passionate about setting teams up for long‐term … teams adopt best practices confidently. Collaborate with Engineering and DevOps to evolve CI/CD pipelines and embed automation earlier in the lifecycle. Improve observability around testing and reliability, integrating logs, traces, metrics, synthetics, and alerts to increase confidence in releases. Promote good testing principles and high‐quality engineering practices ...

AI Engineer

Hiring Organisation
Morgan McKinley
Location
Yorkshire and Humberside, England, United Kingdom
Employment Type
Full-Time
Salary
Salary negotiable
with Generative and Agentic AI patterns, including LLM integration, RAG architectures, prompt-driven workflows, and AI service orchestration. Integrate AI capabilities with enterprise systems, observability tooling, and security frameworks. Design and maintain CI/CD pipelines within cloud-native engineering environments. Support benchmarking, evaluation, experimentation, and cost optimisation … Skills Experience with Kong API Gateway, Kong Mesh, and Flux CD. RESTful API and microservices development. Terraform and GitOps workflows. Exposure to prompt evaluation, observability, or AI red-teaming tools. SQL and NoSQL database experience. Understanding of vector search technologies and Retrieval-Augmented Generation (RAG) patterns. About You A proactive ...

Lead AI Engineer

Hiring Organisation
Morgan McKinley
Location
Yorkshire and Humberside, England, United Kingdom
Employment Type
Full-Time
Salary
Salary negotiable
with Generative and Agentic AI patterns, including LLM integration, RAG architectures, prompt-driven workflows, and AI service orchestration. Integrate AI capabilities with enterprise systems, observability tooling, and security frameworks. Design and maintain CI/CD pipelines within cloud-native engineering environments. Support benchmarking, evaluation, experimentation, and cost optimisation … Skills Experience with Kong API Gateway, Kong Mesh, and Flux CD. RESTful API and microservices development. Terraform and GitOps workflows. Exposure to prompt evaluation, observability, or AI red-teaming tools. SQL and NoSQL database experience. Understanding of vector search technologies and Retrieval-Augmented Generation (RAG) patterns. About You A proactive ...

Senior Developer

Hiring Organisation
Addition
Location
Watford, Hertfordshire, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 per annum
Doing: Designing, deploying and managing automation and monitoring platforms that support large-scale applications and services. Building and maintaining monitoring, alerting and observability tooling across the platform. Creating dashboards that translate complex technical data into meaningful insights for stakeholders. Developing automation to integrate new systems using existing frameworks. Managing … Docker). Strong Python development skills, including scripting and Lambda functions. Experience building and managing CI/CD pipelines, ideally with GitHub Actions. Monitoring and observability tooling such as AppDynamics, Grafana, InfluxDB, Graphite, Sensu or similar. Experience working with serverless architectures (Lambda, API Gateway, DynamoDB, EventBridge). Solid understanding of Linux/ ...

DevOps Engineer

Hiring Organisation
WTW
Location
Surrey, United Kingdom
Employment Type
Full Time
Managed Identities, Azure networking and Microsoft Entra ID. • Integrate and support security tooling and quality gates, including Mend, Snyk, Invicti, Wiz and GitLeaks. • Improve observability and feedback across build, deployment and environment health using tools such as Datadog, Azure Monitor and Log Analytics. • Help development teams diagnose delivery, deployment … troubleshooting. • Experience embedding security, quality and compliance checks into delivery pipelines, including vulnerability scanning, container scanning, secrets scanning and release evidence. • Good understanding of observability practices, including logs, metrics, dashboards, alerts and environment health checks. • Strong troubleshooting skills across pipelines, deployment automation, Kubernetes workloads, cloud configuration and environment issues. • Ability ...

Principal Machine Learning Engineer

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
evolve the ML Platform, ensuring it supports: reusable and scalable deployment patterns; CI/CD for machine learning; full model lifecycle management; monitoring, observability, and alerting; secure and compliant operation. Shape platform standards and interfaces that enable consistent ML delivery across squads and value streams. Lead technical spikes and proof … fundamentals (OOP, testing, design patterns). Deep experience building, deploying, and operating production ML systems, including: online and batch model serving, monitoring, alerting, and observability, retraining and lifecycle management. Strong understanding of core data science concepts, sufficient to: review and challenge modelling approaches, ensure models are production‐ready and correctly ...

DevOps Engineer

Hiring Organisation
WTW
Location
Surrey, United Kingdom
Employment Type
Full Time
infrastructure. Automate environment provisioning across development and production. Manage backend state, pipelines, and state-change detection integrations. Platform Engineering & SRE: Own and improve reliability, observability, and performance of the platform. Implement SLOs, alerting, dashboards, and auto remediation where possible. Troubleshoot cluster level, networking, and workload deployment issues. Lead root cause … endpoints, Certificate/Secret management etc. Strong debugging and operational experience (SRE mindset). Solid experience of DevSecOps architecture, processes & tooling. Solid understanding of Observability Process & Tooling: logging, metrics, traces, dashboards. Other highly desirable, but not essential skills are: Experience with: GitOps - ArgoCD or GitOps workflows; zero downtime deployments (blue ...

Principal Software Development Engineer

Hiring Organisation
Jobleads-UK
Location
Manchester, England, United Kingdom
/CD pipelines, Infrastructure as Code, automation frameworks, and database-as-code practices using Redgate Flyway. Take ownership of critical customer systems, ensuring operational resilience, observability, performance optimisation, and rapid incident response. Collaborate closely with Product, Delivery, Operations, and Commercial teams to shape technical solutions, delivery plans, and strategic outcomes. Promote secure … Connect or Genesys Cloud. Proven ability to design and deliver secure, scalable, and resilient cloud-native solutions within complex enterprise environments. Strong understanding of observability, operational support, reliability engineering, and end-to-end ownership practices. Knowledge of regulated financial services environments, including UK GDPR and FCA Consumer Duty requirements. Excellent communication and stakeholder management ...

DevOps Platform Engineer

Hiring Organisation
hireful
Location
Manchester / Work from home, Greater Manchester, United Kingdom
Employment Type
Permanent
Salary
£75000 - £85000/annum £80,000 - £85,000 + 10% Bonus + Exte
We are recruiting founding Platform Engineers on behalf of a fast-growing enterprise level (global, 500+ staff) software business with a strong engineering culture and a genuine commitment to doing things the right way. They ...

DevOps Platform Engineer

Hiring Organisation
hireful
Location
London, United Kingdom
Employment Type
Permanent
Salary
£75000 - £85000/annum £80,000 - £85,000 + 10% Bonus + Exte
We are recruiting founding Platform Engineers on behalf of a fast-growing enterprise level (global, 500+ staff) software business with a strong engineering culture and a genuine commitment to doing things the right way. They ...

Infrastructure Engineer - DevOps, SASE

Hiring Organisation
HCLTech
Location
Leeds, England, United Kingdom
HCLTech is a global technology company, spread across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services and products. We work with clients ...

Infrastructure Engineer - DevOps, Palo Alto

Hiring Organisation
HCLTech
Location
Manchester Area, United Kingdom
HCLTech is a global technology company, spread across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services and products. We work with clients ...

Senior Product Manager, FS Resilience & Market Data

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ITRS is looking for a Senior Product Manager based in London to lead in delivering critical IT observability solutions. The role involves defining product strategy and engaging with Tier 1 financial institution customers to ensure the roadmap aligns with real needs. You will work on key projects including financial trading ...

Cloud SRE: Resilient, Scalable Infra & Automation

Hiring Organisation
Jobleads-UK
Location
Glasgow, Scotland, United Kingdom
Scotland. This role involves ensuring the reliability, availability, and performance of our healthcare platforms that support millions worldwide. You will work to improve system observability, automate processes, and lead initiatives to enhance platform resilience. The successful candidate will have at least 3 years of related experience, a passion for operational ...

Product Engineer

Hiring Organisation
Radley James
Location
London Area, United Kingdom
integrations end-to-end. The role involves: SuiteQL, SuiteTalk REST/SOAP, SuiteScript OneWorld + multi-entity accounting complexity Integration architecture, sync logic, observability Customer onboarding and troubleshooting Python backend systems and distributed workflows Experience with QBO, Xero, Sage Intacct, Ramp, Bill.com, or similar accounting platforms is highly valuable. Strong ...