676 to 700 of 1,235 Permanent Observability Jobs

Platform Engineer - Kubernetes / Azure

Hiring Organisation
Keystone Recruitment Partners Ltd
Location
United Kingdom
Employment Type
Permanent
Salary
GBP 450 - 550 Daily
operation of enterprise cloud-native services. The role will focus on Kubernetes-based platforms running on Microsoft Azure, including service mesh, application deployment, observability, security, and production support. Key responsibilities: Build, configure, and support Kubernetes environments on Azure, including AKS. Deploy and manage Spring Boot applications in containerized environments. Work … Ability to work independently and communicate effectively with both technical and non-technical stakeholders. Desirable experience: Terraform, Helm, GitOps, Azure DevOps, or GitHub Actions. Observability tools such as Prometheus, Grafana, OpenTelemetry, or Azure Monitor. Financial services or other regulated enterprise environments. ...

Platform Engineer - Kubernetes / Azure

Hiring Organisation
Keystone Recruitment Partners Ltd
Location
Nationwide, United Kingdom
Employment Type
Permanent, Contract
Salary
£450 - £550/day
operation of enterprise cloud-native services. The role will focus on Kubernetes-based platforms running on Microsoft Azure, including service mesh, application deployment, observability, security, and production support. Key responsibilities: Build, configure, and support Kubernetes environments on Azure, including AKS. Deploy and manage Spring Boot applications in containerized environments. Work … Ability to work independently and communicate effectively with both technical and non-technical stakeholders. Desirable experience: Terraform, Helm, GitOps, Azure DevOps, or GitHub Actions. Observability tools such as Prometheus, Grafana, OpenTelemetry, or Azure Monitor. Financial services or other regulated enterprise environments. ...

Staff Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
find and analyze reliability problems across our stack, then design and implement software and systems to create step-function improvements. You will design robust observability solutions, lead incident response, automate operational tasks, and continuously improve our infrastructure's reliability, all while mentoring and educating the broader engineering team to make … reliability a core value at Replit. You Will Architect and Implement Observability: Design, build, and lead the implementation of comprehensive monitoring, logging, and tracing solutions. Create dashboards and metrics that provide real-time visibility into system health and performance, enabling proactive issue detection. Define and Drive Reliability Standards: Work with ...

Integration Developer FTC

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
Build connectors, event-processing services, and data pipelines Design scalable integration patterns, schemas, and event flows Develop CDC pipelines and resilient messaging solutions Improve observability through logging, metrics, and tracing Deploy containerised services using Docker and Kubernetes Contribute to architecture, code reviews, and engineering standards Collaborate with developers, data engineers … design Agile development experience Strong communication and collaboration skills Desirable Skills Go and/or Python CDC pipeline development Azure cloud experience Observability tooling (Prometheus, Grafana, OpenTelemetry) Experience within regulated environments What's on Offer Hybrid working - 2 days per week in London Salary up to £60,900 Generous pension ...

Senior Azure Platform Engineer

Hiring Organisation
Rebel Recruitment
Location
Salford, Greater Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£90,000
about being part of that journey. You'll be working in an environment where tools like GitHub Copilot, OpenAI models, Claude, Gemini, AI-powered observability platforms, intelligent deployment workflows, and internal AI tooling are actively being explored and introduced to improve how engineering teams work day to day. This … designing and improving Azure infrastructure, evolving Kubernetes platforms within AKS, building reusable Infrastructure-as-Code patterns using Terraform and Crossplane, and helping improve reliability, observability, and security across the wider platform estate. You'll also spend time improving developer tooling and CI/CD processes, helping engineering teams deploy faster ...

Google Cloud Platform Infrastructure Engineer

Hiring Organisation
Adroit People Limited (UK)
Location
City of London, London, United Kingdom
reduce manual effort. Supporting incident response and learning good operational practices. Participating in agile ceremonies and contributing to continuous improvement. Building foundational knowledge in observability, security and DevOps culture. Essential skills & experience: 1–3 years’ experience in DevOps, SRE or cloud engineering. Basic understanding of cloud concepts and core … ns. Exposure to IaC tools, preferably Terraform. Enthusiasm, initiative and a desire to grow technically. Desirable skills: Awareness of Kubernetes/containerisation. Understanding of observability tooling (Prometheus, Dynatrace, etc.). Awareness of agile ways of working. ...

Senior DevOps Engineer

Hiring Organisation
Halian Technology Limited
Location
Reading, Berkshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£95,000
reliability, and availability Implement self-service tooling to empower development teams Drive DevOps best practices across the digital product lifecycle Develop and enhance monitoring, observability, and incident response processes Support global engineering teams delivering high-traffic platforms Key Requirements Proven experience supporting digital product delivery in a DevOps or platform … with Infrastructure as Code (Terraform, Ansible, Puppet or similar) Hands-on experience with Kubernetes, Docker, and cloud platforms (AWS preferred) Experience with monitoring/observability tools (Prometheus, Grafana, ELK, APM tools) Solid understanding of system performance, scalability, and resilience Strong collaboration and communication skills within cross-functional product teams Desirable ...

Principal Software Development Engineer

Hiring Organisation
Jobleads-UK
Location
Glasgow, Scotland, United Kingdom
Code, automation frameworks and database‐as‐code practices using tools such as Redgate Flyway. Take ownership of critical customer systems, ensuring operational resilience, observability, performance optimisation and rapid incident response. Collaborate closely with Product, Delivery, Operations and Commercial teams to shape technical solutions, delivery plans and strategic outcomes. Promote secure … Connect or Genesys Cloud. Proven ability to design and deliver secure, scalable and resilient cloud‐native solutions within complex enterprise environments. Strong understanding of observability, operational support, reliability engineering and end‐to‐end ownership practices. Knowledge of regulated financial services environments, including UK GDPR and FCA Consumer Duty requirements. Excellent ...

Senior Software Development Engineer in Test

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
from traditional testing approaches towards a modern, engineering‐led quality strategy: unit testing, contract testing, component testing, integration testing, E2E flows, synthetics, and strong observability across our microservices. We’re looking for a Senior SDET who is hands‐on, highly technical, and passionate about setting teams up for long‐term … teams adopt best practices confidently. Collaborate with Engineering and DevOps to evolve CI/CD pipelines and embed automation earlier in the lifecycle. Improve observability around testing and reliability, integrating logs, traces, metrics, synthetics, and alerts to increase confidence in releases. Promote good testing principles and high‐quality engineering practices ...

AI Engineer

Hiring Organisation
Morgan McKinley
Location
Yorkshire and Humberside, England, United Kingdom
Employment Type
Full-Time
Salary
Salary negotiable
with Generative and Agentic AI patterns, including LLM integration, RAG architectures, prompt-driven workflows, and AI service orchestration. Integrate AI capabilities with enterprise systems, observability tooling, and security frameworks. Design and maintain CI/CD pipelines within cloud-native engineering environments. Support benchmarking, evaluation, experimentation, and cost optimisation … Skills Experience with Kong API Gateway, Kong Mesh, and Flux CD. RESTful API and microservices development. Terraform and GitOps workflows. Exposure to prompt evaluation, observability, or AI red-teaming tools. SQL and NoSQL database experience. Understanding of vector search technologies and Retrieval-Augmented Generation (RAG) patterns. About You A proactive ...

Lead AI Engineer

Hiring Organisation
Morgan McKinley
Location
Yorkshire and Humberside, England, United Kingdom
Employment Type
Full-Time
Salary
Salary negotiable
with Generative and Agentic AI patterns, including LLM integration, RAG architectures, prompt-driven workflows, and AI service orchestration. Integrate AI capabilities with enterprise systems, observability tooling, and security frameworks. Design and maintain CI/CD pipelines within cloud-native engineering environments. Support benchmarking, evaluation, experimentation, and cost optimisation … Skills Experience with Kong API Gateway, Kong Mesh, and Flux CD. RESTful API and microservices development. Terraform and GitOps workflows. Exposure to prompt evaluation, observability, or AI red-teaming tools. SQL and NoSQL database experience. Understanding of vector search technologies and Retrieval-Augmented Generation (RAG) patterns. About You A proactive ...

GenAI Python Developer

Hiring Organisation
EMBS Technology
Location
London Area, United Kingdom
services with cloud-native architectures across Azure and AWS. Build and maintain CI/CD pipelines aligned with engineering and security standards. Implement observability, monitoring and performance optimisation across GenAI services. Support benchmarking, experimentation and evaluation of LLM performance, accuracy and cost. Collaborate with Architects, Platform Engineers, Product Teams … Retrieval-Augmented Generation (RAG) architectures. Vector databases and semantic search technologies. Kong API Gateway and Kong Mesh. FluxCD and GitOps workflows. Prompt evaluation and observability tools such as Promptfoo and Arize. SQL databases including PostgreSQL and MySQL. NoSQL database technologies. Distributed systems and cloud-native application development. What Success Looks ...

SR. SOFTWARE ENGINEER

Hiring Organisation
Widenet Consulting
Location
Bellevue, Washington, United States
Employment Type
Permanent
Salary
USD 7,000 Hourly
features and systems. Influence architectural decisions to ensure scalability, security, performance, and alignment with enterprise standards. Apply best practices related to performance tuning, observability, disaster recovery, and capacity planning. Innovation & Research Leverage AI code assistants to enhance developer velocity and code quality. Explore emerging technologies, frameworks, and methodologies to drive … work effectively in a collaborative, team-based environment. Preferred Qualifications Bachelor's degree in Computer Science, Information Systems, or equivalent experience. Experience with observability platforms such as Datadog. Familiarity with modern front-end concepts and frameworks (helpful but not required). Experience leading engineering initiatives or cross-team projects. Contributions ...

Senior Developer

Hiring Organisation
Addition
Location
Watford, Hertfordshire, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 per annum
Doing: Designing, deploying and managing automation and monitoring platforms that support large-scale applications and services Building and maintaining monitoring, alerting and observability tooling across the platform Creating dashboards that translate complex technical data into meaningful insights for stakeholders Developing automation to integrate new systems using existing frameworks Managing … Docker) Strong Python development skills, including scripting and Lambda functions Experience building and managing CI/CD pipelines, ideally with GitHub Actions Monitoring and observability tooling such as AppDynamics, Grafana, InfluxDB, Graphite, Sensu or similar Experience working with serverless architectures (Lambda, API Gateway, DynamoDB, EventBridge) Solid understanding of Linux/ ...

DevOps Engineer

Hiring Organisation
WTW
Location
Surrey, United Kingdom
Employment Type
Full Time
Managed Identities, Azure networking and Microsoft Entra ID. • Integrate and support security tooling and quality gates, including Mend, Snyk, Invicti, Wiz and GitLeaks. • Improve observability and feedback across build, deployment and environment health using tools such as Datadog, Azure Monitor and Log Analytics. • Help development teams diagnose delivery, deployment … troubleshooting. • Experience embedding security, quality and compliance checks into delivery pipelines, including vulnerability scanning, container scanning, secrets scanning and release evidence. • Good understanding of observability practices, including logs, metrics, dashboards, alerts and environment health checks. • Strong troubleshooting skills across pipelines, deployment automation, Kubernetes workloads, cloud configuration and environment issues. • Ability ...

Principal Machine Learning Engineer

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
evolve the ML Platform, ensuring it supports: Reusable and scalable deployment patterns CI/CD for machine learning Full model lifecycle management Monitoring, observability, and alerting Secure and compliant operation Shape platform standards and interfaces that enable consistent ML delivery across squads and value streams Lead technical spikes and proof … fundamentals (OOP, testing, design patterns). Deep experience building, deploying, and operating production ML systems, including: online and batch model serving, monitoring, alerting, and observability, retraining and lifecycle management. Strong understanding of core data science concepts, sufficient to: review and challenge modelling approaches, ensure models are production‐ready and correctly ...

DevOps Engineer

Hiring Organisation
WTW
Location
Surrey, United Kingdom
Employment Type
Full Time
infrastructure. Automate environment provisioning across development and production. Manage backend state, pipelines, and state-change detection integrations. Platform Engineering & SRE Own and improve reliability, observability, and performance of the platform. Implement SLOs, alerting, dashboards, and auto remediation where possible. Troubleshoot cluster level, networking, and workload deployment issues. Lead root cause … endpoints, Certificate/Secret management etc Strong debugging and operational experience (SRE mindset). Solid experience of DevSecOps architecture, processes & tooling Solid understanding of Observability Process & Tooling Logging, metrics, traces, dashboards Other highly desirable, but not essential skills are: Experience with: GitOps - ArgoCD or GitOps workflows Zero downtime deployments (blue ...

Principal Software Development Engineer

Hiring Organisation
Jobleads-UK
Location
Manchester, England, United Kingdom
/CD pipelines, Infrastructure as Code, automation frameworks, and database-as-code practices using Redgate Flyway. Take ownership of critical customer systems, ensuring operational resilience, observability, performance optimisation, and rapid incident response. Collaborate closely with Product, Delivery, Operations, and Commercial teams to shape technical solutions, delivery plans, and strategic outcomes. Promote secure … Connect or Genesys Cloud. Proven ability to design and deliver secure, scalable, and resilient cloud-native solutions within complex enterprise environments. Strong understanding of observability, operational support, reliability engineering, and end-to-end ownership practices. Knowledge of regulated financial services environments, including UK GDPR and FCA Consumer Duty requirements. Excellent communication and stakeholder management ...

Infrastructure Engineer - DevOps, Palo Alto

Hiring Organisation
HCLTech
Location
Manchester Area, United Kingdom
HCLTech is a global technology company, spread across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services and products. We work with clients ...

DevOps Platform Engineer

Hiring Organisation
hireful
Location
Manchester / Work from home, Greater Manchester, United Kingdom
Employment Type
Permanent
Salary
£75000 - £85000/annum £80,000 - £85,000 + 10% Bonus + Exte
We are recruiting founding Platform Engineers on behalf of a fast-growing enterprise level (global, 500+ staff) software business with a strong engineering culture and a genuine commitment to doing things the right way. They ...

DevOps Platform Engineer

Hiring Organisation
hireful
Location
London, United Kingdom
Employment Type
Permanent
Salary
£75000 - £85000/annum £80,000 - £85,000 + 10% Bonus + Exte
We are recruiting founding Platform Engineers on behalf of a fast-growing enterprise level (global, 500+ staff) software business with a strong engineering culture and a genuine commitment to doing things the right way. They ...

Infrastructure Engineer - DevOps, SASE

Hiring Organisation
HCLTech
Location
Leeds, England, United Kingdom
HCLTech is a global technology company, spread across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services and products. We work with clients ...

Platform Engineer

Hiring Organisation
Infinity Quest
Location
City of London, London, United Kingdom
Platform (Observability) Engineer: Hands-on experience in deploying and managing OTel collectors, instrumentation, telemetry pipelines, dashboards, alerting, and monitoring platform integrations. Strong understanding of cloud platforms, automation/IaC, log-metric-trace correlation, and operational monitoring best practices. ...

Platform Engineer

Hiring Organisation
N Consulting Global
Location
London Area, United Kingdom
Role: Platform Engineer. Duration: Full-time. Location: London, UK (Hybrid, 4 days in office mandatory). Platform (Observability) Engineer: Hands-on experience in deploying and managing OTel collectors, instrumentation, telemetry pipelines, dashboards, alerting, and monitoring platform integrations. Strong understanding of cloud platforms, automation/IaC, log-metric-trace correlation, and operational monitoring best ...

Cloud SRE: Resilient, Scalable Infra & Automation

Hiring Organisation
Jobleads-UK
Location
Glasgow, Scotland, United Kingdom
Scotland. This role involves ensuring the reliability, availability, and performance of our healthcare platforms that support millions worldwide. You will work to improve system observability, automate processes, and lead initiatives to enhance platform resilience. The successful candidate will have at least 3 years of related experience, a passion for operational ...