751 to 775 of 1,292 Observability Jobs

Engineering Manager - Platform Reliability

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Lakebase Platform Reliability team’s footprint spans multiple stacks, systems, and stakeholders. They include AI‐powered tooling and workflows for customer management, real‐time observability during incidents, monitoring and auditing systems that underpin compliance requirements, and customer‐facing operational APIs and maintenance workflows. You’ll contribute to the wider platform ...

Regional Vice President

Hiring Organisation
Jobleads-UK
Location
United Kingdom
advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud‐based solutions for search, security, and observability help organizations deliver on the promise of AI. What is The Role: Elastic is searching for a Regional Vice President, Commercial Team, to lead ...

Head of Data & AI Platforms & Engineering

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
against SLAs and OLAs across data platforms and services. Ensure compliance with security, privacy and regulatory requirements across all data platforms. Implement frameworks for data quality, observability, lineage and metadata in partnership with governance teams. Oversee operational monitoring, incident management and continuous optimisation of platform services. Build and lead high‐performing data engineering ...

Enterprise Solutions Architect

Hiring Organisation
Jobleads-UK
Location
Oxford, England, United Kingdom
Success Factors Strong integration design capability: Domain Driven Design, event‐based integration, API design principles, resilience patterns, operational considerations (SLAs, observability, incident readiness) Excellent stakeholder management and communication: can influence at exec level, simplify complex trade‐offs, and align diverse teams behind common patterns and outcomes Desirable attributes: TOGAF certification ...

Solution Architecture Manager

Hiring Organisation
Vaco LLC
Location
Traverse City, Michigan, United States
Employment Type
Permanent
Salary
USD 170,000 Annual
like policy, claims, underwriting, forms/doc mgmt. Technical breadth: C#, TypeScript, Terraform, REST/RPC/GraphQL, messaging, OAuth2/OIDC/SAML, observability, cloud/container deployments. Responsibilities Set vision and quarterly plans; communicate roadmap and outcomes. Govern architecture quality; lead design reviews and maintain solution artifacts. Hire ...

Regional Vice President

Hiring Organisation
Jobleads-UK
Location
United Kingdom
advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud‐based solutions for search, security, and observability help organizations deliver on the promise of AI. What Is The Role Elastic, the Search AI company, is looking for a high‐energy Regional Vice ...

Regional Vice President

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
advantage of all structured and unstructured data — securing and protecting private information more effectively — Elastic’s complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI. What is The Role: Elastic, the Search AI company, is looking for a high-energy Regional Vice ...

Staff AI Engineer - Design Automation

Hiring Organisation
Sandisk
Location
Milpitas, California, United States
Employment Type
Permanent
Salary
USD Annual
scale. Essential Duties and Responsibilities: - Design and build AI agents that automate multi-step semiconductor workflows - Develop the Agent Platform: orchestration, tool integration, observability, and evaluation infrastructure - Work closely with design and verification engineers to deeply understand pain points and identify high-impact automation opportunities - Create techniques for grounding agents ...

Platform Engineer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent
Salary
£54000 - £60900/annum
large-scale enterprise environment. An exciting opportunity working on a greenfield Kubernetes platform built using modern engineering practices across Azure, GitOps, service mesh, observability and event-driven architecture. The Role You will be responsible for building, operating and improving a shared Kubernetes platform used by application, AI and integration engineering … teams. Hands-on role covering infrastructure as code, Kubernetes operations, CI/CD, networking, observability and platform reliability. Working closely with architects and engineering teams shaping the future of the platform while helping maintain high standards across automation, security, scalability and operational excellence. Key Responsibilities Build and operate Azure Kubernetes ...

Platform Engineer

Hiring Organisation
itecopeople
Location
London, England, United Kingdom
enterprise environment. This is an exciting opportunity to work on a greenfield Kubernetes platform built using modern engineering practices across Azure, GitOps, service mesh, observability and event-driven architecture. The Role As Platform Engineer, you will be responsible for building, operating and improving a shared Kubernetes platform used by application … integration engineering teams. This is a hands-on role covering infrastructure as code, Kubernetes operations, CI/CD, networking, observability and platform reliability. You'll work closely with architects and engineering teams to shape the future of the platform while helping maintain high standards across automation, security, scalability and operational ...

Lead DevOps Engineer

Hiring Organisation
Venquis
Location
Essex, England, United Kingdom
engineers Drive infrastructure automation and Infrastructure as Code Manage and improve CI/CD pipelines Own cloud infrastructure and platform reliability Improve scalability, monitoring, observability and security Work closely with software engineering and architecture teams Influence DevOps strategy and technical direction Technical Environment AWS and/or Azure Kubernetes Terraform … Docker CI/CD pipelines Linux Monitoring & observability tooling Infrastructure as Code Experience Required Proven experience in a DevOps, Platform Engineering or SRE leadership role Strong cloud infrastructure background Experience leading or mentoring technical teams Excellent automation and CI/CD knowledge Strong Kubernetes and Terraform experience preferred Ability ...

Senior Platform Engineer

Hiring Organisation
Accenture
Location
Manchester Area, United Kingdom
code generation, testing, documentation, and analysis, while understanding model limitations, protecting client data, and improving delivery quality and speed through pragmatic automation SRE & Observability You’ll bring a reliability mindset to delivery, designing services that are operable by default and measured through meaningful SLIs/SLOs. You’ll help teams … implement pragmatic observability—logging, metrics, and distributed tracing—with actionable alerting, and you’ll contribute to (or lead) incident response and post-incident reviews that drive learning and measurable improvements. Job qualifications We are looking for experience in the following skills: Strong understanding of cloud platforms (preferably AWS/Azure ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Manchester, England, United Kingdom
expect you to be involved as well in the operation of AWS-based Kubernetes platforms (EKS) while contributing to monitoring, alerting, and observability implementations using tools like Grafana and Prometheus. You’ll also assist in incident management, troubleshooting, and root cause analysis. In addition, you’ll be: Participating … operations. You must have experience working with AWS and Kubernetes (EKS) in a production or pre-production environment, along with familiarity with monitoring and observability tools such as Grafana and Prometheus. To succeed in this role, you should also have a good understanding of CI/CD pipelines ...

AI Platform/ DevOps Engineer

Hiring Organisation
The Portfolio Group
Location
City of London, London, Castle Baynard, United Kingdom
Employment Type
Permanent
Salary
£70000 - £80000/annum + Benefits
Bedrock Knowledge Bases) and embedding pipelines Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry - covering latency, throughput, cost, and system health Apply secure-by-design principles including IAM, encryption, network controls … Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage Python proficiency for scripting, automation, and deploying production services Solid ...

Go Full Stack Developer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
event-driven services Contribute to CI/CD pipelines and cloud-native deployments Review code and champion engineering best practices Improve application performance, observability and reliability Collaborate within Agile delivery teams across multiple projects Support technical decision-making and continuous improvement Skills & Experience We are looking for candidates with strong … reviews, testing and engineering governance Experience with any of the following would be highly advantageous: Microsoft Azure Python GitOps tooling (Argo CD/Flux) Observability tooling (Prometheus, Grafana, OpenTelemetry) AI/LLM-enabled applications Event-driven architectures and messaging platforms What's on Offer Opportunity to work on cutting-edge ...

Site Reliability Engineer

Hiring Organisation
Pertemps London
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 50,000 Annual
Jenkins, GitLab CI) Develop and maintain Terraform modules for infrastructure-as-code Build automation tools (CLI tools, scripts, GitHub Apps, self-service tooling) Own observability: dashboards, alerts, monitoring, and runbooks Continuously improve platform processes and reduce operational toil What We're Looking For Essential Skills & Experience 2-3 years … GitHub Actions, GitLab CI, Jenkins) Ability to write production-quality code in Python or Bash Solid networking fundamentals (DNS, load balancers, CDNs) Experience with observability tools (New Relic, Datadog, Prometheus, Grafana) Comfortable participating in on-call rotations Experience using AI tools (e.g. ChatGPT, Copilot, Cursor) to enhance productivity Desirable Go, Ansible ...

AKS DevOps Engineer - Azure Kubernetes

Hiring Organisation
Reed
Location
London Gatwick Airport, Gatwick, West Sussex, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 per annum, Inc benefits
/CD pipelines using Azure DevOps with YAML. Implement and maintain secure networking patterns and apply cloud security best practices. Create and maintain platform observability using Azure Monitor, Analytics, and Application Insights. Collaborate with engineering teams to ensure service reliability on the platform. Promote best practice in cloud engineering … private endpoints, load balancing, etc. Scripting proficiency in Bash, PowerShell, or Python. Linux operating system knowledge and troubleshooting capability. Experience implementing monitoring, logging, and observability solutions in Azure. Ability to communicate platform issues like risk, platform health, cost etc to non-technical audiences. Desirable Skills: Experience contributing to architecture ...

Principal Artificial Intelligence (AI) Platform Engineer/Architect

Hiring Organisation
WTW
Location
Greater London, United Kingdom
Employment Type
Full Time
engagement—building credibility and driving adoption across the organization Provide escalation pathways for architecture questions and unblock teams on complex integration challenges Implement monitoring, observability, and governance systems that provide transparency without creating bottlenecks Collaborate with security, compliance, and data teams to embed safety guardrails into platform capabilities Participate … experience) Proven ability to design systems that abstract complexity and enable teams to self-serve at scale Strong software engineering fundamentals (system design, testing, observability, operational excellence, SDLC practices) Experience building or maintaining developer-facing platforms, SDKs, or internal tools Comfortable articulating technical architecture, vision, and strategy to both technical ...

Platform Engineer

Hiring Organisation
Accenture
Location
Glasgow, Scotland, United Kingdom
code generation, testing, documentation, and analysis, while understanding model limitations, protecting client data, and improving delivery quality and speed through pragmatic automation SRE & Observability You’ll bring a reliability mindset to delivery, designing services that are operable by default and measured through meaningful SLIs/SLOs. You’ll help teams … implement pragmatic observability—logging, metrics, and distributed tracing—with actionable alerting, and you’ll contribute to (or lead) incident response and post-incident reviews that drive learning and measurable improvements. We are looking for experience in the following skills: Strong experience with the AWS cloud platform and core services. Hands ...

ML Infrastructure Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
versioning, reproducibility, experimentation, feature management and release management Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency Build or optimise … pipelines, infrastructure-as-code and workflow orchestration Experience with tools such as Airflow or similar platform and orchestration technologies Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving Strong appreciation of reliability ...

Senior Software Engineer II - Data Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ensure technical consistency. Design, develop, and maintain generative AI services and reusable components using Python. Define and promote best practices in engineering, including scalability, observability, testing, and CI/CD. Contribute to system designs spanning multiple services and modules, aligning with architectural best practices. Collaborate with product, platform, and research … work collaboratively across functions in an Agile or Kanban environment. Nice to have: Experience operationalizing LLMs or building an internal AI platform. Familiarity with observability practices (metrics, logging, alerts). Exposure to knowledge graphs or semantic search systems. Join our team and contribute to a culture of innovation, collaboration, and excellence. ...

Lead DevOps Engineer

Hiring Organisation
Vaco LLC
Location
Dublin, Ohio, United States
Employment Type
Permanent
Salary
USD Annual
secure, and cost-effective infrastructure across production, development, and test environments. This is a deeply hands-on position responsible for executing and improving deployments, observability, and core operational practices to reduce risk caused by opaque processes, undocumented knowledge, and single points of failure. The Lead DevOps Engineer transforms deployment … application architecture, infrastructure, and deployment workflows. Proven ability to troubleshoot complex issues across infrastructure, CI/CD pipelines, and runtime environments. Solid understanding of observability, including metrics, logging, alerting, and root-cause analysis. Strong security mindset, including secrets management, access controls, encryption, patching, and vulnerability management. Deep understanding of network ...

Splunk Developer

Hiring Organisation
Infoplus Technologies UK Ltd
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Contract
Contract Rate
From £350 to £400 per day
application teams to deliver scalable monitoring, service health, and analytics solutions. Key Responsibilities Technical Leadership Act as Technical Lead for Splunk implementations across monitoring, observability, and service intelligence use cases. Own end-to-end Splunk solution design including data onboarding, data models, dashboards, alerts, and ITSI objects. Review and govern … Splunk Dashboard Studio/Classic dashboards. Design meaningful alerts using correlation searches and risk-based alerting principles. Translate operational and business requirements into actionable insights. Observability & Production Support Integrate Splunk with enterprise observability tools (APM, infrastructure monitoring, cloud platforms). Support production incidents using Splunk, driving root cause analysis and post ...

Site Reliability Engineer

Hiring Organisation
Fuel Recruitment
Location
Farnborough, Hampshire, United Kingdom
Employment Type
Permanent
Salary
GBP 60,000 Annual
Site Reliability Engineer to help design, deploy and optimise secure, resilient platforms across internal and customer environments. The role is focused on automation, observability and taking new solutions from proof-of-concept through to full p ...

Database Reliability Engineer | Postgres, Kubernetes & Cloud

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading UK fintech company seeks data professionals to modernize its database systems. Roles involve enhancing PostgreSQL and Kubernetes setups, establishing observability through monitoring tools, and ensuring data integrity across multi-cloud frameworks. Working hybrid, the ideal candidates are innovative and collaborative, with strong backgrounds in backend development and infrastructure provisioning. ...