626 to 650 of 1,265 Observability Jobs

MuleSoft & Salesforce Agentic Engineer

Hiring Organisation
Arbuthnot Latham & Co., Limited
Location
Wolverhampton, West Midlands, United Kingdom
Employment Type
Permanent
Gateway (LLM/MCP/A2A) to integrate agents into existing flows, data models and processes. Agent Control, Monitoring & Governance Implement control, monitoring and observability for Salesforce agents, including usage, decisioning outcomes, errors and exceptions. Ensure agent behaviour aligns with internal policies, regulatory expectations and audit requirements appropriate to asset ...

Quantitative Developer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
keep: research workflows, client-reporting drafts, commentary support, meeting prep. Write production code that other quants want to build on. Own reliability, testing, and observability for what you ship. Mentor teammates on effective AI-augmented engineering practice. Carry out other duties as assigned. What to Expect When You Join ...

Senior Software Engineer I (Android) London, UK

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Data and Product to interpret results, and iterate based on real user behaviour. Quality & Reliability: Maintain high standards for testing, crash‐free sessions and observability, and contribute to incident investigation and prevention. Qualifications Experience: 4+ years of Android engineering experience building and shipping consumer products in Kotlin. Architectural Depth: Comfortable ...

Head of Application Operations

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
driving effective RCAs. Strong Problem Management and RCA facilitation with a track record of implementing preventative actions that reduce operational risk. Proficient with observability and ITSM tooling to enable proactive monitoring, SLO/SLA definition and data‐driven operational dashboards. Strong people leadership with experience organising teams for fast execution ...

UKI Solutions Engineering Director — AI-Driven Growth

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
within the UKI region. Experience working across multiple customer segments (Enterprise, Public Sector, Service Provider, Commercial). Strong domain expertise in enterprise software (e.g., Cybersecurity, Observability, Cloud & AI, IT Operations, Application Performance Management, or Big Data). Exceptional communication and articulation skills; ability to translate complex technical ideas into clear business ...

Senior Software Engineer

Hiring Organisation
Jobleads-UK
Location
United Kingdom
INSHUR engineers work by contributing to squad, collective, or discipline-level initiatives, especially those advancing AI-augmented engineering practices across the organisation. Own Observability: You'll ensure systems stay healthy and visible by identifying monitoring gaps and independently managing escalations, building confidence that your area runs smoothly. Collaborate Across Functions ...

Site Reliability Engineer (AWS)

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
place for you. What You'll Do Own reliability – Maintain and improve our AWS infrastructure using Terraform, bringing your expertise and best practices Champion observability – Partner with developers to implement effective monitoring, logging, and tracing strategies Strengthen security – Work closely with the CISO to implement security best practices and ensure … compliance Optimise costs – Monitor cloud spend and implement FinOps best practices Maintain CI/CD pipelines – Implement and maintain reliability and observability aspects of GitHub workflows and deployment pipelines Incident response – Lead incidents, run blameless post-mortems, and drive continuous improvement Enable developers – Mentor teams on SRE and observability practices ...

Monitoring & Observability Engineer

Hiring Organisation
COMPUTACENTER (UK) LIMITED
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP Annual
Life on the team Location: UK Wide At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany ...

Network Reliability Engineer – Observability & Automation

Hiring Organisation
Jobleads-UK
Location
United Kingdom
Genesys is seeking a Network Engineer for Operations Reliability in the United Kingdom. The role focuses on maintaining the reliability, stability, and performance of enterprise network services, including LAN, WAN, and cloud connectivity. Candidates should ...

Solutions Architect - AI, Observability & Security Presales

Hiring Organisation
Jobleads-UK
Location
United Kingdom
Elasticsearch B.V. is looking for a Solutions Architect to serve as a technical authority and trusted advisor. This role involves understanding customer goals, guiding sales efforts, and building relationships. The ideal candidate should have a ...

Staff Site Reliability Engineer - Cloud

Hiring Organisation
Jobleads-UK
Location
Newcastle upon Tyne, England, United Kingdom
Newcastle: UK - London: UK - Leeds. Time type: Full time. Posted on: Posted Today. Job requisition ID: R55272. Elevate Global Operations as our Next Cloud Site Reliability Engineer (Observability Expert)! Trimble is an industrial technology company transforming the way the world works by delivering solutions that enable our customers to thrive. We create technologies … progress with connected hardware and software solutions. What Makes This Role Great: In this role, you will be the primary architect of our Observability Centre of Excellence, directly influencing the reliability and uptime of global platforms that keep world industries moving. Key Exciting Responsibilities: Lead a global "OTel First" strategy ...

DevOps Engineer

Hiring Organisation
Twinstream Limited
Location
Bristol, United Kingdom
Employment Type
Contract
Contract Rate
£500 - £600/day
container services and AMQP messaging. Working closely with feature delivery teams, you’ll help drive reliable production releases, maintain CI/CD pipelines, improve observability and ensure systems continue to meet demanding SLA/SLO targets. This is an excellent opportunity for a seasoned engineer who enjoys solving complex operational … promote releases into production efficiently and safely Maintaining highly available services using real-time monitoring and system metrics Building and improving monitoring, alerting and observability capabilities Investigating alerts and incidents, implementing preventative and remedial actions Working with customer stakeholders to coordinate releases and evolving service requirements Driving automation to reduce ...

AI Native Software Engineer

Hiring Organisation
TekWissen UK
Location
London Area, United Kingdom
invocation, and policy‐based routing Build cloud‐native backend services and APIs to support AI‐driven applications and enterprise integrations Implement evaluation, monitoring, and observability frameworks to ensure accuracy, latency, reliability, and system health across AI agent lifecycles Optimize AI and system performance across cost, scalability, and latency dimensions … Frameworks: LangGraph, AutoGen, CrewAI (or similar) Cloud & DevOps Tooling: Docker, Kubernetes, Terraform, Helm, CI/CD pipelines Enterprise Integration: APIs, enterprise platforms, monitoring and observability tools Why You’ll Love This Role Build real, enterprise‐grade AI systems that move beyond experimentation into production Remain deeply technical ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Birmingham, England, United Kingdom
pipelines to facilitate smooth deployments and automate workflows. Collaborate with development teams to establish best practices in system architecture, deployment, and monitoring. Implement observability solutions to gain insights into system performance and user experience. Participate in on-call rotations to respond to system alerts, perform root cause analysis, and implement … code tools (Terraform, Ansible, etc.) for automating deployments. Proficiency in scripting and programming languages such as Python, Go, or Bash. Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK stack). Excellent problem-solving skills and the ability to work effectively in high-pressure situations. Health Care Plan (Medical, Dental ...

OSS/BSS Solution architect

Hiring Organisation
IBU CONSULTING LTD
Location
London, United Kingdom
Employment Type
Contract
async patterns - Deep understanding of TMF SID (Product/Service domains) and TMF Open APIs used across BSS/OSS HIGHLY DESIRABLE Cloud Services & Observability - Experience designing for multi-cloud, hybrid and on-prem environments - Skilled in cloud-native patterns (PaaS, SaaS, serverless, containers, orchestration) - Cloud platforms … commonly used IaaS/PaaS services - Proficient with observability frameworks (Prometheus, Elasticsearch, Grafana, OpenTelemetry) for metrics, logs and traces Platform Technologies & BSS Vendor Platforms - Open source workflow engines (Temporal) - Stream/batch processing (Flink, Spark) - Test automation: Robot Framework and BDD - can review BDD robot files - BSS vendor platforms: Salesforce ...

Artificial Intelligence (AI) DevOps

Hiring Organisation
WTW
Location
Greater London, United Kingdom
Employment Type
Full Time
Role The responsibilities will include: Help to design, build, and maintain AI‐augmented DevOps pipelines, integrating LLM‐powered tooling, automated testing, code generation, observability, and environment provisioning. Develop automation for operational workflows (permissions, tagging, remediation tasks, infrastructure housekeeping, monitoring pipelines) Help to build foundational components that allow delivery teams … necessary any and all of the security processes required for operational suitability within WTW for solutions (including SAST and DAST processes) Ensure operational stability, observability, and controlled evolution of AI and agentic systems for the ICT Consultancy business Maintain & support AI tools and AI based systems once deployed and help ...

Principal AI Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
data models, service integrations, and internal tools. Architect systems using modern cloud patterns such as microservices, event‐driven design, and managed services, ensuring reliability, observability, and scalability. Provide architectural leadership across integrations with enterprise systems and third‐party platforms. ML Ops, Reliability & Engineering Best Practices Productionize AI and machine learning … solutions using modern ML Ops and software engineering practices. Establish standards for testing, deployment, observability, drift detection, retraining, and documentation. Drive quality, automation, and performance in systems where accuracy, resilience, and reliability are critical. Leadership, Mentorship & Execution Serve as a hands‐on technical leader and player‐coach, mentoring engineers while ...

Kubernetes Linux AIOps Engineer – Elite Quant Hedge Fund

Hiring Organisation
Winston Fox
Location
City of London, London, United Kingdom
Infrastructure DevOps Engineer/SRE with expertise in Kubernetes, Linux, Observability, IaC and AIOps sought by a market-leading Quantitative Hedge Fund to aid further business growth. Our client is one of the world's elite Quant Hedge Fund Managers, with large-scale, massively distributed systems and ample opportunity … Terraform, C...) Must be able to write high-quality automation/scripts from scratch. Configuration Management Tools (Ansible/Puppet/Kapitan/Terraform...) Observability: Experience within the modern open-source ecosystem (ELK, OpenTelemetry, LGTM stack, Prometheus, Grafana, Loki...) CI/CD and GitLab/GitOps: working with Development teams. ...

Site Reliability Engineer (Kubernetes / Multi-Cloud) UK Based

Hiring Organisation
Jobleads-UK
Location
Hereford, England, United Kingdom
Cluster Autoscaler, KEDA, Karpenter) Help improve workload reliability and performance Support networking, identity, compute, and storage services Assist with maintaining secure and scalable environments Observability & Monitoring Work with Prometheus, Grafana, OpenTelemetry, Azure Monitor, and CloudWatch Build dashboards, alerts, and logging/tracing pipelines Support monitoring aligned to SLIs/SLOs … networking, and scaling Cloud Experience with Azure and/or AWS Familiarity with networking, IAM, and core services Infrastructure as Code Experience with Terraform Observability Familiarity with monitoring/logging tools (Prometheus, Grafana, Loki) Other Technical Skills Helm Charts/Kustomize creation and maintenance Containers (Docker) Exposure to both Azure ...

AI Engineering Enablement Director

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
/ML, software, or platform engineering, with exposure to automated testing and infrastructure‐as‐code or policy‐as‐code. Working knowledge of AI observability (logs, metrics, traces, behavioural signals) and practical methods to evaluate or improve AI system behaviour. Familiarity with AI risk and governance frameworks (e.g., NIST … FinOps, such as cost‐aware model selection, unit economics, or prompt‐efficiency practices. Experience with MLOps or AI delivery tooling, or with AI‐specific observability systems. Participation in industry communities or standards bodies, with the ability to translate external practice into internal adoption. Experience facilitating workshops or engineering enablement events. ...

Engineering Manager (DevOps)

Hiring Organisation
iProov
Location
London, England, United Kingdom
/CD pipelines, and deployment practices across GCP (primary), AWS, and Azure Set and enforce engineering standards for infrastructure-as-code, GitOps, DevSecOps, and observability across the team and the wider engineering organisation Lead the design and improvement of containerised deployment workflows using Docker, Kubernetes, and Helm … Azure Key Vault), and security integration into the delivery pipeline as a first-class concern Identify and address tooling gaps across monitoring, alerting, observability, and incident response; own the on-call process, runbooks, escalation paths, and post-incident reviews People Management & Team Leadership Directly manage 4/5 DevOps engineers ...

Sr. Distinguished Machine Learning Engineer (Remote-Eligible)

Hiring Organisation
Capital One
Location
McLean, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Sr. Distinguished Machine Learning Engineer (Remote-Eligible) Overview: At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an industry leader in using machine ...

Data Engineer

Hiring Organisation
Athsai
Location
United Kingdom
We Are Hiring – Data Engineer Location: Remote Job Type: Full-Time Salary: 77K GBP Preference would be given to SC eligible candidates. About the Role We are seeking an experienced and highly motivated Data Engineer ...

Technical Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Significant experience as a technical or solution architect in complex digital or enterprise environments. Strong software engineering foundation, with practical knowledge of modern application architectures (e.g., microservices, APIs, distributed systems). Proven ability to design ...

OpenTelemetry Architect

Hiring Organisation
Ampstek
Location
London Area, United Kingdom
Summary We are seeking an experienced OpenTelemetry Architect to lead the design and implementation of enterprise observability solutions using OpenTelemetry. The ideal candidate will have strong expertise in observability architecture, telemetry pipelines, distributed tracing, and monitoring platform integrations across cloud and hybrid environments. Key Responsibilities Design and implement enterprise-wide … OpenTelemetry architecture and observability frameworks. Define telemetry standards, governance, and best practices for logs, metrics, traces, and events. Architect scalable OpenTelemetry Collector deployments and telemetry pipelines. Lead integration of OpenTelemetry with monitoring and observability platforms such as Dynatrace, Datadog, Grafana, Splunk, and New Relic. Design telemetry routing, enrichment, filtering ...