601 to 625 of 658 Observability Jobs in the UK

Staff Reliability Engineer (Full Stack)

Hiring Organisation: Feeld
Location: Greater London, United Kingdom
Employment Type: Full Time
Salary: 100000 to 130000 GBP Annually

Native). Lead technical problem-solving during incidents: coordinate response, diagnose root causes, communicate status, and drive to resolution. Build and evolve monitoring/observability (dashboards, alerts, tracing, logging) that enables fast detection and diagnosis. Drive post‐incident reviews (blameless) and ensure learnings become durable fixes (tech changes, runbooks, automation … comfort working across services and APIs. Proven incident response leadership: on-call participation, triage, mitigation, and root-cause analysis (RCA) with follow-through. Solid observability skills: practical experience with logging/metrics/tracing and turning signals into actionable alerts and dashboards. Experience collaborating with mobile teams and understanding mobile ...

Lead Software Engineer - Proxy/SSE Network Security

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

resilience outcomes. Drive operational excellence at scale for perimeter, proxy, and SSE services in the US, including incident, change, and problem management rigor, observability and resiliency validation practices, automation to improve repeatability and evidence quality, reduction of client and partner impact, and execution of Technology Lifecycle Management (TLM) and modernization … design, exception frameworks, audit-ready traceability, and measurable risk reduction reporting. Experience with large-scale operations for externally facing or security enforcement services, including observability strategy, resilience testing, incident response alignment, and reduction of repeat incidents and client-impacting events. Experience designing and operating hybrid edge architectures and cloud interconnect ...

Staff SRE: Observability, Automation & Global Reliability

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

London. This role focuses on the reliability, scalability, and performance of Replit's infrastructure serving millions of users worldwide. You will work on designing observability solutions, leading incident response, and automating operational tasks while mentoring other engineers. The ideal candidate has extensive experience in Site Reliability Engineering, strong programming skills ...

Grafana Observability Engineer (6-month contract)

Hiring Organisation: Shaw Daniels Solutions
Location: United Kingdom
Employment Type: Contract
Contract Rate: GBP Annual

Grafana Observability Engineer (6-month contract). Location: Fully remote. Our client: They are delivering a company-wide Digital Transformation (DX) Programme that will modernise their technology landscape and transform how they deliver services. As part of this journey, their IT and Portfolio Delivery teams are implementing a new enterprise technology ...

GenAI Python Engineer/Hybrid

Hiring Organisation: iBSC
Location: Sheffield, Yorkshire, United Kingdom
Employment Type: Contract
Contract Rate: GBP Annual

against ground truth. Work closely with architects, platform teams, and business stakeholders to deliver scalable and secure solutions. Follow enterprise standards for security, governance, observability, and performance. Required Skills and Experience Strong experience in AI/ML engineering, with hands-on exposure to Generative AI use cases. Experience in building … based AI stack. Experience with high-volume document processing. Familiarity with enterprise architecture, security, and compliance controls. Exposure to monitoring, model evaluation, and AI observability tools. Preferred Profile Able to independently build and deploy GenAI applications from ingestion to retrieval and evaluation. Strong problem-solving skills with a practical implementation ...

Staff Engineer

Hiring Organisation: 17918
Location: London, United Kingdom

engineering standards, best practices and reusable patterns while partnering with Enterprise Architecture and influencing technical direction Drive engineering excellence by improving code quality, testing, observability, reliability and operational practices Support end-to-end delivery by guiding teams through complex technical challenges, improving decision-making, and contributing to planning and risk … data lakes/lakehouse architectures, Iceberg or similar table formats, as well as batch and streaming processing Knowledge of data quality, governance, cataloguing and observability tools (e.g. Datadog), with DBT or AI-assisted engineering practices as a plus Additional Information Your benefits We're a community here that cares as much about ...

Staff Engineer

Hiring Organisation: Stepstone UK
Location: South East London, London, United Kingdom
Employment Type: Permanent

Senior Software Product Strategy & Product Marketing Lead — Data Center AI and Personal AI - Qu[...]

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

center offerings. Define European requirements for AI infrastructure software, including model serving, orchestration, workload management, developer tools, runtime environments, compilers, SDKs, containerization, Kubernetes integration, observability, benchmarking, security, and deployment workflows. Develop European messaging around performance, power efficiency, total cost of ownership, software maturity, deployment flexibility, openness, sovereignty, and integration with … serving, and performance optimization across latency, throughput, and power. Understanding of AI infrastructure software stack, including Linux, containers, Kubernetes, and cloud‐native deployment patterns, observability frameworks (e.g., OpenTelemetry), CNCF ecosystem, and integration with enterprise or hyperscaler data center control planes. Solid knowledge of server and system architecture, including ...

Senior Director, Master Data Management

Hiring Organisation: Jobleads-UK
Location: Northampton, England, United Kingdom

manage the MDM product/platform team (product, engineering, data quality, metadata/lineage). Implement DataOps for MDM (CI/CD, automated testing, observability, change control, incident/problem management). Deliver golden record services (match/merge/survivorship, hierarchy management) and reference data services. Define integration architecture … merge/survivorship, hierarchy & reference data management, quality management, metadata & lineage. Hands‐on familiarity with DataOps (CI/CD for data, automated data testing, observability), microservices, and event streaming patterns (e.g., CDC, pub/sub). Experience with enterprise data catalogs, lineage tooling, and at least one MDM platform (commercial ...

Principal Machine Learning Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

Anaplan's platform and third-party integrations Optimise model inference pipelines for performance, cost, and scalability in production environments Implement monitoring, logging, and observability for GenAI systems to track usage, errors, and model behaviour Collaborate with data scientists to productionise ML models and forecasting algorithms Your Skills Extensive hands … Experience with A/B testing and experimentation frameworks for AI features Contributions to open-source ML projects or research publications Experience with model observability tools (LangSmith, W&B, MLflow) DEIB Our Commitment to Diversity, Equity, Inclusion and Belonging (DEIB) We believe attracting and retaining the best talent and fostering ...

Lead Software Engineer - Proxy/SSE Network Security

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

resilience outcomes. Drive operational excellence at scale for perimeter, proxy, and SSE services in the US, including incident, change, and problem management rigor, observability and resiliency validation practices, automation to improve repeatability and evidence quality, reduction of client and partner impact, and execution of Technology Lifecycle Management (TLM) and modernization … design, exception frameworks, audit‐ready traceability, and measurable risk reduction reporting. Experience with large‐scale operations for externally facing or security enforcement services, including observability strategy, resilience testing, incident response alignment, and reduction of repeat incidents and client‐impacting events. Experience designing and operating hybrid edge architectures and cloud interconnect ...

Staff Frontend Engineer (React Native / Mobile)

Hiring Organisation: Feeld
Location: Greater London, United Kingdom
Employment Type: Full Time
Salary: 80000 to 110000 GBP Annually

performance, stability/crash rates, startup time, build/release velocity, or app size. Increased confidence in production through better observability, incident response practices, and ownership. Enabled other FE engineers to move faster through documentation, pairing/mentorship, reviews, and reusable platform components. What … traffic/user counts, complex feature sets) with a focus on reliability and performance. Demonstrated production ownership: incident response, debugging complex issues, and improving observability (metrics/logs/traces). Experience improving delivery systems (CI/CD, automated testing strategy, release process) and keeping teams moving. Staff-level ...

Staff Data Engineer: Data Quality & Observability

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

Depop is building a world-class Data Quality, Observability & Governance team. As a Staff Data Engineer, you will lead design and delivery of frameworks, tools, and processes that make our data reliable, auditable, and governable. You will establish contracts between data producers and consumers to maintain schema integrity and data ...

ML Research Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

TLDR: We are looking for several ML Engineers to train, post-train, and evaluate the LLMs at the core of our platform. This is hands-on modern model training work: large-scale data pipelines, SFT ...

Senior Ruby Engineer

Hiring Organisation: Jobleads-UK
Location: Manchester, England, United Kingdom

What You’ll Do Design and implement scalable, high-performance backend systems to power our e-commerce experience. Build and maintain interfaces that support our frontend, mobile, and third‐party integrations. Be very experienced working ...

Director of Software Engineering - AIOps & Observability

Hiring Organisation: Jobleads-UK
Location: City Of London, England, United Kingdom

JPMorgan Chase & Co. is seeking a Director of Software Engineering in London. This role involves leading engineering teams and overseeing the development of Observability Platforms, influencing strategic decisions and governance. Your responsibilities will include driving AIOps innovation and delivering technical solutions across various business units. The ideal candidate will have ...

Senior Enterprise AE — Observability Growth & Equity

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

hybrid basis. This hunter-led role focuses on creating pipeline, managing strategic accounts, and closing six- or seven-figure deals in the observability space. You will work with SDRs and marketing, draw on a strong playbook, and benefit from leadership development, strong metrics, and equity participation as the company scales. ...

Principal AI Observability Solutions Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

products, and supporting the sales process with technical know-how. The ideal candidate will have over 10 years of experience, proficiency in modern Observability tools and cloud technologies, and excellent communication skills. This position plays a critical role in helping Snowflake redefine technology deployment. ...

Senior Product Manager

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

About ITRS At ITRS, we make society's critical technology work. Our mission is to deliver automated and holistic IT observability solutions that safeguard critical applications and enable innovation. We are the only monitoring and observability platform designed for the most demanding and regulated industries — trusted by 90% of Tier … trading resilience and Market Data Observability. These workstreams sit at the heart of the Geneos and ITRS Analytics (IAX) product line, a monitoring and observability platform used by 90% of Tier 1 capital markets firms to ensure resilience of low-latency trading, core banking, payments, and market data infrastructure. This ...

Remote Observability Engineering Manager

Hiring Organisation: Jobleads-UK
Location: Douglas, Northern Ireland, United Kingdom

Canonical Group Ltd is hiring an Observability Engineering Manager to lead the development of distributed tracing and service mesh products. The role requires managing a team of engineers and enhancing their processes to achieve objectives. We seek a candidate with a strong background in software delivery, leadership, and a passion … open-source technologies. This position offers a unique opportunity to work in a collaborative, distributed environment while making significant contributions to the development of observability tools. ...

Lead Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

continuous improvement. Define and evolve engineering standards, frameworks and best practices across the entire engineering organisation. Drive improvements in software quality, testing strategy, observability and release confidence. Partner with engineering, platform, product and security to deliver large-scale, cross-functional improvements. Shape and deliver internal tooling and AI-assisted engineering … engineering-first mindset and influencing people with a broader organisational impact. Previous experience building enterprise-level, highly scalable projects, focused on performance, observability and security best practices. Experience in a technology-driven organisation with strong engineering standards. You’ll also bring: Strong experience improving engineering quality, reliability and operational ...

Performance and Monitoring Engineer

Hiring Organisation: Solus Accident Repair Centres
Location: Birchanger, Hertfordshire, United Kingdom
Employment Type: Permanent
Salary: GBP 40,000 - 50,000 Annual

talented Performance and Monitoring Engineer to help us strengthen the stability, reliability and performance of our systems. If you're passionate about monitoring, observability and using data to proactively improve service health, this is a great opportunity to make a real impact across a large, modern technology estate. Responsibilities … improve speed, accuracy and consistency Supporting major changes, deployments and post-incident reviews with data-driven evidence Qualifications Strong experience with monitoring and observability tools (LogicMonitor, Azure Monitor, App Insights, Log Analytics, Defender for Cloud) Excellent understanding of cloud performance, IaaS/PaaS, networking fundamentals, API performance and capacity modelling ...

Performance and Monitoring Engineer

Hiring Organisation: Solus Accident Repair Centres
Location: Stansted, Essex, South East, United Kingdom
Employment Type: Permanent
Salary: £50,000

AI Engineering Manager

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

business needs into technical deliverables. Drive agentic workflows and AI tooling adoption across the product development lifecycle to deliver tangible value. Establish robust evaluation, observability, and quality practices for AI systems, balancing speed with reliability. Guide teams through ambiguity and rapid change, making pragmatic decisions and removing blockers. Measure success … development. Hands‐on experience with AI models, tools, and frameworks, including agent orchestration, prompt engineering, RAG pipelines, evaluation frameworks, LangChain, Codex, Claude, Gemini, and observability tools and best practices. Strong technical problem‐solving skills and the ability to guide teams through ambiguous, fast‐changing environments. Excellent communication skills across technical ...

OAT Quality Engineer

Hiring Organisation: Hays Technology
Location: London, United Kingdom
Employment Type: Contract
Contract Rate: £398/day, Inside IR35

procedures, including incident, problem and change management readiness. Verify backup, restore, deployment, patching and infrastructure connectivity processes. Evaluate application and infrastructure monitoring, alerting and observability solutions. Analyse system logs, monitoring outputs and performance data to identify operational risks. Support release activities and ensure operational acceptance criteria are met. Work with … server testing. Experience testing cloud-hosted applications, ideally within AWS environments. Strong understanding of operational readiness, service transition and production supportability. Experience with monitoring, observability and log analysis tools such as Splunk, Dynatrace, New Relic or ELK. Background working within IT Operations, Service Management or mission-critical production environments. Strong ...