326 to 350 of 418 Observability Jobs in London

C++ Engineer (Real-Time Full Tick Re-platform)

Hiring Organisation: Hays
Location: London, United Kingdom
Employment Type: Contract
Contract Rate: GBP Annual

Your new role You'll step in as a Senior C++ Engineer, leading the design and rollout of observability across real-time, latency-sensitive platforms. This is a hands-on role where you'll shape how customer experience, system reliability, and operational insight are measured end-to-end. ...

Engineering Manager: Real-Time Quant Frameworks

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

shape platforms that support high-impact research. This role includes overseeing systems that simplify workloads across large-scale compute environments, guiding the development of observability tools, and driving the strategy of critical scheduling platforms. The ideal candidate will have experience in building scalable production systems and leading technical teams. ...

Principal Product Manager

Hiring Organisation: bloom
Location: London Area, United Kingdom

faster, at higher quality and with less friction. A senior IC role owning the internal platform end to end: developer-facing products, quality and observability workflows that scale without adding headcount, building for a user base that's increasingly agentic as much as human. You'll prototype rather than just ...

RVP Europe Sales

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

improve quality, efficiency, security, and profitability. Our software combines application intelligence, experience visibility, contextual insights, and real-time control to help customers elevate observability and do more with the networks they already run. As AI reshapes how the world works, connects, and communicates, AppLogic Networks helps ensure modern applications ...

Principal Engineer - Edge Delivery & Observability

Hiring Organisation: Financial Times
Location: Greater London, United Kingdom
Employment Type: Full Time

unique opportunities to support every step of your career. The FT is looking for a Principal Engineer (Individual Contributor) to lead our Edge Delivery & Observability work. About the teams There are two teams in this area. The Edge Delivery & Observability (EDO) team looks after the cloud edge capabilities like CDNs … along with the monitoring and observability infrastructure at the FT. Examples of the kind of work this team tackles are: Managing and improving our central solution for observability tools like Graphite, Grafana, Splunk, Prometheus and Cloudflare. Providing self service APIs and tools that enable other delivery teams to utilise ...

Lead AI Platform Engineer (Contract)

Hiring Organisation: GlobalLogic
Location: London Area, United Kingdom

month assignment (inside IR35), to start in 2-4 weeks. This is a hands-on, high-impact role at the intersection of AI governance, distributed systems, observability, and platform engineering to lead technical delivery for an AI centralised platform - Control Tower. We’re looking for a Technical Lead to drive the end-to-end … Java, and modern data processing frameworks. Expertise in cloud-based AI/ML ecosystems, particularly AWS SageMaker (required). Proven experience developing monitoring frameworks, observability pipelines, and dashboards. Deep understanding of event-driven architectures and messaging systems (Kafka, Vert.x, or similar). Knowledge of security engineering, IAM principles, encryption ...

Site Reliability Engineer

Hiring Organisation: Huxley Associates
Location: City of London, London, United Kingdom
Employment Type: Permanent
Salary: £90000/annum + Bonus & Benefits Package

scalability, and operational excellence across a complex, regulated environment. Key Responsibilities Lead the implementation of SRE best practices across cloud infrastructure Drive improvements in observability, alerting, and capacity planning (SLA/SLO/SLI) Identify and reduce operational toil through automation and remediation frameworks Build and enhance GitOps and Infrastructure … cloud environments (AWS/GCP) Strong scripting skills (Python, Ansible, or PowerShell) Experience with Infrastructure as Code and GitOps methodologies Hands-on knowledge of observability/APM tools (e.g. Grafana, Datadog, Dynatrace) Proven experience managing incidents, root cause analysis, and on-call support Understanding of SLA/SLO/ ...

Forward Deployed AI Engineer

Hiring Organisation: WTW
Location: Greater London, United Kingdom
Employment Type: Full Time

enabled systems. You’ll bring deep expertise across modern full-stack technologies (.NET, Azure, SQL, React/Angular), along with experience in distributed systems, observability, and AI tooling such as LLMs, retrieval pipelines, and agentic workflows. Acting as a bridge between business and technology, you’ll work across product, data … orchestration, evaluation loops, and human-in-the-loop controls. Enterprise integration: Integrate AI solutions with enterprise systems, APIs, data platforms, document repositories, workflow tools, observability platforms, and identity and access management services. Production engineering: Ensure AI solutions meet enterprise standards for reliability, scalability, latency, maintainability, cost control, logging, monitoring ...

Software Engineering Manager

Hiring Organisation: 17918
Location: London, United Kingdom

best practice, reduce duplication, and promote maintainable, secure and performant systems. Enhance delivery capability through platform reliability and DevOps maturity - Continuously improve deployment pipelines, observability, alerting, incident handling, recovery procedures and operational readiness across Field Ops engineering teams. Manage stakeholders and ensure transparent communications - Build strong relationships across product, operations … decisions Funding for technical enablers Field Ops workflow design and data requirements Use of Data/Insight/Automation Uses engineering metrics, performance insights, observability data and AI-assisted diagnostics to guide decisions. Ensures human judgement remains central. Constraints Centrica architectural principles, engineering guardrails, data privacy/security policies ...

AI Engineer

Hiring Organisation: Elsevier
Location: Greater London, United Kingdom
Employment Type: Full Time

within a defined problem, building and testing tool use, retrieval pipelines and agent workflows, integrating AI capabilities into enterprise systems, and contributing to evaluation, observability and guardrails. You will hold a high bar on code quality, flag risks and blockers early, and work alongside host-function stakeholders to make sure … agentic AI solutions to production standard within a defined technical approach. Implement and test tool use, retrieval pipelines, and agent workflows. Contribute to evaluation, observability and guardrails for agentic systems. Integrate AI capabilities into existing enterprise workflows and systems. Maintain high code quality and documentation so patterns can be reused. ...

Data Engineer-Must have strong GCP experience-Inside IR35

Hiring Organisation: Reed Technology
Location: London, United Kingdom
Employment Type: Temporary
Salary: £425/day POSSIBLY NEGOTIABLE

Standardise ingestion and transformation using configuration-driven frameworks Embed data quality checks by default (schema validation, completeness, freshness, thresholds, alerting) Improve pipeline resilience, monitoring, observability and recovery mechanisms Integrate AI/ML capabilities where appropriate (e.g. anomaly detection, intelligent monitoring) Support delivery of a wider Data Strategy programme, improving consistency … Cloud Run/App Engine Experience with CI/CD, automated testing and infrastructure as code Data Quality & Monitoring Experience implementing data quality frameworks, observability tooling and monitoring solutions Preferred Experience Building reusable pipeline frameworks for large, multi-domain platforms Delivery within enterprise data transformation programmes with strong SLAs Exposure ...

Site Reliability Engineer (SRE) - Cloud & Automation

Hiring Organisation: Spencer Rose Ltd
Location: London, United Kingdom
Employment Type: Permanent
Salary: GBP 60,000 - 70,000 Annual

implementation of SRE practices across the organisation, working closely with infrastructure teams to optimise deployment processes and embed automation and operational excellence. Enhance observability and reliability, defining and implementing SLAs, SLOs and SLIs to improve alerting, monitoring, and capacity planning. Identify and eliminate toil, developing frameworks to analyse recurring issues … beneficial). Experience supporting and building multi-environment, multi-region cloud platforms (AWS or GCP), using IaC and GitOps workflows. Hands-on experience with observability/APM tooling such as Grafana, Datadog or Dynatrace. Background working in regulated financial services or banking environments. Excellent troubleshooting, analytical and communication skills, able ...

Platform Engineer

Hiring Organisation: Axon Labs
Location: London Area, United Kingdom

Reliable systems for live trading, multi-venue market data ingestion, and research compute Deployment pipelines that ship strategy and model changes quickly and safely Observability across data quality, execution, strategy, and infrastructure Resilience: failover, disaster recovery, and operational readiness for systems that lose money when they’re down The path … cost, and you have supported researchers or traders. Required Skills Deep Linux and networking fundamentals Strong cloud experience, ideally AWS (compute, networking, IAM, storage, observability) Strong Python; C++, Rust, or Go for latency-critical paths is a plus Container orchestration with Kubernetes (or equivalent) Infrastructure as code (Terraform or equivalent ...

Senior Java Engineer - FX eTrading

Hiring Organisation: Pontoon
Location: London, United Kingdom
Employment Type: Contract
Contract Rate: £800 - £900/day

data, order/risk workflows, and real-time streaming capabilities. Optimise Performance: Focus on improving latency, throughput, and reliability across the entire stack. Implement observability practices (metrics, tracing, logging) and conduct performance profiling. Establish Best Practices: Champion engineering excellence through code standards, testing strategies (unit/integration/… including market data, order flows, and execution workflows. Hands-On Skills: Proficiency with CI/CD, containerisation, cloud/on-prem deployments, and observability practices. AI Integration: Comfortable integrating AI coding tools into daily development workflows. Communication Skills: Excellent communication and stakeholder engagement abilities, with a track record of leading ...

Forward Deployed Engineer (FDE), Customer Solutions

Hiring Organisation: DaVinci Commerce
Location: London Area, United Kingdom

grade AI agents for commerce and BrandStore use cases. Implement orchestration logic, state management, workflow automation, and service integrations. Optimize AI agent performance, reliability, observability, and fault tolerance. Support hosting, deployment, monitoring, and debugging of mission-critical customer-facing systems. Customer Onboarding & Training Lead technical onboarding sessions for enterprise customers … cloud infrastructure and deployment environments (AWS/GCP/Azure). Familiarity with databases, authentication systems, queues, monitoring, and distributed systems. Ability to use observability and monitoring tools to diagnose production issues in AI deployments. Experience with frontend/UI prototyping frameworks is a plus. Familiarity with MCP (Model Context ...

Principal Data Engineer

Hiring Organisation: WTW
Location: Greater London, United Kingdom
Employment Type: Full Time

foundational governance capabilities: access security (Entra ID, Unity Catalog), data lineage tooling, CI/CD for data (Github Actions, Terraform, DBT Cloud), and observability practices. AI Fluency AI fluency is a core requirement of this role — in two distinct dimensions. First, you will design and build data infrastructure that powers … connect, where coupling creates risk, and how today's decisions constrain tomorrow's options. You hold a high bar for engineering quality — correctness, testability, observability, and documentation are non-negotiable, not nice-to-haves. You are pragmatic under pressure; you know when to build the right thing and when ...

Staff .Net Backend Engineer

Hiring Organisation: Jobleads-UK
Location: Greater London, England, United Kingdom

performance. Championing a high‐quality engineering culture – test coverage, peer review, CI/CD discipline via GitHub Actions, Infrastructure as Code, secure coding, observability and performance – aligned to the Reapit Global Technology Strategy, Reapit Connect and agentic tooling. Mentoring and up‐levelling engineers around you through pairing, PR review, architectural …/CD (ideally GitHub Actions), Infrastructure as Code (AWS CDK or Terraform), comprehensive testing (unit, integration and contract), and a genuine commitment to observability, performance and secure‐by‐default coding. Technical leadership without the title – a track record of lifting teams through pairing, mentoring, PR review and example, rather than ...

Senior Onboarding Engineering | 6 month Contract

Hiring Organisation: Novatus
Location: City of London, London, United Kingdom

Novatus is a Series B scale-up RegTech SaaS provider and boutique advisory practice, enabling financial services firms to solve complex challenges and redefine what’s possible through expert-led technology and consulting. Across both ...

Lead Splunk Engineer

Hiring Organisation: Meritus
Location: London, United Kingdom
Employment Type: Contract
Contract Rate: £500 - £600/day

MERITUS are recruiting for a Splunk Lead Engineer to join a Consulting Organisation working into a Central Government Client supporting enterprise-wide observability and monitoring capabilities. This is a 12-month contract role based in London, paying £600 per day (Inside IR35), with 2 days per week required on-site. … SPLUNK LEAD ENGINEER - OBSERVABILITY & MONITORING - LONDON (HYBRID) - 12-MONTH CONTRACT - £600 PER DAY (INSIDE IR35) - SC CLEARANCE REQUIRED As a Splunk Lead Engineer, you will act as the technical authority for monitoring and observability, driving standards, automation, and scalable solutions across a complex enterprise environment. You will work closely with ...

Observability Engineer - Bigpanda

Hiring Organisation: 17918
Location: London, United Kingdom

Observability Engineer AIOps & Observability Permanent Remote-first (UK) Confidential Client Placed exclusively by Morela - Specialist AI & Engineering Recruitment THE OPPORTUNITY Morela is working exclusively with one of the fastest-growing startups in the UK - a specialist AIOps boutique that is changing how enterprise organisations deal with alert noise and incident ...

Data Reliability Engineer

Hiring Organisation: Ashdown Group
Location: City of London, London, United Kingdom
Employment Type: Permanent, Work From Home
Salary: £95,000

work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You'll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. You'll take ownership … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands-on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

Data Reliability Engineer

Hiring Organisation: Ashdown Group
Location: London, South East, England, United Kingdom
Employment Type: Full-Time
Salary: £80,000 - £95,000 per annum

able to work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You’ll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands-on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

Staff DevOps Engineer - 3, 4 & 5 day Work Week Option!

Hiring Organisation: Albany Growth
Location: City of London, London, United Kingdom

their infrastructure, pipelines, and reliability function as the platform scales. Main headlines: Own the full infrastructure stack end-to-end: IaC, CI/CD, observability, and incident response in a regulated environment AWS-native setup with Terraform, Docker, GitHub Actions and Octopus Deploy: mature tooling, no legacy mess to untangle … team where your decisions have direct impact on product delivery velocity Regulated domain - meaningful compliance work, not box-ticking Greenfield opportunity to build out observability and automation properly from the ground up Further details: King's Cross, London: Flexibility to work 3, 4 or 5 days per week with salary ...

Senior Java Full Stack Developer

Hiring Organisation: Atrium Workforce Solutions Ltd
Location: London, South East, England, United Kingdom
Employment Type: Contractor
Contract Rate: £675 - £800 per day

reduce build times while increasing signal quality and reliability. Improve developer experience: streamline build pipelines, optimize caching, clarify branching/release processes, and enhance observability for debugging failed tests. Shift-left quality mindset and risk-based test planning Strong debugging/troubleshooting habits (logs, traces), observability awareness Please feel free ...

Founding Platform Engineer

Hiring Organisation: Albert Bow
Location: London Area, United Kingdom

live trading, multi-venue market data ingestion, and research compute Deployment pipelines that let the team ship strategy and model changes quickly and safely Observability across data quality, execution, strategy, and infrastructure Resilience: failover, disaster recovery, and operational readiness for systems that lose money when they are down The route … experiment tooling for our research and ACA work Technical depth Deep Linux and networking fundamentals Strong cloud experience, ideally AWS (compute, networking, IAM, storage, observability) Strong Python; C++, Rust, or Go for latency-critical paths is a plus Infrastructure as code (Terraform or equivalent) CI/CD and release engineering ...