226 to 250 of 257 Observability Jobs in London

Principal AI Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
experimenting with cutting‐edge technologies. Preferred Requirements Advanced Integration - Experience integrating Salesforce with external agents via APIs and open standards (MCP, A2A). Governance & Observability - Familiarity with prompt governance, observability, monitoring frameworks, responsible AI and compliance best practices Cross‐Platform Background - Background in cross‐platform integrations (e.g., Hyperscaler SDKs ...

Senior Software Engineer – AI / Agentic Systems

Hiring Organisation
MA (Montreal Associates)
Location
City of London, London, United Kingdom
grade AI platform. You’ll operate at the core of the product engineering function, designing systems that power autonomous agents, orchestrate workflows, and enable observability at scale. This is not just another backend role. You’ll influence architecture, mentor engineers, and help define the technical direction of a rapidly growing … Lead design and code reviews, ensuring high standards of quality and security Collaborate closely with AI research, product, and infrastructure teams Improve system reliability, observability, and scalability Mentor engineers and act as a technical multiplier across teams Champion best practices, tooling, and engineering excellence Proactively identify and resolve technical debt ...

Director - Principal Engineer (Java/Angular/AI)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£140,000 - £170,000 per annum
volumes of financial and transactional data Contribute directly to architecture, system design, and hands-on software development Drive engineering best practices across automation, testing, observability, and performance Build resilient, production-grade systems with a strong focus on reliability and scalability Work across the full software development lifecycle from design through … scalability, and high-availability systems Experience building automated, production-grade platforms with minimal manual intervention Familiarity with cloud-native technologies, CI/CD, and observability tooling Strong engineering mindset with a hands-on approach to development Interest in modern engineering tooling, including AI-assisted development workflows Robert Walters Operations Limited ...

Platform Engineer: £120k + Bonus/benefits (AI Trading)

Hiring Organisation
Hunter Bond
Location
London Area, United Kingdom
global trading platform. The successful candidate will be involved in every layer of the technology stack—from hardware and operating systems to automation and observability—while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities Manage a distributed compute environment and several petabyte-scale … agile methodologies) Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible) Exposure to distributed storage systems and related protocols Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana) Strong written and verbal communication skills Demonstrated ability to learn quickly and adapt to evolving technologies ...

Lead Software Engineer

Hiring Organisation
5V Video
Location
City of London, London, United Kingdom
+ AWS (Lambda, API Gateway, S3, DynamoDB) Handling event-driven architectures (Kafka, SNS/SQS, etc.) Driving system design decisions across distributed systems Improving observability, reliability, and performance in production Debugging complex issues and leading resolution across teams Staying hands-on while setting technical direction and standards Tech Stack Python … Lambda, API Gateway, S3, DynamoDB, IAM) Event-driven systems (Kafka, SNS/SQS) CI/CD (Concourse, Git workflows) Databases (Postgres, DynamoDB, Couchbase) Observability (Prometheus, Grafana, CloudWatch) What You’ll Bring Strong backend engineering experience (Python preferred) Proven experience building distributed systems at scale Deep understanding of microservices + event ...

Head of Infrastructure

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
cloud architecture, operational resilience, developer experience and infrastructure team leadership. You will be responsible for shaping the long term infrastructure roadmap, improving reliability and observability, strengthening incident response and ensuring the platform can support a growing customer base and increasingly critical product suite. This is a role for someone … platform strategy Design and evolve the AWS cloud architecture to support scale, resilience and performance Set standards across infrastructure, CI/CD, environments and observability Lead production reliability, uptime, incident response and post incident reviews Improve monitoring, alerting and on call practices to ensure they are effective and sustainable Partner ...

Platform Storage Engineer

Hiring Organisation
Ncounter
Location
East London, London, England, United Kingdom
Employment Type
Full-Time
Salary
£160,000 - £190,000 per annum
vendor storage tooling into a unified platform • Improve storage throughput, data locality and platform efficiency for research workloads • Collaborate closely with compute, networking and observability teams across the wider platform estate • Support troubleshooting, tuning and reliability engineering for production storage systems What we’re looking for: • Strong backend or systems … Rust, C++ or Java • Experience building or supporting distributed systems at scale • Strong Linux knowledge and an interest in infrastructure engineering • Exposure to observability tooling such as Prometheus, Grafana, Datadog or ELK • Understanding of cloud and infrastructure automation, ideally AWS, GCP or Terraform • Any experience with Ceph, MinIO, JuiceFS, FUSE ...

Remote Network Monitoring Specialist - Streaming Telemetry

Hiring Organisation
Akkodis
Location
London, United Kingdom
Employment Type
Permanent
Salary
£70,000 - £75,000 per annum
ensure the environment is fully visible, measurable and supportable from day one. The role would suit someone with strong experience across network observability, alerting, telemetry, dashboards, service health, performance baselining and operational handover. The client is open to different monitoring backgrounds, particularly where candidates have worked with tools such … solutions across newly delivered network infrastructure. Build monitoring capability that provides clear visibility of network health, performance and service availability. Work with monitoring and observability platforms such as VictoriaMetrics, Prometheus, Grafana, Nagios, Zabbix, InfluxDB, SolarWinds, PRTG, Datadog, Elastic or similar. Support metrics ingestion, retention, alerting, dashboarding and performance visibility. Build ...

Platform Engineer

Hiring Organisation
Gravitas Recruitment Group (Global) Ltd
Location
City of London, London, United Kingdom
responsibilities include: Scaling serverless cloud infrastructure for growth and multi-region reliability Building and improving CI/CD pipelines and deployment systems Enhancing observability, monitoring, and incident response Developing internal tooling to improve engineering productivity Contributing to production code (TypeScript) across infrastructure and product Tech Environment AWS (serverless-first architecture … Pulumi (or similar infrastructure-as-code tools) GitHub Actions for CI/CD Datadog for observability TypeScript across the stack What They’re Looking For Strong platform engineering experience in cloud-native SaaS environments Hands-on experience with AWS serverless architecture (e.g. Lambda, DynamoDB, event-driven systems) Experience building ...

Remote Network Monitoring Engineer - VictoriaMetrics

Hiring Organisation
Akkodis
Location
London, United Kingdom
Employment Type
Permanent
Salary
£70,000 - £75,000 per annum
VictoriaMetrics in a production environment, including configuration, optimisation, ingestion, retention and performance tuning. You will also work across streaming telemetry, Nagios, Grafana and wider observability tooling. This would suit someone with strong network monitoring experience who is comfortable taking ownership of a critical technical workstream in a project-led environment. … Looking for: Strong hands-on experience with VictoriaMetrics in a production environment. Previous experience in a senior network monitoring, network engineering or observability-focused role. Experience working in a telecoms, ISP, managed network or large-scale infrastructure environment. Good understanding of time-series monitoring, metrics ingestion, retention and performance tuning. ...

SRE Consultant

Hiring Organisation
Akkodis
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£90,000 - £100,000 per annum
include: Define and embed SRE engagement models aligned to modern engineering and traditional ITSM/ITIL practices Establish SLIs, SLOs, and Error Budgets Shape observability strategies using metrics, logs, and traces Design incident response models and post-incident learning loops Reduce toil through automation and engineering excellence Deliver SRE capability … Looking For Extensive experience in SRE, cloud operations, or DevOps Proven consulting or advisory background Experience with AWS, Azure, or GCP Strong observability and incident management expertise Ability to obtain UK SC clearance Modis International Ltd acts as an employment agency for permanent recruitment and an employment business ...

AI Engineer

Hiring Organisation
MarkIT Placements
Location
West London, London, United Kingdom
Employment Type
Contract, Work From Home
execution Deploy AI systems into cloud, on-premises, and air-gapped environments Build production-ready pipelines from data ingestion through to inference Experience with observability for AI systems, including agent behaviour, model performance, and failure modes Collaborate with engineers, product leads, and customers to translate requirements into working systems Contribute … with edge or offline AI deployments Familiarity with Kubernetes (EKS/OpenShift) for monitoring and managing deployed applications MLOps experience - model evaluation, monitoring, reproducibility Observability tooling for agentic systems (model drift, agent behaviour, performance monitoring) Experience with agent orchestration patterns and inter-agent communication protocols (e.g. A2A) Familiarity with MCPs ...

RVP, EMEA Sales - Observability

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
just to execute a function, but to help redefine the future of how work gets done. Observe by Snowflake brings AI-native observability to the Snowflake AI Data Cloud, helping engineering and data teams debug, optimize, and understand systems operating at massive scale. Traditional observability tools were not built … strong judgment, and the ability to align people, strategy, and execution across functions. WHAT WE LOOK FOR 10+ years of experience selling cloud, infrastructure, observability, data platforms, or enterprise software. 2+ years of experience managing high-performing enterprise sales teams. Experience selling to senior technical and business stakeholders, including CIOs ...

Vice President - Full Stack Engineer (Java/Angular/AI)

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£85,000 - £100,000 per annum
APIs using modern engineering practices Contribute to system design, architecture discussions, and technical decision-making Build resilient, automated systems with strong focus on reliability, observability, and performance Work closely with engineering and product teams to deliver production-grade solutions Contribute to CI/CD, testing, monitoring, and operational improvements across … working with high-volume, scalable systems Familiarity with event-driven architecture and messaging systems Experience with cloud-native technologies, CI/CD pipelines, and observability tooling Strong hands-on engineering mindset and interest in modern development tooling, including AI-assisted workflows Robert Walters Operations Limited is an employment business ...

Monitoring SME

Hiring Organisation
CBSbutler Holdings Limited trading as CBSbutler
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
£480 - £515/day
design, implementation, and optimisation of monitoring capabilities across Microsoft Fabric and Azure ecosystems. The role focuses on Microsoft Purview, Azure monitoring services, and unified observability across data platforms including Power BI. You will act as a technical authority, guiding stakeholders on best practices for monitoring, compliance, data governance, and security … Monitoring (FUAM) Insight Manager Ensure effective monitoring integration within Microsoft Fabric and Power BI environments. Technical Leadership & Consultancy Act as the SME for monitoring, observability, and governance solutions. Provide best practice guidance on data security, compliance, and monitoring frameworks. Advise stakeholders across technical and non-technical teams. Support decision-making ...

Site Reliability Engineer (SRE)

Hiring Organisation
UA Consulting
Location
City of London, London, United Kingdom
Employment Type
Contract
Contract Rate
From £300 to £400 per day
platform. Key Responsibilities Partner with development teams to define and manage SLOs/SLIs, and use error budgets to guide engineering decisions. Enhance observability ensuring metrics, logs, and tracing are in place to detect and fix issues proactively. Lead cost optimisation initiatives: monitor spend, rightsize workloads, tune autoscaling, and drive … with Kubernetes (on-prem and AWS EKS). Proven track record defining and working with SLOs/SLIs in production environments. Deep understanding of observability (metrics, logging, tracing, telemetry ...

EMEA VP of AI-Observability Sales

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Snowflake is seeking a Sales Leader for the EMEA region to build and lead a high-performing sales team focused on AI-driven observability solutions. The ideal candidate will have over 10 years of experience in cloud and enterprise software sales, with a track record of managing successful sales teams. … This role offers a unique opportunity to shape the future of data observability in a fast-growing environment. A BA/BS degree is required, alongside strong leadership and coaching skills. ...

Data Engineer

Hiring Organisation
HCLTech
Location
London Area, United Kingdom
HCLTech is a global technology company, home to more than 220,000 people across 60 countries, delivering industry-leading capabilities centered around digital, engineering, cloud and AI, powered by a broad portfolio of technology services ...

MuleSoft Architect

Hiring Organisation
Capita Shared Services Limited
Location
West London, London, United Kingdom
Employment Type
Permanent
Purpose of the role: Capita's MuleSoft Centre for Enablement & Excellence (C4EE) is expanding its architectural capability to support large-scale integration and API transformation programmes for major UK Government and private-sector clients. As ...

Performance and Monitoring Engineer

Hiring Organisation
Solus Accident Repair Centres
Location
North London, London, United Kingdom
Employment Type
Permanent
Salary
£50,000
talented Performance and Monitoring Engineer to help us strengthen the stability, reliability and performance of our systems. If you're passionate about monitoring, observability and using data to proactively improve service health, this is a great opportunity to make a real impact across a large, modern technology estate. Responsibilities … improve speed, accuracy and consistency Supporting major changes, deployments and post-incident reviews with data-driven evidence Qualifications Strong experience with monitoring and observability tools (LogicMonitor, Azure Monitor, App Insights, Log Analytics, Defender for Cloud) Excellent understanding of cloud performance, IaaS/PaaS, networking fundamentals, API performance and capacity modelling ...

AI Architect

Hiring Organisation
Tata Consultancy Services
Location
London Area, United Kingdom
operated safely over time. Key responsibilities: Architect and govern multi-agent and agent-swarm systems at enterprise scale. Define agent safety, governance, observability, and testing standards. Establish AI guardrails, frameworks, governance models, and safety controls. Design human-in-the-loop optimisation to balance autonomy, reliability, and performance. Own patterns … native and agent-based design principles. Design and govern enterprise-scale distributed systems with embedded AI capabilities. Architect and evolve agent orchestration platforms. Own observability, reliability, security, scalability, performance, and cost management (FinOps). Ensure platforms are production-ready, secure, auditable, and compliant. Partner with CTOs and senior leadership ...

Machine Learning Ops Engineer

Hiring Organisation
CMC Markets UK Plc
Location
City of London, London, United Kingdom
Employment Type
Permanent
meeting availability, latency, and freshness targets for ML services Debugging production issues across data, infrastructure, and model layers Improving system robustness through automation and observability Collaborating with platform and security teams on access, secrets, and compliance Engineering rigor Writing production-grade Python used in long-running services and pipelines Establishing … frameworks, experiment tracking, structured datasets Pipelines & Orchestration: Workflow schedulers for batch and near-real-time processing Deployment: Containers, model serving frameworks, infrastructure-as-code Observability: Metrics, logging, and alerting across data and model layers Cloud: Managed compute, storage, and networking (provider-agnostic mindset) The stack will evolve. We value engineers ...

SAP Data Architect

Hiring Organisation
HCLTech
Location
Greater London, England, United Kingdom
Exposure to managing and maintaining meta-data and data architecture lineages with data catalog such as Alation/Collibra/Purview and DQ and observability platforms Soft Skills · Excellent problem-solving and analytical skills. · Strong communication and stakeholder management abilities. · Ability to work in a fast-paced, collaborative environment. · Self … live of the solution · Continuously improve with Burberry governance frameworks, including logging and tracking tech-debt and actions and improve pain points around observability and maintenance of data architecture ...

Head of Brand & Content: Observability Thought Leader

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ITRS Insights is seeking a Head of Brand & Content to lead brand evolution and establish thought leadership in the observability space. This role involves creating a cohesive brand narrative, managing content strategy, and positioning ITRS executives as industry authorities. The ideal candidate will have over 10 years of B2B marketing ...

Dynatrace Consultant

Hiring Organisation
McGregor Boyall
Location
London, South East, England, United Kingdom
Employment Type
Contractor
Contract Rate
£650 - £800 per day
Group's business-critical applications. We are seeking a skilled Dynatrace Admin/Consultant to play a key role in the enablement of observability across complex, hybrid cloud environments. The ideal candidate will have deep expertise in Dynatrace implementation (SaaS and On-Premises), monitoring configuration, and AI-driven insights … identify opportunities for enhancement to monitoring configuration and capabilities across critical applications. Participate in the review of roles and responsibilities between teams for observability and make recommendations for improvement of the standards with an emphasis on Operational Resilience. Play a key part in providing an automatically maintained ...