51 to 75 of 92 Observability Jobs in the City of London

Machine Learning Engineer

Hiring Organisation
Block MB
Location
City of London, London, United Kingdom
high-throughput, low-latency workloads Improve performance and reliability across GPU-based environments Design and implement model serving and deployment workflows Develop monitoring and observability tools to track system performance, errors and utilisation Support data preparation and model integration as part of the wider development lifecycle Collaborate with research, engineering ...

Full Stack Engineer

Hiring Organisation
Prism Digital
Location
City of London, London, United Kingdom
prompt engineering Data ingestion pipelines including APIs, RSS feeds, and web scraping Vector databases and modern AI data pipelines Cloud infrastructure and SaaS deployment Observability, testing frameworks, and CI pipelines Proven ability shipping real products such as SaaS platforms, production data pipelines, open-source tools with users, or meaningful side ...

Observability SRE / Platform Engineer – Elite Quant Hedge Fund

Hiring Organisation
Winston Fox
Location
City of London, London, United Kingdom
Observability SRE/Platform Engineer/Production Engineer sought to join an elite Multi-Strategy Quant Hedge Fund with $40BN+ in Assets Under Management. Our client is one of the world’s top Hedge Funds who are especially renowned for innovation and investment in Data & Technology. They are now looking … recruit a Low Latency Trading & Observability SRE/Platform Engineer/Production Engineer into their core Production Engineering team, who are accountable for the reliability, operability and performance of the firm’s trading-critical systems, in an environment where availability, correctness and latency directly impact outcomes. The successful hire will ...

Lead AI Automation Engineer

Hiring Organisation
PaymentGenes
Location
City of London, London, United Kingdom
across email, Teams, APIs, and databases Architect Agent Systems Modular task decomposition Guardrails and validation layers Structured output validation (JSON schemas, rule checks) Logging, observability, fallback logic, and confidence scoring Secure credential and secrets management Governance & Scale Ensure explainability and traceability Design compliant automation aligned to enterprise standards Identify high ...

Head of Engineering

Hiring Organisation
Lightdash
Location
City of London, London, United Kingdom
push velocity through AI-assisted workflows. DevX focus: You obsess over developer experience because you know great tooling = more speed. Execution focus: Monitoring, observability, and performance aren’t “nice to haves”, they’re how you ship fast without breaking trust. What you'll do Set the bar for engineering velocity ...

Automation Engineer

Hiring Organisation
Hlx Life Sciences
Location
City of London, London, United Kingdom
instrument SDKs. Background in building simulation environments to de-risk hardware integrations before deployment. Systems-minded: you think about reliability, failure modes, and observability from day one, not as an afterthought. Desirable Experience with fabrication and prototyping, including 3D printing custom parts to adapt or extend lab equipment. Familiarity with ...

Advisory Engineer

Hiring Organisation
WorkGenius Group
Location
City of London, London, United Kingdom
fallback chains, and safety constraints for agent systems. Own incident response processes and post-mortem frameworks. Drive production-readiness standards and operational excellence. Build Observability Infrastructure Architect tracing, logging, and monitoring systems for agent behavior at scale. Develop dashboards, metrics, and alerting pipelines. Enable continuous performance evaluation and optimization. Mentor ...

Javascript Developer

Hiring Organisation
Better Placed Ltd - A Sunday Times Top 10 Employer!
Location
City of London, London, United Kingdom
required Nice to have (not essential) React Experience building Shopify apps or working with Shopify APIs Event-driven architectures, serverless patterns, or modern observability tooling Package £60,000 to £80,000 depending on experience Fully remote role (UK-based) Please apply for an immediate interview ...

Network Engineer

Hiring Organisation
Ncounter Technology Recruitment
Location
City of London, London, United Kingdom
essential, alongside confidence working with modern data centre technologies. Nice to Haves: • Experience with automation using Python, Ansible, or similar tools • Exposure to observability and monitoring platforms • Understanding of network security and secure routing design • Hands-on experience with Arista and or Cisco in production environments • Industry certifications such ...

Artificial Intelligence Specialist

Hiring Organisation
HCLTech
Location
City of London, London, United Kingdom
secret management, and auditing Collaborate with risk/compliance to drive: model governance, content safety, bias/quality monitoring, and regulatory alignment 6) Evaluation, Observability & Continuous Improvement Create evaluation frameworks: offline evals (golden sets), automated regression, and scenario-based testing Instrument systems for observability: traces, prompt/versioning, retrieval diagnostics … practices: evaluation, content moderation, privacy, and compliance-by-design Architecture & Systems Skills (Must Have) Distributed system design: scalability, fault tolerance, caching, performance tuning Observability: logging/metrics/tracing, prompt/version tracking, monitoring SLIs/SLOs Cost management and performance optimization: model selection/routing, token reduction, caching, batching ...

Senior Full-Stack Engineer (Ref: 195894)

Hiring Organisation
Forsyth Barnes
Location
City of London, London, United Kingdom
trading platform Taking end-to-end ownership of features from design through to production Collaborating closely with platform and DevOps engineers on build pipelines, observability, and operational improvements Communicating directly with stakeholders to clarify requirements and propose technical solutions Contributing to and improving automated testing practices Participating in peer code … Tech Stack You’ll primarily be working with: TypeScript (Node.js & React) Monorepo tooling, GitHub, GitHub Actions Jest, Playwright Redis, MS SQL, WebSockets Docker, Kubernetes Observability tooling (Grafana, Prometheus, SonarQube) Requirements 8+ years of professional software development experience 3+ years hands-on TypeScript experience, including Node.js and React Strong experience building ...

Engineering Manager

Hiring Organisation
La Fosse
Location
City of London, London, United Kingdom
understanding of CI/CD pipelines, including build automation, testing, and deployment Familiarity with modern engineering practices: automated testing, infrastructure as code, monitoring, and observability Technology Stack Backend development across modern JVM frameworks including Spring, Spring Boot, and Micronaut, primarily using Java Cloud-native services deployed on Azure, with orchestration … Kubernetes and system monitoring/observability using tools such as Dynatrace Data persistence and storage using a mix of relational and NoSQL technologies, including SQL Server and MongoDB Frontend applications built with contemporary JavaScript frameworks and languages such as React, Next.js, Angular, and TypeScript In-memory data grids and caching ...

Platform Engineering Manager

Hiring Organisation
Prism Digital
Location
City of London, London, United Kingdom
cloud environments Architecture governance and design authority Security-by-design and Zero Trust Terraform or Bicep (production IaC) CI/CD and infrastructure automation Observability (SLOs, monitoring, incident management) Disaster recovery and resilience planning Vendor and third-party management Strong stakeholder communication What You’ll Work With Azure (landing zones … shared services) Terraform/Bicep CI/CD pipelines Kubernetes (AKS) Observability tooling (logs, metrics, tracing) Networking (VNets, ExpressRoute, private endpoints) Security controls and compliance frameworks Event Hubs, Service Bus, API Management Hybrid Windows/Linux infrastructure Nice to Haves FinOps (cost control, budgeting, optimisation) Financial services or regulated environments ...

Senior Lead Engineer

Hiring Organisation
Investigo
Location
City of London, London, United Kingdom
change management tools like Liquibase into automated pipelines Apply DevSecOps best practices across the lifecycle: static analysis, dependency scanning, and secure credential management Ensure observability, monitoring, and performance using GCP Operations Suite or New Relic Mentor engineers and collaborate across global, distributed teams What We’re Looking For Proven experience … expertise: BigQuery, Dataproc, Cloud Composer Deep data architecture and engineering knowledge: Spark, DBT, Oracle, BigQuery Experience designing scalable architectures (Microservices, Monoliths, Batch) Skilled in observability, monitoring, and DevSecOps integration Excellent communication with a record of collaborating globally Why You’ll Love It Combine architecture, coding, and leadership in one role ...

Senior / Lead Data Engineer (AI-Focused)

Hiring Organisation
PaymentGenes
Location
City of London, London, United Kingdom
inference (batch and real-time) Evaluate and integrate emerging AI tooling where strategically valuable 🔧 Technical Leadership Set best practices for testing, documentation, lineage, and observability Lead code reviews and mentor data & analytics engineers Drive CI/CD and infrastructure-as-code adoption Own platform reliability, performance optimisation, and cost efficiency … Infrastructure Feature engineering architecture ML pipeline and deployment workflows Experience supporting production ML systems Familiarity with embeddings, vector databases, LLM orchestration (desirable) Data observability and model monitoring Platform & DevOps CI/CD for data workflows Git-based engineering standards Docker/containerisation Infrastructure-as-code (e.g., Terraform) Monitoring and alerting ...

Senior Backend Engineer

Hiring Organisation
Xapien
Location
City of London, London, United Kingdom
MongoDB and ElasticSearch running on Kubernetes in GCP. You'll work with modern patterns including event-driven architectures, gRPC and REST APIs, and comprehensive observability with GrafanaCloud. Depending on the team, you might focus on our Investigations domain (web scraping, data provider integrations, LLM-powered analysis) or our Core … LLMs and AI integration (particularly for Investigations team) Background in SaaS platforms or B2B products Experience in fintech, compliance, or regulated industries Familiarity with observability tools (Grafana, Prometheus, etc.) Understanding of authorisation patterns and security best practices Our Tech Stack Languages: Go Databases: MongoDB Messaging & Orchestration: Temporal, Kafka APIs: REST ...

Software Engineer

Hiring Organisation
Hydrogen Group
Location
City of London, London, United Kingdom
Code Scaling and managing large fleets of IoT devices in the field Developing CI/CD pipelines and automation across the stack Implementing observability, monitoring, and telemetry (cloud + edge) Supporting security and compliance standards (e.g. SOC2, HIPAA) Improving developer workflows and engineering productivity What We Are Looking For 5+ … Docker & Kubernetes (EKS preferred) Proficiency in Python, Go, or another modern language Experience building CI/CD pipelines and automation Hands-on experience with observability tools (Grafana, Prometheus) Nice to Have Experience with IoT/edge infrastructure (device provisioning, OTA updates) Hybrid or multi-cloud environments SOC2 compliance exposure High ...

Back End Developer

Hiring Organisation
NearTech Search
Location
City of London, London, United Kingdom
backend initiatives end-to-end, from architecture to rollout • Strengthen testing strategy across unit and integration layers • Improve data and integration workflows with observability and resilience • Optimise Postgres (RDS) and MongoDB performance, modelling and migrations The role requires... • Strong commercial experience with Node.js and TypeScript • Deep API design expertise, including ...

Partner Manager

Hiring Organisation
Timebeat
Location
City of London, London, United Kingdom
written communication and CRM discipline Nice to have Experience with channel models (reseller, referral, MSP, SI), co-sell motions, or marketplace partnerships Familiarity with observability/monitoring, networking, infrastructure tooling, or developer-facing products Experience building partner programs from scratch (tiering, enablement, certification, MDF) Success metrics (examples) Number ...

AWS Site Reliability Engineer (Data Platform)

Hiring Organisation
FBI &TMT
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£450 - £455 per day
cloud-native data platform built on AWS, Snowflake, and Databricks. This role focuses on enhancing reliability through automation, disaster recovery testing, resiliency engineering, observability, and proactive SLO/SLI/SLA management. Key Responsibilities: Design, build, and maintain automation for infrastructure provisioning, platform operations, and incident response using … manage SLIs, SLOs, and SLAs for critical data pipelines and platform services; utilise error budgets to guide reliability improvements. Build and operate robust observability solutions (metrics, logs, traces, alerts) for AWS services, Snowflake, and Databricks workloads. Partner with data engineering and platform teams to embed reliability-by-design into architecture ...

AI Engineer – Production LLM Systems

Hiring Organisation
Redimeer
Location
City of London, London, United Kingdom
orchestration. You will work on: Multi‐agent architectures Intelligent tool and API integrations RAG pipelines and vector‐based retrieval Evaluation frameworks and AI observability Production workflows that ensure reliability, consistency, and scale You’ll play a critical role in crafting the orchestration layer that makes LLM systems trustworthy—handling … improving robustness across diverse use cases. Key Responsibilities Build production AI systems using LLMs, RAG pipelines, vector databases, and agentic frameworks Design evaluation and observability frameworks to measure performance, accuracy, and reliability Develop clean, scalable applications with proper error handling, APIs, and data pipelines Implement and maintain retrieval systems (vector ...

GenAI Architect

Hiring Organisation
HCLTech
Location
City of London, London, United Kingdom
pipelines, including chunking, embedding, vector search (e.g., Azure Cognitive Search, pgvector), and reranking. Ensure designs include standards for lineage, observability, and evaluation (e.g., RAGAS). Cross-Cloud & Vendor Integration: Create and maintain decision frameworks for platform selection (e.g., Copilot Studio for Teams integration, Vertex AI for GCP workloads). Advise … clients on balancing vendor lock-in risks with integration benefits. GenAIOps & Observability: Define the architectural standards for GenAIOps, including CI/CD, IaC, and observability. Establish standard metrics to track agent decision traces, latency, token consumption, hallucination, and cost. Safety, Security & Governance: Architect enterprise-wide guardrails for safety (hallucination mitigation ...

Senior DevOps Engineer

Hiring Organisation
REVYBE IT RECRUITMENT LIMITED
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
Senior DevOps Engineer - FinTech £100,000 + Bonus (£15k+) Central London - Hybrid working, 2/3 days per week in the office We're working with a highly successful FinTech business in Central London who are looking to hire a Senior DevOps Engineer to help shape the future of their infrastructure and platform strategy. This is a high-impact role within a growing engineering team where you'll have the opportunity to influence architectural decisions, mentor engineers, and remain deeply hands-on with modern infrastructure tooling. The company builds all its software in-house and has been investing heavily in its platform, observability, and cloud capabilities as they continue to scale. The Opportunity: You'll join as the Senior DevOps Engineer, working closely with engineering leadership to drive improvements across infrastructure, reliability, and developer experience. This role sits at the intersection of hands-on engineering, mentoring, and strategy. You'll guide platform direction while continuing to build and improve the infrastructure that powers the business. You'll also mentor one platform engineer, helping them grow while ensuring the team continues delivering high-quality infrastructure and automation. Environment: The platform currently operates in a hybrid environment: ~60% on-premise infrastructure ~40% Microsoft Azure The long-term strategy is focused on modernising the platform, improving observability, and evolving cloud capabilities, making this an excellent opportunity for someone who enjoys building and shaping systems.
Tech Stack … premise + Azure) CI/CD & Automation GitHub Actions Python Azure Services Azure Kubernetes Service (AKS) Azure Virtual Machines Azure Virtual Networks Azure Load Balancer Azure Application Gateway Azure Storage Accounts Azure Blob Storage Azure Key Vault Azure Monitor Azure Log Analytics Azure Active Directory Azure Container Registry Azure DNS Azure DevOps integrations Observability Logging, monitoring, and tracing across distributed systems Building meaningful telemetry and platform visibility What you'll be doing: Leading the evolution of the company's platform and infrastructure strategy Designing and improving hybrid Azure + on-premise environments Driving Kubernetes platform improvements Building automation with Terraform and Python Improving observability and monitoring across systems Mentoring a Platform Engineer and helping shape platform best practices Working closely with engineering teams to improve developer experience and reliability Why this role is exciting: Huge impact on the future platform architecture Opportunity to shape the hybrid cloud strategy Combination of technical leadership and hands-on engineering Modern DevOps tooling and cloud technologies Direct influence on platform reliability and scalability Package: Salary: Up to ...

Head of Quality Engineering

Hiring Organisation
Infoplus Technologies UK Ltd
Location
City of London, London, United Kingdom
Employment Type
Contract
Contract Rate
From £650 to £700 per day
Karate, Postman) o Performance (JMeter, LoadRunner) o Cloud native testing (Azure, AWS, GCP) o Test orchestration, CI/CD pipelines o TDM, environment strategy, observability tools Skills: OpenText Microfocus ALM QC Job Description: Quality Engineering Transformation Leader The Quality Engineering (QE) Technical & Transformational Leader is a pivotal senior responsible … data modernization, environment optimization, and productivity uplift. Partner with domain leaders to introduce modern testing practices (in sprint automation, service virtualization, early performance testing, observability, etc.). Technical Leadership & Engineering Excellence Strong Executive Communication & Stakeholder Management Data Driven Delivery, Productivity & Value Tracking Leadership of Large, Multi Disciplinary QE Teams ...

Python Software Engineer

Hiring Organisation
Arcus Search
Location
City of London, London, United Kingdom
Python Software Engineer - Observability Engineering - Quant Research (You must have the right to work in the UK to be considered for this position) London (Hybrid) | c.£150k base + bonus • I’m working with a top-tier quant trading firm building one of the most advanced automated research and trading … platforms globally. • They’re looking for a Python Software Engineer to join their Observability Engineering team, building the telemetry and monitoring infrastructure that underpins hundreds of services across the organisation. The Role • Build and maintain systems that handle cloud-scale volumes of telemetry data • Extend and maintain OpenTelemetry collectors, SDKs ...