51 to 75 of 82 Observability Jobs in the City of London

Scala Developer (Remote)

Hiring Organisation
Stealth iT Consulting
Location
City of London, London, United Kingdom
Agile environment (Scrum/Kanban). Participate in code reviews, architecture discussions and pair programming. Troubleshoot and resolve production issues; contribute to reliability and observability (logging, metrics, alerts). Help define CI/CD pipelines and deployment processes (e.g., Jenkins/GitHub Actions/Concourse). Produce concise technical documentation ...

Machine Learning Engineer MLOps Python LLM AWS

Hiring Organisation
Client Server
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£90,000
office once a month for team collaboration and meetings. About you: You are an experienced Machine Learning Engineer who prioritises system reliability, latency and observability as much as model accuracy. You have strong Python skills and experience of leveraging the AWS stack to move models from development environments to robust ...

Head of Engineering

Hiring Organisation
Lightdash
Location
City of London, London, United Kingdom
push velocity through AI-assisted workflows. DevX focus: You obsess over developer experience because you know great tooling = more speed. Execution focus: Monitoring, observability, and performance aren’t “nice to haves”, they’re how you ship fast without breaking trust. What you'll do Set the bar for engineering velocity ...

Software Engineer

Hiring Organisation
Acceler8 Talent
Location
City of London, London, United Kingdom
training efficiency across 1,000+ GPU clusters Improve utilisation, throughput, and reliability across distributed training infrastructure Build tooling for orchestration, monitoring, scheduling, and observability Work closely with research teams to accelerate large-scale model training 🔧 What They’re Looking For Deep GPU infrastructure/distributed systems experience Strong knowledge ...

Java Consultant

Hiring Organisation
Stanford Black Limited
Location
City of London, London, United Kingdom
with portfolio managers and traders to deliver real-time, business-critical technology 🔷 Architect event-driven, distributed systems with strong focus on performance, resilience, and observability 🔷 Drive technical direction across microservices, data streaming, and system design in a fast-moving environment The Role: Join a high-calibre Investment Engineering team embedded ...

Head of Data & AI Platforms & Engineer

Hiring Organisation
SANS Consulting Services, Inc
Location
City of London, London, United Kingdom
management, and access controls. • Ensure platforms are designed for performance, scalability, reliability, and security. Data Governance, Quality & Compliance • Implement frameworks for data quality, lineage, observability, and metadata management. • Ensure compliance with security, privacy, and regulatory requirements across all data platforms. • Oversee remediation of data quality issues within BAU operations. Change ...

Artificial Intelligence Engineer

Hiring Organisation
Prism Digital
Location
City of London, London, United Kingdom
Next.js Nice to Haves Voice and telephony systems (Twilio, SIP, WebRTC) STT/TTS pipelines (Deepgram, ElevenLabs or similar) Messaging integrations (WhatsApp Business API) Observability tooling (Datadog, Sentry, LangSmith) Previous founding engineer experience Being a founding engineer or an ex-founder from any company backed by Antler or Entrepreneurs First ...

DataOps Architect – Commodities / Financial Services – Up to £140k – Hybrid, London

Hiring Organisation
VirtueTech Recruitment Group
Location
City of London, London, United Kingdom
ensuring controlled promotion across environments, while embedding best practices across software engineering, testing, and release management. The role will also involve defining data quality, observability, and reliability frameworks aligned to modern architectures such as Medallion (Bronze, Silver, Gold), as well as building reusable tooling, templates, and standards to drive consistency …/or Azure) in production Strong programming skills in Python and SQL Experience with Infrastructure as Code (e.g. Terraform) Deep understanding of data reliability, observability, and data quality frameworks Strong grounding in software engineering best practices and SDLC Ability to operate both hands-on and at a strategic level Experience ...

Senior DevOps Engineer (Azure / Terraform)

Hiring Organisation
INTEC SELECT LIMITED
Location
City of London, London, England, United Kingdom
Employment Type
Contractor
Contract Rate
£600 - £650 per day
Azure and Terraform expertise, who is comfortable operating in a hands-on capacity, while also mentoring others and driving improvements across CI/CD, observability, and security. Role & Responsibilities Design, build, and manage Azure infrastructure using Terraform, including modules, state management, and pipelines Develop and maintain CI/CD workflows … GitHub Actions, Azure DevOps or similar) Improve platform reliability, observability, and security across environments Take ownership of infrastructure and deployment processes within a fast-moving delivery team Collaborate closely with engineers to embed DevOps best practices and scalable patterns Mentor team members on infrastructure, automation, and platform engineering principles Identify ...

Site Reliability Engineer

Hiring Organisation
Arrows
Location
City of London, London, United Kingdom
/CircleCI) 🔄 Operate and optimise Kubernetes environments (EKS primarily, GKE exposure a bonus) ☸️ Build and manage Infrastructure as Code using Terraform 🏗️ Champion reliability engineering: observability 👀, incident response 🚨, performance & cost optimisation 💡, and security best practices 🔐 Drive automation across environments and collaborate with cross-functional teams 🤝 ✅ What You’ll Bring Strong hands … pipelines end-to-end 🚀 A senior, self-sufficient communicator who can mentor and work across multiple teams 💬 ⭐ Nice to Have Experience with service mesh & observability tools (Istio, Prometheus, Grafana, Datadog) 📊 Policy as code exposure 📜 Scripting skills (Bash/Python/Go) 💻 Experience with GKE or multi-cloud environments 🌍 👉 Interested ...

DevOps Manager

Hiring Organisation
Harvey Nash
Location
City of London, London, United Kingdom
optimisation Drive CI/CD strategy using GitHub and modern DevOps tooling Champion Infrastructure as Code using Terraform/ARM Implement and maintain observability and monitoring solutions Partner closely with security teams to meet regulatory and cyber‐security standards Manage third‐party vendors and ensure service delivery standards Mentor engineers … background in DevOps and Azure cloud operations Proven experience leading engineering teams CI/CD, Git, GitHub pipelines Infrastructure as Code (Terraform, ARM, Ansible) Observability tools such as Prometheus and Grafana Containers and orchestration (Docker, Kubernetes) Scripting (PowerShell) Experience in regulated environments such as banking, trading, financial services or similar ...

Back End Developer

Hiring Organisation
NearTech Search
Location
City of London, London, United Kingdom
backend initiatives end-to-end, from architecture to rollout • Strengthen testing strategy across unit and integration layers • Improve data and integration workflows with observability and resilience • Optimise Postgres (RDS) and MongoDB performance, modelling and migrations The role requires... • Strong commercial experience with Node.js and TypeScript • Deep API design expertise, including ...

Principal Product Manager

Hiring Organisation
ZEREN
Location
City of London, London, United Kingdom
serious scale. What you'll be working on: • 0-1 build of the strategy and roadmap for a GenAI platform spanning infrastructure, tooling, and observability • Designing platform capabilities that make experimentation and deployment of features frictionless - measured through DORA and Core4 metrics • Partnering with security, compliance, and data governance teams ...

Senior Frontend Developer

Hiring Organisation
SEEKR
Location
City of London, London, United Kingdom
bridges so builders can wire their products into hundreds of third‐party tools without hand‐rolling every integration. It handles managed auth, real‐time observability and connector sprawl so product teams can focus on great agent experiences instead of glue code. Your job is to make the surface they ...

Engineering Manager (.NET) - Contract

Hiring Organisation
La Fosse
Location
City of London, London, United Kingdom
resource/capacity management and delivery ownership. - Experience writing executive updates and technical summaries for senior stakeholders. - Strong knowledge of CI/CD, automation, observability, and DevOps maturity models. - Evidence of driving adoption of new tools, frameworks, or processes across multiple teams. Technical Skills & Tools - Languages & Frameworks: C#/.NET … Framework and Core), React - Platforms & Infrastructure: Azure, AKS, Docker, on-prem Windows Server, SQL Server. - IAM and App Gateways: Okta, APIM, Apigee - Monitoring & Observability: Dynatrace, Application Insights - CI/CD & DevOps: Azure DevOps pipelines, SonarCloud, Github - Architecture & Patterns: Microservices, event-driven architecture, domain-driven design, modern scalable design principles ...

Platform Engineer

Hiring Organisation
Albert Bow
Location
City of London, London, United Kingdom
preparation, turning compliance into a competitive advantage Build and maintain robust CI/CD pipelines across backend, frontend, and data services Establish company-wide observability — logging, metrics, tracing, alerting, and on-call culture Take ownership of cloud cost management, optimising spend without compromising performance Champion operational excellence across the engineering … What You'll Bring Technical Cloud & IaC: Azure (AWS a bonus), Terraform, AKS/Kubernetes, Docker, GitHub Actions Observability: Hands-on experience with logging, metrics, and distributed tracing frameworks Security: Secrets management, security scanning, and infrastructure hardening best practices Networking: VPCs, DNS, load balancers, VPNs, firewalls — you know your ...

Full Stack Engineer

Hiring Organisation
develop
Location
City of London, London, United Kingdom
customer-facing web platform built on a Next.js stack. The organisation is investing heavily in platform quality, developer experience, CI/CD, testing, and observability to support long-term scalability. You will contribute to strengthening the platform that enables multiple product squads to deliver features reliably and release with confidence. … capabilities, and supporting reliable releases. You will collaborate closely with senior engineers while taking ownership of well-defined areas, helping improve testing, CI pipelines, observability, and overall developer workflows. The role suits an engineer who enjoys solving practical platform problems, building scalable web applications, and continuously improving how teams deliver ...

Senior Site Reliability Engineer

Hiring Organisation
Realm
Location
City of London, London, United Kingdom
High-growth infrastructure company focused on delivering large-scale compute, data centre capacity, and power solutions for advanced machine learning workloads. Platforms support leading research and industry teams requiring high-performance computing at significant scale. ...

SRE Observability Engineer

Hiring Organisation
Access Computer Consulting
Location
City of London, London, United Kingdom
Employment Type
Contract
Contract Rate
£350 - £450/day
recruiting for an SRE Observability Engineer to work in London 2-3 days a week, remaining time remote. The role falls inside IR35 so you will be required to work through an umbrella company for the duration of the contract. This is a 6 month contract which will transfer … permanent role after the initial contract term. You will be responsible for collaborating across various organisations within the client to understand and develop observability solutions for enterprise-wide deployment at scale. You will also manage the legacy monitoring stack across the Production Management organisation within the client. You must have ...

DevOps Engineer

Hiring Organisation
Autonomai Recruitment
Location
City of London, London, United Kingdom
performance and resilience. Build and extend network automation workflows to configure and manage trading infrastructure (routers, switches, security, and connectivity). Define and implement observability for services and infrastructure using metrics, logging, and alerting (e.g., Prometheus, Grafana, and related tooling). Key requirements Strong backend development experience with Python, including … experience building APIs (e.g., FastAPI or similar frameworks). Experience with Prometheus‐style observability: metrics, alerting, and dashboards; familiarity with Grafana is a plus. Hands‐on experience with ClickHouse or similar high‐performance data stores is a strong advantage. Practical experience with network automation; Ansible or similar configuration‐management tools ...

Data Reliability Engineer

Hiring Organisation
Ashdown Group
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You'll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. You'll take ownership … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands-on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

DevOps Engineer

Hiring Organisation
Few&Far
Location
City of London, London, United Kingdom
hands-on DevOps/Infrastructure Engineer who thrives in early-stage environments and loves building from the ground up. You’ll own reliability, observability, incident response, and infrastructure automation across a modern AI-native platform. 🔥 Tech stack includes: *GCP *Terraform *Cloud Run/Kubernetes *GitHub Actions *Python & Kotlin *Temporal … particularly keen to speak with engineers who have: ✅ Strong Terraform & production infrastructure experience ✅ Deep observability & monitoring expertise ✅ Incident management/on-call experience ✅ Security-first mindset ✅ CI/CD pipeline expertise ✅ Startup or greenfield experience This is a brilliant opportunity to shape what “good” looks like in a fast-moving ...

Senior Software Engineer – AI / Agentic Systems

Hiring Organisation
MA (Montreal Associates)
Location
City of London, London, United Kingdom
grade AI platform. You’ll operate at the core of the product engineering function—designing systems that power autonomous agents, orchestrate workflows, and enable observability at scale. This is not just another backend role. You’ll influence architecture, mentor engineers, and help define the technical direction of a rapidly growing … Lead design and code reviews, ensuring high standards of quality and security Collaborate closely with AI research, product, and infrastructure teams Improve system reliability, observability, and scalability Mentor engineers and act as a technical multiplier across teams Champion best practices, tooling, and engineering excellence Proactively identify and resolve technical debt ...

Platform Engineer: £120k + Bonus/benefits (AI Trading)

Hiring Organisation
Hunter Bond
Location
City of London, London, United Kingdom
global trading platform. The successful candidate will be involved in every layer of the technology stack—from hardware and operating systems to automation and observability—while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities Manage a distributed compute environment and several petabyte-scale … agile methodologies) Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible) Exposure to distributed storage systems and related protocols Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana) Strong written and verbal communication skills Demonstrated ability to learn quickly and adapt to evolving technologies ...

Lead Software Engineer

Hiring Organisation
5V Video
Location
City of London, London, United Kingdom
+ AWS (Lambda, API Gateway, S3, DynamoDB) Handling event-driven architectures (Kafka, SNS/SQS, etc.) Driving system design decisions across distributed systems Improving observability, reliability, and performance in production Debugging complex issues and leading resolution across teams Staying hands-on while setting technical direction and standards Tech Stack Python … Lambda, API Gateway, S3, DynamoDB, IAM) Event-driven systems (Kafka, SNS/SQS) CI/CD (Concourse, Git workflows) Databases (Postgres, DynamoDB, Couchbase) Observability (Prometheus, Grafana, CloudWatch) What You’ll Bring Strong backend engineering experience (Python preferred) Proven experience building distributed systems at scale Deep understanding of microservices + event ...