Observability Jobs in the City of London

176 to 200 of 293 Observability Jobs in the City of London

Senior Software Engineer, Frontend

City of London, London, United Kingdom
Hybrid / WFH Options
Travelex
to have: Experience with Headless CMS (we use Sanity) Exposure to backend or full-stack development (Node.js, Express, etc.) Developing white-label applications Building internationalised applications Familiarity with frontend observability best practices (we use Datadog) Experience in agile software development methodologies Why Travelex? To remain the world’s leading foreign exchange specialist, we are focused on making our customers’ lives …

Senior AI Engineer (LLMs, Agents, Python)

City of London, London, United Kingdom
Evergrowth
and providers. Integrate and operate multiple LLM providers (OpenAI primarily, also Anthropic and Google today, evaluate Groq/Cerebras and others for cost/latency/perf). Build observability for LLM calls: tracing, logging, cost accounting, and experiment tracking Make architecture choices with the CTO: orchestration patterns, state machines vs. DAGs, caching/memory, vector stores, background jobs. Continuously …

Senior Platform Engineer

City of London, London, United Kingdom
Harrington Starr
Experience with Docker containerisation for software deployment and scalability. CI/CD Expertise: Automate software build, test, and deployment pipelines following agile methodologies. Terraform Exposure: Beneficial experience with Terraform. Observability Tools: Experience with Grafana and Splunk is beneficial, particularly in developing and applying an observability strategy across a large organisation. Learn More For more information, contact George Harris at Harrington …

Cloud Platform Engineer

City of London, London, United Kingdom
Harrington Starr
architectures, focusing on Kubernetes. Support critical engineering platforms: multi-tenant Airflow, BigQuery, and PostgreSQL clusters. Enhance developer experience via remote dev environments, self-hosted CI/CD pipelines, and observability tools. What They're Looking For: Hands-on experience with a major cloud provider (preferably GCP), Kubernetes, and Infrastructure as Code (Terraform). Strong Python skills for automation and operational … velocity, and leveraging AI tools for problem-solving. 2–4 years in Cloud, Platform, or DevOps roles, with excellent communication skills. Familiarity with CI/CD and data/observability platforms is a plus. Minimum 2:1 STEM degree or equivalent professional experience. Tech Stack: Cloud: GCP Orchestration: Kubernetes IaC: Terraform/Terramate Dev Environments: Remote development tools CI/… CD: Self-hosted pipelines, Argo Workflows Data Platforms: Airflow, BigQuery, PostgreSQL Observability: Grafana, Prometheus Deployments: Helm For more information, contact Maria Ciprini at Harrington Starr, or click "Apply" to start your application.

Senior Python Engineer | AI Start Up

City of London, London, United Kingdom
Oho Group Ltd
support AI-driven features Work closely with ML and product teams to deploy AI models into production Optimize systems for reliability, scalability, and performance Implement robust monitoring, logging, and observability practices Contribute to architectural decisions and guide best practices across the engineering team 🧰 Tech Stack Core: Python (FastAPI, Django, or Flask) Infrastructure: AWS/GCP, Docker, Kubernetes Data: Postgres, Redis … integrating AI models into real-world products Knowledge of LLM frameworks or retrieval-augmented generation (RAG) systems Exposure to event-driven architectures or real-time data streaming Familiarity with observability tools (Prometheus, Grafana, OpenTelemetry) 🚀 Why Join You’ll be part of a small, high-impact team tackling ambitious technical challenges at the intersection of AI and scalable infrastructure. We …

Agile Business Analyst - Market Data Monitoring & Alerting

City of London, Greater London, UK
Hays
experiences. Proven experience as a Business Analyst in an Agile environment Strong knowledge of market data and market data supervision Financial Services experience is mandatory Strong understanding of monitoring, observability, and telemetry (metrics, logs, traces) Ability to translate technical concepts into actionable business requirements Hands-on experience with tools such as Datadog, BigPanda, Grafana would be desirable Excellent stakeholder management …

Back End Developer

City of London, London, United Kingdom
Hybrid / WFH Options
InfoSec People Ltd
develop and secure RESTful APIs and serverless microservices (AWS). Optimise data access across MySQL/RDS and NoSQL (DynamoDB); write efficient queries. Automate deployments (CI/CD), add observability, and ship with high quality. Collaborate on architecture, testing and documentation; contribute across the full lifecycle. What you’ll bring 3+ years’ commercial Node.js engineering. Hands-on AWS Serverless (Lambda …

Software Engineer

City of London, London, United Kingdom
Zenith Search
UIs: time series views, blotters, RFQ/axe views, portfolio & risk dashboards. End-to-end systems: you design it, build it, productionise it, support it. CI/CD, testing, observability, the lot. What you're gaining: Direct line to PMs/traders/researchers. Zero “just keep the lights on” work: everything is about edge, speed, and better decisions. Top …

Solutions Architect – Payment Platforms & POS Integration

City of London, London, United Kingdom
Hybrid / WFH Options
YQN Pay
and post-launch operations. Develop and maintain reference architectures, documentation, and governance processes for ongoing platform enhancements. Guide adoption of modern infrastructure approaches, including cloud-native deployments, microservices, and observability frameworks. Contribute directly to business growth through hands-on architecture while mentoring junior engineers as the team scales. Align technology designs with compliance, regulatory, and security requirements (e.g., PCI DSS …

Staff Software Engineer

City of London, London, United Kingdom
Hybrid / WFH Options
Albany Growth
and system architecture across multiple teams. Tackle complex engineering challenges end-to-end, delivering the most critical components yourself. Champion engineering best practices, from design and implementation to testing, observability, and scaling. Collaborate with product, data, and business teams to align priorities and ensure technical solutions meet real business goals. Mentor and coach engineers and technical leads, helping them grow …

Founding AI Engineer

City of London, London, United Kingdom
Higher - AI recruitment
that create real-world value Integrate AI models into operational workflows Ensure reliability through fail-safes, self-healing, and fallback mechanisms Monitor & improve AI performance with feedback loops and observability tools Collaborate with Data Engineers to ensure AI has accurate, real-time data Implement human-in-the-loop systems where needed Skills and Qualifications Exceptional Python skills and experience with …

Cloud Engineer DV Cleared

City of London, London, United Kingdom
Damia Group
automation, and container orchestration. You will be instrumental in shaping enterprise-ready cloud solutions by applying deep technical expertise in AWS alongside knowledge of multi-cloud environments, identity management, observability, and cost optimisation. Key Responsibilities Design and implement secure, scalable AWS cloud architectures Drive Infrastructure as Code (IaC) adoption using Terraform and CloudFormation Build, optimise, and automate CI/CD … GitHub Actions, and related tools Deploy and manage containerised solutions with Docker, Kubernetes, and Helm Implement strong security and access controls using IAM, Vault, and Secrets Manager Enhance platform observability using Prometheus, Grafana, and ELK Stack Collaborate with cross-functional teams to deliver robust, high-availability solutions Key Skills & Experience Extensive hands-on experience with AWS (Azure knowledge beneficial) Expertise … in Terraform, CloudFormation, and automation tooling Strong containerisation skills with Kubernetes, Docker, and related platforms Proven background in cloud security, IAM, and governance Solid understanding of monitoring and observability stacks Ability to influence architecture decisions and align solutions to best practices Desired Certifications AWS Certified Solutions Architect – Associate/Professional AWS Certified Security – Specialty HashiCorp Certified: Terraform Associate Kubernetes Certified …

Senior Software Engineer

City of London, Greater London, UK
Creo Recruitment
hosted on AWS. Architect and optimise systems: Define service boundaries, data ownership, and failure-recovery patterns for scalable, high-availability systems. Raise engineering quality: Champion best practices for testing, observability, and security. Review critical PRs and guide technical decisions across the team. Operate and improve production systems: Monitor performance, reliability, and cost efficiency. Lead incident response and drive continuous improvement. … Django) Cloud: AWS (Lambda, ECS/Fargate, S3, DynamoDB, CloudWatch, API Gateway) Data & Messaging: PostgreSQL, Redis, Kafka or SQS CI/CD & Infrastructure: Docker, Terraform, GitHub Actions, CloudFormation Monitoring & Observability: Prometheus, Grafana, OpenTelemetry Testing: Pytest, integration and load testing frameworks Key Skills & Expertise Proven experience designing and delivering production systems using Python on AWS. Strong understanding of distributed systems … API design, and event-driven architectures. Deep knowledge of system observability, logging, and performance optimisation. Familiarity with modern security and data-privacy best practices. Excellent communicator who can document and articulate technical trade-offs clearly. Behaviours & Attributes Ownership: Takes full responsibility for systems from design to operation. Pragmatism: Balances long-term architecture with delivery velocity. Influence: Raises standards and mentors …
Employment Type: Part-time

Senior Python Software Engineer

City of London, London, United Kingdom
Creo Recruitment
hosted on AWS. Architect and optimise systems: Define service boundaries, data ownership, and failure-recovery patterns for scalable, high-availability systems. Raise engineering quality: Champion best practices for testing, observability, and security. Review critical PRs and guide technical decisions across the team. Operate and improve production systems: Monitor performance, reliability, and cost efficiency. Lead incident response and drive continuous improvement. … Django) Cloud: AWS (Lambda, ECS/Fargate, S3, DynamoDB, CloudWatch, API Gateway) Data & Messaging: PostgreSQL, Redis, Kafka or SQS CI/CD & Infrastructure: Docker, Terraform, GitHub Actions, CloudFormation Monitoring & Observability: Prometheus, Grafana, OpenTelemetry Testing: Pytest, integration and load testing frameworks Key Skills & Expertise Proven experience designing and delivering production systems using Python on AWS. Strong understanding of distributed systems … API design, and event-driven architectures. Deep knowledge of system observability, logging, and performance optimisation. Familiarity with modern security and data-privacy best practices. Excellent communicator who can document and articulate technical trade-offs clearly. Behaviours & Attributes Ownership: Takes full responsibility for systems from design to operation. Pragmatism: Balances long-term architecture with delivery velocity. Influence: Raises standards and mentors …

Observability in the City of London
10th Percentile: £73,125
25th Percentile: £73,750
Median: £85,000
75th Percentile: £105,000