Observability Jobs in the UK

726 to 750 of 868 Observability Jobs in the UK

Observability and Telemetry Specialist

Knutsford, England, United Kingdom
Hybrid/Remote Options
Undisclosed
Job Title: Observability and Telemetry Specialist
Location: Knutsford/Hybrid
Contract: Until end of June 2026
Rate: £550.00 per day
Role Description: Our client is seeking a skilled Observability and Telemetry Specialist to enhance visibility across their IT infrastructure and applications. The ideal candidate will have a strong background in financial services and expertise in monitoring, diagnostics, and performance optimization. …
Key Responsibilities: Design and implement observability solutions across web applications, servers, and network infrastructure. Monitor and support Apache HTTP Server, Linux/UNIX systems, and web servers. Collaborate with IT operations, support, and security teams to ensure system reliability and compliance. Administer infrastructure components, including firewalls, NAC, and network security tools. Develop and maintain telemetry pipelines for real-time insights … Server & Web Application Support; Infrastructure & Server Administration; IT Operations, Support & Security; Network Access Control & Security; System Administration & Software Development; Experience in Financial Services environments.
Preferred Qualifications: Proven experience with observability platforms and telemetry tools; strong understanding of compliance and regulatory requirements in finance; excellent problem-solving and communication skills.
This is an excellent opportunity for a technically skilled professional to …

Observability and Telemetry Specialist

Knutsford, England, United Kingdom
Hybrid/Remote Options
eTeam
Role Title: Observability and Telemetry Specialist
Location: Hybrid - 60% office, 40% home, Knutsford WA16 9EU
Duration: 29/05/2026
Rate: £492.00/day on umbrella
Role Description: We are seeking a skilled Observability and Telemetry Specialist to enhance visibility across our IT infrastructure and applications. The ideal candidate will have a strong background in financial services and deep … expertise in monitoring, diagnostics, and performance optimization.
Key Responsibilities: Design and implement observability solutions across web applications, servers, and network infrastructure. Monitor and support Apache HTTP Server, Linux/UNIX systems, and web servers. Collaborate with IT operations, support, and security teams to ensure system reliability and compliance. Administer infrastructure components including firewalls, NAC, and network security tools. Develop and … Server & Web Application Support; Infrastructure & Server Administration; IT Operations, Support & Security; Network Access Control & Security; System Administration & Software Development; Experience in Financial Services environments.
Preferred Qualifications: Proven experience in observability platforms and telemetry tools; strong understanding of compliance and regulatory requirements in finance; excellent problem-solving and communication skills.

Senior TypeScript Back-End Engineer

City of London, London, United Kingdom
Wave Talent
design, build, and support extensible, low-maintenance back-end services. Partner with Product, Design, Operations, and Growth to prioritise customer-facing and internal problems that drive value. Champion security, observability, and reliability best practices. Mentor teammates and help cultivate a healthy, innovative engineering culture. Tech Stack: Core: TypeScript/JavaScript (server-side) Cloud & Infra: AWS (cloud-based architectures), Docker, CI …/CD; IaC such as Terraform or CloudFormation Quality & Ops: Security and observability tooling/best practices Bonus exposure: React on the front end (nice to have) What we’re looking for We value skill and impact over strict year counts. You should have: Strong TypeScript/JavaScript fundamentals and experience building high-traffic server-side web applications. Solid understanding … of cloud-based application architecture (preferably AWS). Hands-on experience with Docker and CI/CD tooling. Practical grasp of security and observability best practices. Clear, collaborative communication and leadership skills. Why join? Work on meaningful problems in a fun, healthy, productive environment. Competitive package: £80k–£100k + Bonus, up to 10% employer pension, 28 days holiday (plus bank More ❯

Senior TypeScript Back-End Engineer

London Area, United Kingdom
Wave Talent
design, build, and support extensible, low-maintenance back-end services. Partner with Product, Design, Operations, and Growth to prioritise customer-facing and internal problems that drive value. Champion security, observability, and reliability best practices. Mentor teammates and help cultivate a healthy, innovative engineering culture. Tech Stack: Core: TypeScript/JavaScript (server-side) Cloud & Infra: AWS (cloud-based architectures), Docker, CI …/CD; IaC such as Terraform or CloudFormation Quality & Ops: Security and observability tooling/best practices Bonus exposure: React on the front end (nice to have) What we’re looking for We value skill and impact over strict year counts. You should have: Strong TypeScript/JavaScript fundamentals and experience building high-traffic server-side web applications. Solid understanding … of cloud-based application architecture (preferably AWS). Hands-on experience with Docker and CI/CD tooling. Practical grasp of security and observability best practices. Clear, collaborative communication and leadership skills. Why join? Work on meaningful problems in a fun, healthy, productive environment. Competitive package: £80k–£100k + Bonus, up to 10% employer pension, 28 days holiday (plus bank More ❯

AWS Cloud Engineer

City of London, London, United Kingdom
Hybrid/Remote Options
Advanced Resource Managers
manage and support a customer’s AWS and Data platform To be technical hands on Provide Incident and problem management on the AWS IaaS and PaaS Platform Monitoring and observability of system and platform performance Collaboration with development and build teams on application and platform deployments and changes Involvement in the resolution of Incidents and problems in an efficient and … timely manner Actively monitor an AWS platform and components for technical issues Implement and improve on existing monitoring and observability solution To be involved in the resolution of technical incidents tickets Assist in the root cause analysis of incidents Assist with improving efficiency and processes within the team Examining traces and logs Escalate incidents and problems to the appropriate teams More ❯

AWS Cloud Engineer

London Area, United Kingdom
Hybrid/Remote Options
Advanced Resource Managers
manage and support a customer’s AWS and Data platform To be technical hands on Provide Incident and problem management on the AWS IaaS and PaaS Platform Monitoring and observability of system and platform performance Collaboration with development and build teams on application and platform deployments and changes Involvement in the resolution of Incidents and problems in an efficient and … timely manner Actively monitor an AWS platform and components for technical issues Implement and improve on existing monitoring and observability solution To be involved in the resolution of technical incidents tickets Assist in the root cause analysis of incidents Assist with improving efficiency and processes within the team Examining traces and logs Escalate incidents and problems to the appropriate teams More ❯

Cloud Platform Lead

United Kingdom
Hybrid/Remote Options
Tenth Revolution Group
as-Code using AWS CDK and TypeScript Oversee security, scalability, and cost optimisation of cloud environments Collaborate with product and engineering teams to align platform priorities Define and execute observability strategies including monitoring, logging, and alerting Design and maintain CI/CD pipelines and containerised deployments Key Requirements Proven experience designing, deploying, and managing AWS infrastructure for SaaS platforms Hands … TypeScript for infrastructure-as-code Track record of leading a team of 2–5 engineers in a scale-up or SaaS environment Strong understanding of modern DevOps tooling and observability stacks Experience with CI/CD, containerisation, and performance tuning Excellent communication skills and ability to collaborate across teams Security Clearance Requirement Applicants must be eligible for UK Security Clearance. More ❯

Platform Engineer: £120k + Bonus/benefits (AI Trading)

City of London, London, United Kingdom
Hunter Bond
storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack—from hardware and operating systems to automation and observability—while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities Manage a distributed compute environment and several petabyte-scale storage systems Install, configure, and … software development practices (version control, agile methodologies) Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible) Exposure to distributed storage systems and related protocols Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana) Strong written and verbal communication skills Demonstrated ability to learn quickly and adapt to evolving technologies Ability to work effectively in More ❯

Platform Engineer: £120k + Bonus/benefits (AI Trading)

London Area, United Kingdom
Hunter Bond
storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack—from hardware and operating systems to automation and observability—while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities Manage a distributed compute environment and several petabyte-scale storage systems Install, configure, and … software development practices (version control, agile methodologies) Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible) Exposure to distributed storage systems and related protocols Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana) Strong written and verbal communication skills Demonstrated ability to learn quickly and adapt to evolving technologies Ability to work effectively in More ❯

Senior Site Reliability Engineer

United Kingdom
Hybrid/Remote Options
TechNET IT Recruitment Ltd
improvements across the platform Participate in an on-call rotation (one week every 4–5 weeks) to ensure 24x7 availability of critical systems Collaborate with internal teams to improve observability, monitoring and alerting across services Identify and implement operational improvements to existing monitoring, logging and incident response processes Use scripting and automation (primarily Bash and Python) to reduce toil and … Practical scripting skills in Bash and/or Python for automation and tooling Familiarity with IaC tools such as Ansible or Puppet Good understanding of monitoring, alerting, logging and observability best practices Excellent communication skills and the ability to own incidents end-to-end, including post-incident reviews More ❯

AWS Cloud Engineer

London, United Kingdom
Hybrid/Remote Options
ARM
manage and support a customer's AWS and Data platform To be technical hands on Provide Incident and problem management on the AWS IaaS and PaaS Platform Monitoring and observability of system and platform performance Collaboration with development and build teams on application and platform deployments and changes Involvement in the resolution of Incidents and problems in an efficient and … timely manner Actively monitor an AWS platform and components for technical issues Implement and improve on existing monitoring and observability solution To be involved in the resolution of technical incidents tickets Assist in the root cause analysis of incidents Assist with improving efficiency and processes within the team Examining traces and logs Escalate incidents and problems to the appropriate teams More ❯
Employment Type: Contract
Rate: £450 - £480/day

Python Developer

City of London, London, United Kingdom
Creo Recruitment
records). Write performant SQL for data transformations, ETL workflows, and analytical use cases. Contribute to discussions on architecture and design, focusing on scalability, cost, reliability, and performance. Improve observability, testing, and overall system robustness. Participate in incident reviews and continuous improvement initiatives within the squad. Tech You’ll Work With Python (primary language) SQL Large-scale data workflows (ETL … handles large data volumes effectively. You contribute to improving data pipelines, performance, and system reliability. You participate actively in design discussions, planning, and squad rituals. You help strengthen testing, observability, and operational excellence. You continually learn and take on more ownership as part of a tight, high-performing squad. More ❯

Python Developer

London Area, United Kingdom
Creo Recruitment
records). Write performant SQL for data transformations, ETL workflows, and analytical use cases. Contribute to discussions on architecture and design, focusing on scalability, cost, reliability, and performance. Improve observability, testing, and overall system robustness. Participate in incident reviews and continuous improvement initiatives within the squad. Tech You’ll Work With Python (primary language) SQL Large-scale data workflows (ETL … handles large data volumes effectively. You contribute to improving data pipelines, performance, and system reliability. You participate actively in design discussions, planning, and squad rituals. You help strengthen testing, observability, and operational excellence. You continually learn and take on more ownership as part of a tight, high-performing squad. More ❯

Azure Cloud DevOps Engineer

London Area, United Kingdom
McCabe & Barton
RBAC, PIM), and ensure secure authentication (SAML/OAuth, MFA). Support CI/CD pipelines via Azure DevOps or GitHub Actions, troubleshoot builds, and manage YAML configurations. Implement observability best practices using Azure Monitor, Log Analytics, Application Insights, and dashboards (KQL and Datadog experience desirable). Ensure compliance and security through Microsoft Defender for Cloud, Azure Policy, Key Vault … in Terraform and Ansible for automation and infrastructure management. Deep technical understanding of networking, identity, and security within the Azure ecosystem. Strong exposure to CI/CD, monitoring, and observability tools. Experience supporting financial services or highly regulated environments is advantageous. How to Apply If your experience aligns with the requirements above, please apply with an updated CV. More ❯

Azure Cloud DevOps Engineer

City of London, London, United Kingdom
McCabe & Barton
RBAC, PIM), and ensure secure authentication (SAML/OAuth, MFA). Support CI/CD pipelines via Azure DevOps or GitHub Actions, troubleshoot builds, and manage YAML configurations. Implement observability best practices using Azure Monitor, Log Analytics, Application Insights, and dashboards (KQL and Datadog experience desirable). Ensure compliance and security through Microsoft Defender for Cloud, Azure Policy, Key Vault … in Terraform and Ansible for automation and infrastructure management. Deep technical understanding of networking, identity, and security within the Azure ecosystem. Strong exposure to CI/CD, monitoring, and observability tools. Experience supporting financial services or highly regulated environments is advantageous. How to Apply If your experience aligns with the requirements above, please apply with an updated CV. More ❯

Solution Architect

United Kingdom
Genese Solution Limited
lake/mesh (gold/silver/bronze). You'll unify discovery, composition, and contribution; enable NLQ + chat driven analytics; and enforce enterprise grade governance, security, and observability across payments, cards, lending, and partner ecosystems. Roles and Responsibilities Define target & interim architecture : reference diagrams, data contracts, semantic/NLQ models, API/event schemas, and write back patterns. … with schema & quality gates, stewardship workflows, and automatic metadata capture into the dictionary. Data platform : lakehouse (medallion) and curated marts; federation; cost/perf optimization; caching; workload isolation. Governance & observability : lineage, audit trail, prompt/response logging, evaluations and drift monitors for AI generated content; SLIs/SLOs & runbooks. Security & compliance : IAM (RBAC/ABAC), fine grained policies, masking/ More ❯
Employment Type: Permanent
Salary: GBP Annual

GenAI Engineer

London Area, United Kingdom
Clarity
up and harden RAG pipelines (indexing, retrieval policies, grounding, guardrails) and agent frameworks. Take basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost tuning. Participate in on‐call for your area and drive root‐cause analysis with crisp follow‐ups. 15% Collaborate Pair with back‐end & front‐end to wire extractors … evals; hands‐on with time‐series analysis (forecasting, change‐point, drift). Cloud & ops: Basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost control. Communication: You explain results clearly, align stakeholders, and write crisp docs. Bonus points DevOps wizardry; GPU/accelerator experience. Multimodal pipelines (text + voice + screenshots). More ❯

GenAI Engineer

City of London, London, United Kingdom
Clarity
up and harden RAG pipelines (indexing, retrieval policies, grounding, guardrails) and agent frameworks. Take basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost tuning. Participate in on‐call for your area and drive root‐cause analysis with crisp follow‐ups. 15% Collaborate Pair with back‐end & front‐end to wire extractors … evals; hands‐on with time‐series analysis (forecasting, change‐point, drift). Cloud & ops: Basic infra ownership on GCP (or AWS/Azure): networking, autoscaling, CI/CD, IaC, observability, and cost control. Communication: You explain results clearly, align stakeholders, and write crisp docs. Bonus points DevOps wizardry; GPU/accelerator experience. Multimodal pipelines (text + voice + screenshots). More ❯

Senior Database Administrator (DBA)

United Kingdom
OnBuy Limited
optimisation, operation, and cost effective scaling of our GCP data platform. The role will focus on Cloud SQL (MySQL) and BigQuery, with responsibility for performance, reliability, security, automation and observability across transactional and analytical workloads. Key Responsibilities Administer high availability setups in CloudSQL. Performance tuning: indexing, partitioning, query tuning, load balancing and resource sizing. Cost optimisation for Cloud SQL and … Automate as much as possible: IaC (Terraform/Deployment Manager), CI/CD for schema and infra changes, automated remediation and runbooks. Capacity planning - forecasting growth. Maintaining, monitoring and observability of database solutions; own dashboards, SLOs, logging, metrics and actionable alerts. Cost and resource governance: tagging, quota controls, cost monitoring and chargeback recommendations. Mentoring and documentation: maintain runbooks, knowledge base More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer, World Service

Glasgow, United Kingdom
Hybrid/Remote Options
BBC Group and Public Services
cross-disciplinary teams. Collaborate with Product, Design, and Editorial partners to deliver features that meet global audience needs, balancing innovation with reliability. Champion test automation, CI/CD, and observability to ensure robust, maintainable systems. Mentor engineers, contribute to shared learning, and promote inclusive, supportive team culture. Shape and contribute to the technical roadmap for World Service Discovery, influencing standards … and deploying large-scale distributed systems in enterprise or public-facing environments, covering testing, experimentation, and release. Strong background in cloud engineering (AWS or similar), including infrastructure automation and observability tooling. Demonstrable experience implementing secure development practices, managing access controls, and ensuring compliance with privacy and data-protection standards. Track record of influencing technical direction, mentoring other engineers, and collaborating More ❯
Employment Type: Permanent
Salary: GBP Annual

Machine Learning Engineer

Buckinghamshire, South East England, United Kingdom
Hybrid/Remote Options
Rightmove
scientists to take models from development to production-grade systems, ensuring scalability, reproducibility, and robustness. Automating feature engineering and data pipeline processes, ensuring reproducibility and auditability. Implementing monitoring and observability to detect drift, bias, and performance degradation, and setting up rollback/recovery processes. Using MLOps tools (e.g., Vertex Pipelines, Kubeflow, Weights & Biases) for experiment tracking, model registry, and automated … distributed systems). 3+ years of experience as an ML Engineer, MLOps Engineer, Data Engineer, or similar, in a larger-scale, production-focused environment. Hands-on with model monitoring, observability, and retraining pipelines. Exposure to feature stores, registries, and experimentation frameworks. Familiarity with business-driven metrics and experience balancing ML performance with commercial goals. Experience with generative AI and LLM More ❯

AppSec Lead

Central London, London, United Kingdom
Hybrid/Remote Options
Halian Technology Limited
A leading fintech company is seeking a Lead AppSec Engineer to join their established team. You'll be instrumental in embedding security into every stage of the software development lifecycle: guiding engineers, shaping best practices, and driving secure, scalable solutions across our …
Employment Type: Permanent, Work From Home

Senior DevOps Engineer

Farnborough, Hampshire, South East, United Kingdom
Hybrid/Remote Options
Spectrum It Recruitment Limited
Senior DevOps Engineer - AWS/Azure Government Transformation Projects (AWS/Azure/DevOps). Location: Winchester, Hampshire, Hybrid. Our client is a cloud-first digital consultancy, founded over 10 years ago and trusted by government, policing, and public sector organisations …
Employment Type: Permanent
Salary: £75,000

Senior Database Administrator

Bournemouth, Dorset, South West, United Kingdom
Hays
Your new company: Join a fast-growing tech start-up that's recently expanded into multiple new markets and earned recognition as one of the UK's most exciting technology businesses. With a proactive, fail-fast culture, this is a …
Employment Type: Permanent
Salary: £75,000

Senior Machine Learning Engineer

Warwick, England, United Kingdom
DeepRec.ai
A fast-growing technology business is developing advanced software for accounting, payroll, tax, and practice management. With a strong engineering foundation and a clear commercial vision, the company is now expanding its focus on artificial intelligence to transform how professional …
Observability
10th Percentile: £56,593
25th Percentile: £67,500
Median: £80,000
75th Percentile: £105,000
90th Percentile: £140,250