351 to 375 of 557 Observability Jobs in the UK

DV Cleared Cloud Engineer - Contract

Hiring Organisation
Experis
Location
United Kingdom
Employment Type
Contract
Contract Rate
GBP 525 - 550 Daily
support critical, high-availability systems within secure government environments. This is an exciting opportunity to work across cloud and on-prem infrastructure, improving reliability, observability, automation, and delivery pipelines. Key Responsibilities Improve system reliability, performance, and scalability Collaborate with development and support teams Enhance monitoring, observability, and alerting capabilities Automate … relational databases Messaging technologies such as RabbitMQ Desirable Skills Java, Go, or Python development experience Azure experience Service management environment experience Knowledge of observability best practices and availability metrics Experience with secure or cross-domain environments If you receive suspicious outreach claiming to be from us, please contact ...

DV Cleared Cloud Engineer - Contract

Hiring Organisation
Experis
Location
South West, United Kingdom
Employment Type
Contract
Contract Rate
£525 - £550/day
support critical, high-availability systems within secure government environments. This is an exciting opportunity to work across cloud and on-prem infrastructure, improving reliability, observability, automation, and delivery pipelines. Key Responsibilities Improve system reliability, performance, and scalability Collaborate with development and support teams Enhance monitoring, observability, and alerting capabilities Automate … relational databases Messaging technologies such as RabbitMQ Desirable Skills Java, Go, or Python development experience Azure experience Service management environment experience Knowledge of observability best practices and availability metrics Experience with secure or cross-domain environments If you receive suspicious outreach claiming to be from us, please contact ...

Platform Engineer

Hiring Organisation
UA Consulting
Location
City of London, London, United Kingdom
Employment Type
Contract
Contract Rate
From £300 to £400 per day
Platform Engineer with strong site reliability principles to join our Platform team. You'll focus on maintaining and improving production reliability, automating operational tasks, and enhancing our observability stack. You'll work closely with SREs, support engineers, release managers, and incident managers to ensure our systems meet SLIs, SLOs, and SLA targets. Key Responsibilities Maintain and optimise production environments … production workloads (EKS, EC2, RDS/Aurora, S3, IAM). Infrastructure as Code with Terraform and configuration management with Ansible. Strong experience with observability tools (Grafana, Prometheus, Loki, Tempo). Understanding of SRE concepts (SLIs, SLOs, error budgets, capacity planning). Comfortable working in incident and problem management processes. Strong ...

Platform Engineer

Hiring Organisation
UA Consulting
Location
City of London, London, United Kingdom
Employment Type
Permanent
Salary
£75,000
Platform Engineer with strong site reliability principles to join our Platform team. You'll focus on maintaining and improving production reliability, automating operational tasks, and enhancing our observability stack. You'll work closely with SREs, support engineers, release managers, and incident managers to ensure our systems meet SLIs, SLOs, and SLA targets. Key Responsibilities Maintain and optimise production environments … production workloads (EKS, EC2, RDS/Aurora, S3, IAM). Infrastructure as Code with Terraform and configuration management with Ansible. Strong experience with observability tools (Grafana, Prometheus, Loki, Tempo). Understanding of SRE concepts (SLIs, SLOs, error budgets, capacity planning). Comfortable working in incident and problem management processes. Strong ...

AI Platform/DevOps Engineer

Hiring Organisation
The Portfolio Group
Location
City of London, London, Castle Baynard, United Kingdom
Employment Type
Permanent
Salary
£70000 - £80000/annum + Benefits
Bedrock Knowledge Bases) and embedding pipelines Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry - covering latency, throughput, cost, and system health Apply secure-by-design principles including IAM, encryption, network controls … Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage Python proficiency for scripting, automation, and deploying production services Solid ...

Go Full Stack Developer

Hiring Organisation
itecopeople
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
event-driven services Contribute to CI/CD pipelines and cloud-native deployments Review code and champion engineering best practices Improve application performance, observability and reliability Collaborate within Agile delivery teams across multiple projects Support technical decision-making and continuous improvement Skills & Experience We are looking for candidates with strong … reviews, testing and engineering governance Experience with any of the following would be highly advantageous: Microsoft Azure Python GitOps tooling (Argo CD/Flux) Observability tooling (Prometheus, Grafana, OpenTelemetry) AI/LLM-enabled applications Event-driven architectures and messaging platforms What's on Offer Opportunity to work on cutting-edge ...

Site Reliability Engineer

Hiring Organisation
Pertemps London
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 50,000 Annual
Jenkins, GitLab CI) Develop and maintain Terraform modules for infrastructure-as-code Build automation tools (CLI tools, scripts, GitHub Apps, self-service tooling) Own observability: dashboards, alerts, monitoring, and runbooks Continuously improve platform processes and reduce operational toil What We're Looking For Essential Skills & Experience 2-3 years … GitHub Actions, GitLab CI, Jenkins) Ability to write production-quality code in Python or Bash Solid networking fundamentals (DNS, load balancers, CDNs) Experience with observability tools (New Relic, Datadog, Prometheus, Grafana) Comfortable participating in on-call rotations Experience using AI tools (e.g. ChatGPT, Copilot, Cursor) to enhance productivity Desirable Go, Ansible ...

Senior Software Engineer - AI Team

Hiring Organisation
Jobleads-UK
Location
Belfast, Northern Ireland, United Kingdom
C# and Blazor Lead technical decision-making within your squad, balancing innovation with pragmatic delivery Drive best practices in code quality, testing, security, and observability AI Integration & Development Collaborate closely with the AI platform team to design and deliver compelling AI-first features and products Integrate AI capabilities seamlessly into … services RESTful API design and implementation HTML5, CSS3, and responsive design principles Cloud platform experience with Azure, AWS, or GCP Production systems mindset including observability, testing, security, and reliability Agile delivery experience in fast-paced, iterative environments Strong collaboration and communication skills working effectively with technical and product stakeholders Architectural ...

AKS DevOps Engineer - Azure Kubernetes

Hiring Organisation
Reed
Location
London Gatwick Airport, Gatwick, West Sussex, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 per annum, Inc benefits
/CD pipelines using Azure DevOps with YAML. Implement and maintain secure networking patterns and apply cloud security best practices. Create and maintain platform observability using Azure Monitor, Analytics, and Application Insights. Collaborate with engineering teams to ensure service reliability on the platform. Promote best practice in cloud engineering … private endpoints, load balancing, etc. Scripting proficiency in Bash, PowerShell, or Python. Linux operating system knowledge and troubleshooting capability. Experience implementing monitoring, logging, and observability solutions in Azure. Ability to communicate platform issues such as risk, platform health, and cost to non-technical audiences. Desirable Skills: Experience contributing to architecture ...

ML Infrastructure Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
versioning, reproducibility, experimentation, feature management and release management Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency Build or optimise … pipelines, infrastructure-as-code and workflow orchestration Experience with tools such as Airflow or similar platform and orchestration technologies Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving Strong appreciation of reliability ...

Senior Software Engineer II - Data Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ensure technical consistency. Design, develop, and maintain generative AI services and reusable components using Python. Define and promote best practices in engineering, including scalability, observability, testing, and CI/CD. Contribute to system designs spanning multiple services and modules, aligning with architectural best practices. Collaborate with product, platform, and research … work collaboratively across functions in an Agile or Kanban environment. Nice to have: Experience operationalizing LLMs or building an internal AI platform. Familiarity with observability practices (metrics, logging, alerts). Exposure to knowledge graphs or semantic search systems. Join our team and contribute to a culture of innovation, collaboration, and excellence. ...

Platform Engineer

Hiring Organisation
UA Consulting
Location
City, London, United Kingdom
Employment Type
Contract
Contract Rate
GBP 300 - 400 Daily
Platform Engineer with strong site reliability principles to join our Platform team. You'll focus on maintaining and improving production reliability, automating operational tasks, and enhancing our observability stack. You'll work closely with SREs, support engineers, release managers, and incident managers to ensure our systems meet SLIs, SLOs, and SLA targets ...

Splunk Developer

Hiring Organisation
Infoplus Technologies UK Ltd
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Contract
Contract Rate
From £350 to £400 per day
application teams to deliver scalable monitoring, service health, and analytics solutions. Key Responsibilities Technical Leadership Act as Technical Lead for Splunk implementations across monitoring, observability, and service intelligence use cases. Own end-to-end Splunk solution design including data onboarding, data models, dashboards, alerts, and ITSI objects. Review and govern … Splunk Dashboard Studio/Classic dashboards Design meaningful alerts using: correlation searches, risk-based alerting principles. Translate operational and business requirements into actionable insights. Observability & Production Support Integrate Splunk with enterprise observability tools (APM, infrastructure monitoring, cloud platforms). Support production incidents using Splunk, driving root cause analysis and post ...

IT Service Performance & Reliability Manager

Hiring Organisation
Spectrum It Recruitment Limited
Location
New Milton, Hampshire, United Kingdom
Employment Type
Permanent
Salary
GBP 60,000 Annual
across critical IT services. This role focuses on keeping customer-facing services fast, reliable, and fully observable, while driving continuous improvement. You will lead observability across services, ensuring effective monitoring and actionable insights ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Huxley Associates
Location
Bromley, Greater London, UK
Employment Type
Full-time
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engi...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
17918
Location
Bromley, South East London, United Kingdom
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engi...

SRE Engineer: High-Availability, Hybrid – London

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
approach. The candidate will work with Product Engineering teams to manage high availability and uptime, utilizing a code-first approach. Responsibilities include implementing observability standards, managing incidents, and optimizing platform costs. Offering a competitive salary of £85,000 - £90,000, the role is ideal for those passionate about reliability ...

Database Reliability Engineer | Postgres, Kubernetes & Cloud

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading UK fintech company seeks data professionals to modernize its database systems. Roles involve enhancing PostgreSQL and Kubernetes setups, establishing observability through monitoring tools, and ensuring data integrity across multi-cloud frameworks. Working hybrid, the ideal candidates are innovative and collaborative, with strong backgrounds in backend development and infrastructure provisioning. ...

Data Platform Solution Architect

Hiring Organisation
Jobleads-UK
Location
Basildon, England, United Kingdom
Design Documents (ADDs). Deep understanding of cloud-native design patterns. Experience in performance tuning across: Snowflake, Airflow, Iceberg. Focus on platform reliability, scalability, and observability. Experience designing and operating data platforms in production environments ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Jobleads-UK
Location
Bromley, England, United Kingdom
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Site Reliability Engineer - BACLJP

Hiring Organisation
Huxley Associates
Location
Yorkshire, United Kingdom
Employment Type
Contract
Contract Rate
GBP 600 Daily
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Huxley Associates
Location
Bromley, London, South Yorkshire, United Kingdom
Employment Type
Contract
Contract Rate
£600/day
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Head of Network Services

Hiring Organisation
G.R.E. Recruitment Limited
Location
Cirencester, Gloucestershire, South West, United Kingdom
Employment Type
Permanent
Salary
£75,000
WLAN, and SD-WAN Network security and segmentation OT/BMS/industrial networking environments Routing, switching, and IP design Monitoring, logging, and observability High availability and resilience design The company have been established for 17 years and have over 100 employees; they have added 25 new employees ...

Senior Network Architect, GPU Fabric and AI Infrastructure

Hiring Organisation
We Love Alfa
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 180,000 - 240,000 Annual
directly impact customer training workloads. This person will own network architecture across GPU fabric, InfiniBand, RoCE v2, Ethernet leaf spine, edge connectivity, peering, observability, deployment standards and operational handover. We are looking for someone who has: Deep GPU cluster or HPC deployment experience Strong InfiniBand production experience RoCE v2 experience ...

Senior Infrastructure Architect

Hiring Organisation
ALFA TECHNOLOGY RECRUITMENT LTD
Location
City of London, London, United Kingdom
Employment Type
Temporary
directly impact customer training workloads. This person will own network architecture across GPU fabric, InfiniBand, RoCE v2, Ethernet leaf spine, edge connectivity, peering, observability, deployment standards and operational handover. We are looking for someone who has: Deep GPU cluster or HPC deployment experience Strong InfiniBand production experience RoCE v2 experience ...