226 to 250 of 261 Remote/Hybrid Observability Jobs

Splunk and OpenShift Observability Engineer

Hiring Organisation
CBSbutler Holdings Limited trading as CBSbutler
Location
London, United Kingdom
Employment Type
Contract
Contract Rate
£400 - £490/day
Role Title: Splunk and OpenShift Observability Engineer Location: Sheffield/Birmingham/London/Hybrid - 2/3 days per week onsite Duration: 8 months Rate: £494 per day inside IR35 We're looking for a Splunk & OpenShift Observability Engineer to design, deploy, and optimise enterprise-grade monitoring across hybrid … Kubernetes and OpenShift environments. This is a high-impact role where you'll shape observability strategy, enhance service intelligence, and ensure platform reliability at scale - balancing performance, cost efficiency, and security governance. You'll work at the intersection of platform engineering, observability, and service intelligence, helping to transform raw telemetry ...

Senior Site Reliability Engineer (Public Cloud)

Hiring Organisation
Head Resourcing
Location
Edinburgh, Midlothian, Scotland, United Kingdom
Employment Type
Full-Time
Salary
£70,000 - £85,000 per annum
experienced Senior Site Reliability Engineer to join the team. This is real SRE work: reducing toil, building automation, improving system reliability and observability, and supporting large-scale cloud environments across Azure and GCP. The Role You'll be part of a unified SRE team supporting multiple cloud teams, working … Reliability, performance and observability across Azure/GCP Automation to reduce repeat incidents, tickets, and manual processes Improving SLOs, SLIs, error budgets and platform health Building and maintaining Terraform modules, GitHub pipelines and IaC Supporting app teams as they migrate large workloads to cloud 1-in-4 on-call (enhanced ...

GCP Data Engineer - London

Hiring Organisation
Reed
Location
City of London, London, England, United Kingdom
Employment Type
Contractor
Contract Rate
Salary negotiable
pipeline frameworks. Embed data quality checks by default, including schema validation, completeness, freshness, thresholds, and automated alerting. Enhance end-to-end pipeline resilience, monitoring, observability, failure handling, and recovery mechanisms. Integrate AI/ML features to boost reliability, anomaly detection, and operational efficiency. Collaborate closely with Data Product teams, Analytics …/App Engine. Familiarity with CI/CD & DevOps practices, including automated testing and infrastructure as code. Experience in implementing data quality frameworks, observability tooling, and production monitoring patterns. Proven ability to build reusable data pipeline templates for large-scale, multi-domain platforms. Experience in enterprise data transformation programmes with ...

Azure DevOps Platform Engineer Remote Outside IR35

Hiring Organisation
Interact Consulting Limited
Location
Manchester, North West, United Kingdom
Employment Type
Contract, Work From Home
Building and managing cloud infrastructure using Terraform (IaC) Supporting platform development across Azure environments Driving best practices in DevOps, security, and reliability Contributing to observability and SRE principles Collaborating closely with engineers to deliver resilient, scalable solutions What we're looking for Strong Azure experience (Azure certification required) HashiCorp Terraform … Code Kubernetes certification (required) Solid understanding of DevOps practices and platform engineering Strong awareness of security best practices in cloud environments Familiarity with observability and SRE concepts Why join? £500 per day, outside IR35 Fully remote (UK-based) Work with a highly respected, mission-driven health tech company Be part ...

Senior Front-end Developer (AI-First SaaS Platform)

Hiring Organisation
Keepnet
Location
United Kingdom
live in production, Keepnet runs autonomous systems that plan, build, and operate security awareness and human risk workflows, supported by strong guardrails, auditability, and observability. We’re looking for a Senior Front-end Developer who wants to build production-grade web experiences that power these systems … Create resilient UX for async workflows (jobs, queues, long-running tasks): polling, retries, idempotent actions, progress states, and error recovery. Improve and maintain frontend observability (client-side logging, metrics, tracing where applicable; tools like Sentry) to prevent incidents rather than react to them. Write and maintain automated tests across levels ...

Data Architect (DV)

Hiring Organisation
Anson Mccade
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
clients translate strategic business needs into scalable, resilient, and secure solutions. You will work across cloud and multi-platform architectures, ensuring data governance, security, observability, and cost efficiency are embedded into every design. The Data Architect role is based in a hybrid model, with a minimum of two days … Architect As a Data Architect, you will: Define end-to-end data architecture for complex programmes, including ingestion, orchestration, governance, security, cost-optimisation, and observability Architect and implement multi-cloud, data lake, and data warehouse platforms Design scalable data pipelines, integration workflows, and analytics solutions Apply ML/AI frameworks ...

Data Architect (DV)

Hiring Organisation
Anson Mccade
Location
Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
clients translate strategic business needs into scalable, resilient, and secure solutions. You will work across cloud and multi-platform architectures, ensuring data governance, security, observability, and cost efficiency are embedded into every design. The Data Architect role is based in a hybrid model, with a minimum of two days … Architect As a Data Architect, you will: Define end-to-end data architecture for complex programmes, including ingestion, orchestration, governance, security, cost-optimisation, and observability Architect and implement multi-cloud, data lake, and data warehouse platforms Design scalable data pipelines, integration workflows, and analytics solutions Apply ML/AI frameworks ...

Data Architect (DV)

Hiring Organisation
Anson Mccade
Location
Bristol, Avon, South West, United Kingdom
Employment Type
Permanent, Work From Home
clients translate strategic business needs into scalable, resilient, and secure solutions. You will work across cloud and multi-platform architectures, ensuring data governance, security, observability, and cost efficiency are embedded into every design. The Data Architect role is based in a hybrid model, with a minimum of two days … Architect As a Data Architect, you will: Define end-to-end data architecture for complex programmes, including ingestion, orchestration, governance, security, cost-optimisation, and observability Architect and implement multi-cloud, data lake, and data warehouse platforms Design scalable data pipelines, integration workflows, and analytics solutions Apply ML/AI frameworks ...

Support Engineer

Hiring Organisation
Ordnance Survey
Location
Southampton, Hampshire, England, United Kingdom
Employment Type
Full-Time
Salary
£43,918 - £51,238 per annum
improvements to service performance, including automating deployments, right-sizing systems, and extending monitoring and alerting capabilities Safeguarding critical services by continually assessing and improving observability, resilience and security Investigating and resolving root cause issues, identifying why failures occur, and working with subject matter experts when necessary to fully resolve problems … technologies and best practice - ideally in Azure Infrastructure-as-Code - ideally using Bicep A track record of continually identifying and implementing service improvements or observability Experience of coaching and mentoring other team members and providing consultancy to other teams Additionally, you will provide expert technical consultancy to enable the business ...

Agentic RAG Engineer/Architect - London (Contract)

Hiring Organisation
FUTURUS FINANCIAL RECRUITMENT LTD
Location
London Area, United Kingdom
only retrieve data the requesting user is authorised to see. Implement PII detection, data classification, and audit trails for every retrieval operation. 10. Observability & Performance — Instrument the RAG layer with comprehensive tracing: query decomposition traces, retrieval latency per source, relevance scores, token usage, cache hit rates. Optimise for sub-second … Document processing - Unstructured.io, LlamaParse, Apache Tika ● LLM providers - OpenAI (GPT-4+), Anthropic (Claude), Azure OpenAI ● Languages - Python, Rust ● Evaluation - RAGAS, custom evaluation harnesses, LangSmith ● Observability - OpenTelemetry, LangSmith/LangFuse, Grafana ● Infrastructure - Kubernetes (AKS) Qualifications ● Bachelor's or Master's degree in Computer Science, Information Retrieval, Computational Linguistics, Data Science ...

Cloud Architect

Hiring Organisation
Ultima
Location
United Kingdom
Job Description: Cloud Architect – Azure, DevOps, Terraform (with Technical Account Management Focus) Position: Cloud Architect Location: Remote (UK-based) Type: Full-time We are seeking a skilled and client-focused Cloud Architect with deep expertise ...

Senior Fullstack Engineer (Backend)

Hiring Organisation
Ronald James
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£60,000 - £95,000 per annum
Senior Fullstack Engineer (Backend) Hybrid/Remote - has to be based in London Role Overview We’re looking for a Senior Full Stack (Backend) Engineer who thrives on building clean, scalable systems quickly. You’ll ...

Lead Platform Engineer

Hiring Organisation
REVYBE IT RECRUITMENT LIMITED
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
Lead Platform Engineer - FinTech £110,000 + Bonus (£15k+) Central London - Hybrid working, 2/3 days per week in the office We're working with a highly successful FinTech business in Central London who are looking to hire a Lead Platform Engineer to help shape the future of their infrastructure and platform strategy. This is a high-impact role within a growing engineering team where you'll have the opportunity to influence architectural decisions, mentor engineers, and remain deeply hands-on with modern infrastructure tooling. The company builds all its software in-house and has been investing heavily in its platform, observability, and cloud capabilities as they continue to scale. The Opportunity: You'll join as the Lead Platform Engineer, working closely with engineering leadership to drive improvements across infrastructure, reliability, and developer experience. This role sits at the intersection of hands-on engineering, mentoring, and strategy. You'll guide platform direction while continuing to build and improve the infrastructure that powers the business. You'll also mentor one platform engineer, helping them grow while ensuring the team continues delivering high-quality infrastructure and automation. Environment: The platform currently operates in a hybrid environment: ~60% on-premise infrastructure ~40% Microsoft Azure The long-term strategy is focused on modernising the platform, improving observability, and evolving cloud capabilities, making this an excellent opportunity for someone who enjoys building and shaping systems.
Tech Stack: You'll be working across a modern DevOps and platform stack including: Kubernetes Terraform Hybrid cloud infrastructure (on-premise + Azure …/CD & Automation GitHub Actions Python Azure Services Azure Kubernetes Service (AKS) Azure Virtual Machines Azure Virtual Networks Azure Load Balancer Azure Application Gateway Azure Storage Accounts Azure Blob Storage Azure Key Vault Azure Monitor Azure Log Analytics Azure Active Directory Azure Container Registry Azure DNS Azure DevOps integrations Observability Logging, monitoring, and tracing across distributed systems Building meaningful telemetry and platform visibility What you'll be doing: Leading the evolution of the company's platform and infrastructure strategy Designing and improving hybrid Azure + on-premise environments Driving Kubernetes platform improvements Building automation with Terraform and Python Improving observability and monitoring across systems Mentoring a Platform Engineer and helping shape platform best practices Working closely with engineering teams to improve developer experience and reliability Why this role is exciting: Huge impact on the future platform architecture Opportunity to shape the hybrid cloud strategy Combination of technical leadership and hands-on engineering Modern DevOps tooling and cloud technologies Direct influence on platform reliability and scalability Package: Salary: Up to £110,000 Bonus: 15k+ ...

Site Reliability Engineer

Hiring Organisation
Halian | Managed Services, Recruitment Agency & Contract Staffing
Location
United Kingdom
improvements Own and refine SLIs, SLOs, and error budgets Reduce operational toil through automation Deep-dive Linux debugging, performance tuning, and systems analysis Strengthen observability, monitoring, and alerting Provide technical leadership to a small SRE/engineering group Improve and manage on‐call processes (PagerDuty, OpsGenie, etc.) Collaborate with development … experience Hands‐on incident management and postmortems Experience mentoring or leading a small technical team Scripting/automation with Python, Go, or Bash Strong observability skills (Datadog, Prometheus, Grafana, CloudWatch) Why This Role Appeals to Real SREs You’ll be solving actual SRE problems: reliability, incidents, resilience, uptime ...

Site Reliability Engineering (SRE) Manager

Hiring Organisation
Halian Technology Limited
Location
United Kingdom
Employment Type
Permanent, Work From Home
improvements Own and refine SLIs, SLOs, and error budgets Reduce operational toil through automation Deep-dive Linux debugging, performance tuning, and systems analysis Strengthen observability, monitoring, and alerting Provide technical leadership to a small SRE/engineering group Improve and manage on-call processes (PagerDuty, OpsGenie, etc.) Collaborate with development … experience Hands-on incident management and postmortems Experience mentoring or leading a small technical team Scripting/automation with Python, Go, or Bash Strong observability skills (Datadog, Prometheus, Grafana, CloudWatch) Why This Role Appeals to Real SREs You'll be solving actual SRE problems: reliability, incidents, resilience, uptime You'll guide ...

Lead DevOps Engineer (Azure)

Hiring Organisation
Reed Technology
Location
East Anglia, United Kingdom
Employment Type
Permanent
Salary
£75,000
pipeline templates, PR/branch policies, approvals and gated releases * Creating 'golden path' delivery patterns so teams can deploy without bespoke pipelines Operational readiness & observability * Defining monitoring, logging, alerting and dashboards * Improving incident response, runbooks and recovery processes * Shaping DR and operational processes (no on-call at present) Ways …/CD engineering experience * Experience implementing governance, security guardrails and delivery controls * Comfortable operating without an existing DevOps team Desirable * Azure Policy at scale * Observability, SRE or platform engineering practices * Container/AKS experience * Cost governance and showback/chargeback experience Why this role? * Opportunity to own and shape DevOps ...

Cloud Security and Platform Engineer

Hiring Organisation
RealityMine
Location
Trafford Park, England, United Kingdom
mainly focused on AWS, with growing involvement in other cloud and SaaS platforms. You’ll improve existing environments—managing identity and access, governance, security, observability, and lifecycle—by reducing risks, eliminating unsafe configurations, validating ownership, and ensuring the cloud estate is clearly governed and auditable. You will take an active … role in improving RealityMine’s security posture by improving and operating security scanning, improving monitoring and observability, and ensuring risks, vulnerabilities, and end of life components are identified and addressed in a timely and pragmatic way. You will also develop automation used to support security and operational hygiene, reducing manual ...

AI Architect

Hiring Organisation
Stackstudio Digital Ltd
Location
United Kingdom
Employment Type
Permanent
into high value solutions Enforce IAM least privilege with IAM Conditions, organisation policies, and scoped service accounts; integrate BeyondCorp for zero trust access Operationalise observability using Cloud Logging, Cloud Monitoring, Error Reporting, Trace, and Profiler; build model/LLM telemetry dashboards and alerts Identify the right AI/ML frameworks … patterns, vector databases, embeddings, and prompt/guardrail engineering Desirable Skills/Knowledge/Experience Knowledge of MLOps/AgentOps, CI/CD, and observability Strong understanding of regulated financial services environments Proven experience implementing AI risk controls, model governance, and auditability Ensure alignment with FCA, PRA, data privacy, model ...

Gen AI Engineer

Hiring Organisation
Wave Group
Location
England, United Kingdom
applications in production environments Evidence of debugging real issues such as incorrect outputs, latency spikes, retrieval failures or agent misbehaviour Experience with monitoring and observability of LLM systems, for example Langfuse, Prometheus, Grafana, OpenTelemetry or similar Strong understanding of RAG systems, retrieval pipelines and evaluation workflows Experience with agentic frameworks … application and infrastructure layers Multimodal experience across text and image or video is beneficial Tech stack Python, AWS, LangGraph, LangChain, vector databases, evaluation tooling, observability platforms, Docker Why join Small, senior team with high ownership Systems already in production with real customers Bi-weekly shipping cycles with fast feedback loops ...

Production Engineer- DevOps skills (Lisbon or Porto)

Hiring Organisation
Lùkla
Location
Lisboa, Portugal
Employment Type
Permanent
Salary
EUR Annual
scalable environments. If you are passionate about automation, cloud, and continuous system improvement, this opportunity is for you. Responsibilities: Ensure the stability, performance, and observability of production systems Implement and manage monitoring and observability solutions (e.g., Dynatrace) Automate operational processes through scripts and playbooks Work with orchestration and scheduling tools … infrastructures Collaborate with cross-functional teams in an agile environment Requirements: Technical skills Experience in DevOps/Production Engineering (minimum 2 years) Knowledge of: Observability (e.g., Dynatrace) Terraform OpenShift/Cloud environments Schedulers (CFT, AutoSys) Automation with: Python (scripting) Ansible (ability to create playbooks from scratch) Soft Skills Strong communication ...

Head of Software Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
with Product and Design as part of a leadership trio, shaping vision and outcomes. Establish modern engineering standards (cloud‐first, CI/CD, automation, observability, secure SDLC). Drive operational excellence across performance, resilience, and security. Build and scale a multi‐site engineering organisation, embedding a culture of ownership … architectures, and distributed systems. Strong knowledge of Web, Mobile, FE technologies such as JavaScript, React, Kotlin, .Net, Azure. Experience implementing CI/CD pipelines, observability, and secure engineering practices. Track record of scaling teams and delivering in fast‐paced, evolving environments. Experience working in or with startup/scale ...

Head of Software Engineering

Hiring Organisation
MORGAN PHILIPS UK LIMITED
Location
City of London, London, United Kingdom
Employment Type
Permanent, Work From Home
with Product and Design as part of a leadership trio, shaping vision and outcomes. Establish modern engineering standards (cloud-first, CI/CD, automation, observability, secure SDLC). Drive operational excellence across performance, resilience, and security Build and scale a multi-site engineering organisation, embedding a culture of ownership … architectures, and distributed systems. Strong knowledge of Web, Mobile, FE technologies such as JavaScript, React, Kotlin, .Net, Azure. Experience implementing CI/CD pipelines, observability, and secure engineering practices. Track record of scaling teams and delivering in fast-paced, evolving environments. Experience working in or with startup/scale ...

Head of Software Engineering

Hiring Organisation
Morgan Philips Specialist Recruitment
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
Salary negotiable
with Product and Design as part of a leadership trio, shaping vision and outcomes. Establish modern engineering standards (cloud-first, CI/CD, automation, observability, secure SDLC). Drive operational excellence across performance, resilience, and security Build and scale a multi-site engineering organisation, embedding a culture of ownership … architectures, and distributed systems. Strong knowledge of Web, Mobile, FE technologies such as JavaScript, React, Kotlin, .Net, Azure. Experience implementing CI/CD pipelines, observability, and secure engineering practices. Track record of scaling teams and delivering in fast-paced, evolving environments. Experience working in or with startup/scale ...

Automation Engineer

Hiring Organisation
RealityMine
Location
Trafford Park, England, United Kingdom
test automation frameworks (including our AI-assisted tools), scripting (e.g. Python and JavaScript), CI/CD tooling and our internal observability tools to design and execute automated test suites, manage device infrastructure, and provide fast, reliable feedback to product and engineering teams. Our offices are in Trafford Park, Manchester … managing or using a device farm solution (e.g. AWS Device Farm, Firebase Test Lab, BrowserStack, Sauce Labs, or an internal farm). · Familiarity with observability and monitoring for test and device infrastructure (logs, metrics, dashboards, alerts). · Knowledge of mobile platform internals (Android/iOS), SDK integration testing, or backend ...

Integration Consultant webMethods, Boomi

Hiring Organisation
Smart Sourcer Limited
Location
High Wycombe, Buckinghamshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£65,000
runbooks. Implement CI/CD, automated testing, and modern release strategies (blue/green, canary). Strengthen operations with performance tuning, resiliency patterns, and observability dashboards. Govern APIs (authN/Z, rate limiting, threat protection) and manage lifecycle. What You Bring: 35+ years in integration, including 3+ with webMethods. Strong … TechXchange, User Groups). Migration experience (legacy/MuleSoft webMethods). AI in integration (GenAI co-pilots, agent frameworks, anomaly detection). Dashboarding/observability for APIs/B2B/IS. Ways of Working: Hybrid/remote, with on-site workshops across the UK & Ireland (occasional EU travel). Agile ...