901 to 925 of 1,270 Observability Jobs

Platform Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
England, United Kingdom
provision of tooling for our support organisation Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement. Maintain and enhance Radiant’s observability stack: Prometheus, Grafana, and custom monitoring integrations Operate and support services in 24x7 production environments, including on-call rotation Contribute to Incident postmortem analyses, root cause … DHCP, VLANs, routing, switching Strong experience with API interrogation Strong experience with infrastructure scripting and automation (Bash, Python, Ansible) Deep understanding of observability principles and tools (Prometheus, Grafana preferred) Strong grasp of ITSM and service operation best practices Excellent communication and mentorship skills Comfortable interfacing with internal stakeholders and external ...

Principal Platform Engineer (Edge)

Hiring Organisation
Jobleads-UK
Location
Bristol, England, United Kingdom
peer coordination, and systems that can operate independently during disconnection. You are the sort of engineer who thinks carefully about failure modes, deployment risk, observability, workload lifecycle, service discovery, and operational simplicity. You care about building robust abstractions that allow application teams to securely deploy workloads without needing to understand … limited connectivity. Working on security baselines for edge nodes, including secure boot, hardware‐rooted identity, attestation, and the runtime isolation of workloads. Building observability, logging, and telemetry capabilities that work when bandwidth is scarce and devices are intermittently reachable. Designing zero‐touch onboarding and provisioning flows so devices come online ...

Senior Onboarding Engineering | 6 month Contract

Hiring Organisation
Novatus
Location
London Area, United Kingdom
Novatus is a Series B scale-up RegTech SaaS provider and boutique advisory practice, enabling financial services firms to solve complex challenges and redefine what’s possible through expert-led technology and consulting. Across both ...

Senior SRE Lead

Hiring Organisation
Albany Beck
Location
London Area, United Kingdom
about capability build, technical excellence, and delivering meaningful change within complex enterprise environments. Role Overview Albany Beck is seeking a Senior SRE Lead/Observability SME to lead the establishment of a new enterprise Site Reliability Engineering (SRE) capability, with a primary focus on designing and implementing a modern observability … suite and operational resilience framework. This is a foundational build role, responsible for defining how reliability engineering and observability are structured, measured, and embedded across a complex global technology estate. The successful candidate will play a key role in shifting the organisation from reactive operational support to a metrics-driven ...

Senior SRE – Electronic Trading Observability Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
global financial services firm in Greater London is seeking a Senior Software Engineer/SRE to focus on ensuring observability and resilience for its Electronic Trading systems. You will drive reliability initiatives, develop frameworks for tracking metrics, and collaborate on system health reports. The ideal candidate has a strong background ...

Platform Engineering Lead — AWS, Kubernetes, Observability

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
opportunities for professional growth, including leadership training. Ideal candidates should have key technical skills in AWS, Terraform, and Kubernetes, along with a passion for observability and Linux. An exciting chance for those looking to make an impact within a collaborative team environment. ...

Field CTO EMEA

Hiring Organisation
Jobleads-UK
Location
Maidenhead, England, United Kingdom
Engineering, platform teams, and business stakeholders. Translate customer business goals into compelling transformation strategies powered by Dynatrace. Lead high-impact technical discovery and executive conversations around observability, cloud modernization, AI adoption, security, automation, and business outcomes. Shape account strategy with Sales and Solution Engineering teams for complex, multi-stakeholder deals. Develop board-level … executive-level narratives that connect platform capabilities to risk reduction, operational excellence, digital experience, and growth. Guide customers on modern observability and security operating models, including platform engineering, SRE, DevSecOps, and AI-assisted operations. Support large opportunities by validating architecture direction, differentiation, value realization, and long-term platform vision. Influence go-to-market ...

Senior AI Solutions Architect, Pre-Sales & Integration

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
experimenting with cutting‐edge technologies. Preferred Qualifications Advanced Integration – Experience integrating Salesforce with external agents via APIs and open standards (MCP, A2A). Governance & Observability – Familiarity with prompt governance, observability, monitoring frameworks, responsible AI and compliance best practices. Cross‐Platform Background – Background in cross‐platform integrations (e.g., Hyperscaler SDKs ...

Principal AI Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Agentic ecosystem, responsible for the high‐level design choices that define how agents run at PhysicsX. You will cover topics such as: Agent Observability: Own the implementation to enforce deep tracing, granular cost tracking, and observability across the lifecycle. Agent Deployment: Deliver an intuitive deployment lifecycle which simplifies questions around … behalf of users in a regulated enterprise environment. The Tech Stack Core Platform: Python (Primary), Go or TypeScript (Secondary), Kubernetes, Docker, Terraform. Observability & Evals: OTel, LangSmith, Arize, Braintrust. Who You Are An Architect at Heart: You have strong, reasoned opinions on Durable Execution vs. Standard Async, Vector Search vs. Keyword ...

Head of Engineering

Hiring Organisation
Xapien
Location
London Area, United Kingdom
leads who own architectural decisions within a domain-driven design structure. ● Establish engineering-wide standards for code quality, review processes, and technical governance. ● Build observability, incident management, and on-call practices that scale with team growth and deployment frequency. ● Embed DevOps, MLOps, security, and compliance practices into … Series A/B). ● Technical Credibility: Strong background in cloud-native architectures, distributed systems, and modern delivery practices (CI/CD, automated testing, observability). Experience with cloud cost management and infrastructure optimisation. ● Operational Maturity: Experience building observability, on-call rotations, and incident management practices as engineering organisations scale ...

Azure Site Reliability Engineer

Hiring Organisation
WWT EMEA UK LIMITED
Location
Glasgow, Lanarkshire, Scotland, United Kingdom
Employment Type
Contract
Contract Rate
From £650 to £700 per day
Technology (WWT) is seeking experienced Azure Site Reliability Engineers to join a client-embedded Platform Health workstream. You will help deliver critical reliability and observability capabilities that underpin two major Azure milestones: Gold Dev and Azure General Availability. The ideal candidate will work shoulder-to-shoulder with product, engineering … Glasgow, United Kingdom (Onsite) Job Description: Engineer will support Platform Health workstream, focusing on Azure GA readiness and Gold Dev milestones, with emphasis on observability, automation, and secure cloud architecture. Key Responsibilities: Design and implement SLOs/SLIs across user, application, and infrastructure layers Build Azure platform health solutions using ...

Azure SRE Engineer

Hiring Organisation
Oscar Associates (UK) Limited
Location
Glasgow, Lanarkshire, Scotland, United Kingdom
Employment Type
Part Time
Salary
£575 - £625 per day
Contract We're looking for two experienced Azure Site Reliability Engineers to join a major Financial Services programme focused on platform health, reliability, and observability across a large-scale Azure environment. You'll be responsible for building and maintaining Azure platform health infrastructure using Terraform, developing Python-based automation … integrations, and implementing SLOs/SLIs across infrastructure and application layers. The role also involves working with observability tooling, event-driven integrations, and Azure-native services in a highly collaborative environment with engineering and product stakeholders. Required experience: * Strong hands-on Azure engineering experience * Terraform in production environments (primary ...

Azure SRE Engineer

Hiring Organisation
Oscar Technology
Location
Glasgow, Lanarkshire, Scotland, United Kingdom
Employment Type
Contractor
Contract Rate
£575 - £625 per day
Contract We're looking for two experienced Azure Site Reliability Engineers to join a major Financial Services programme focused on platform health, reliability, and observability across a large-scale Azure environment. You'll be responsible for building and maintaining Azure platform health infrastructure using Terraform, developing Python-based automation … integrations, and implementing SLOs/SLIs across infrastructure and application layers. The role also involves working with observability tooling, event-driven integrations, and Azure-native services in a highly collaborative environment with engineering and product stakeholders. Required experience: * Strong hands-on Azure engineering experience * Terraform in production environments (primary ...

Agentic AI Data Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ModelOps - Azure AI Foundry (model hosting, versioning, monitoring); Evaluation frameworks (LLM-as-judge, test datasets); Prompt/version control, cost/latency monitoring DevOps & Observability - CI/CD pipelines (Azure DevOps/GitHub Actions); Logging, monitoring, observability (App Insights, etc.); Performance tuning and scalability As part of a leading global ...

Senior Platform Engineer

Hiring Organisation
REALM
Location
United Kingdom
building and owning the production infrastructure for a multi-user distributed system from the ground up. That means designing for debuggability and observability from day one, not bolting it on later. Core remit includes scalable multi-environment Terraform, secrets management, gradual deployment practices (blue/green), and the ability … testing. An AI/multi-agent infrastructure component is on the near-term roadmap. The stack IaC Terraform + Terragrunt Helm/Kubernetes AWS Observability Prometheus/Grafana Auth0 Rust/Golang NoSQL What they're looking for Production experience with Terraform, Helm/Kubernetes, AWS networking, and debugging multi ...

Data Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You’ll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands‐on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

DevOps Engineer

Hiring Organisation
Prism Digital
Location
Cambridge, England, United Kingdom
across two regions, with the plan to bring the second along the same path over time. They also need someone to introduce proper observability and monitoring - knowing when things aren't running, alerting the right people, and building the kind of visibility that lets the team respond rather than react. … session host infrastructure (being deprecated over the next 12-18 months) CI/CD tooling and working with the development team Observability and monitoring tooling - currently limited; you'd shape this Disaster recovery architecture MFC C++, .NET services, Angular front-end (context for the broader dev estate) Nice to Haves ...

Senior Software Engineer / SRE - Electronic Trading

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Senior Software Engineer/SRE - Electronic Trading Location London Business Area Engineering and CTO Ref # 10050148 Description & Requirements About Observability Engineering Senior Software Engineers - SRE in Electronic Trading (ET) ensure our global enterprise products spanning fixed income, equities, and derivatives are resilient and observable. This role focuses on building … culture and platforms of observability and resilience to prevent market disruptions for global traders. We specialize in proactive anomaly detection, providing advanced performance insights and best practice guidance. Our team collaborates with application developers to define meaningful SLOs, implement chaos engineering, and build diagnostic tools that mitigate architectural risks ...

SRE Technical Lead

Hiring Organisation
Adecco
Location
Reading, Berkshire, United Kingdom
Employment Type
Permanent
Salary
GBP 70,000 - 90,000 Annual
remediation Act as the technical escalation point for major incidents and high-risk releases Lead blameless post-incident reviews and ensure continuous improvement Establish observability and capacity management practices using modern tooling Identify and eliminate systemic reliability risks and operational inefficiencies Collaborate with engineering, platform, security, and operations teams across … Experience working in multi-cloud or hybrid cloud environments Strong understanding of SRE principles (SLOs, SLAs, error budgets, reliability engineering) Hands-on experience with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry, Loki, Tempo) Strong knowledge of Infrastructure as Code and GitOps (e.g., Helm, Kustomize, ArgoCD, Tekton) Experience with CI/ ...

Technical Lead

Hiring Organisation
Findrs
Location
Aylesbury, England, United Kingdom
technical contribution. The Role As Software Tech Lead, the successful candidate will define and drive the overall software architecture across backend services, APIs, observability systems, data infrastructure, and cloud integrations. Working closely with product and engineering leadership, they will translate complex deployment requirements into scalable technical solutions while ensuring … part in setting technical standards across the business. From API consistency and schema evolution through to CI/CD practices, security baselines, and observability frameworks, the successful candidate will help establish the engineering foundations that support long term scale. On the backend, the role will involve guiding and contributing ...

Snowflake Data Cloud Architect

Hiring Organisation
Talent Software Services
Location
New York, United States
Employment Type
Permanent
Salary
USD 180,000 Annual
upstream and downstream system interoperability. Data Governance and Compliance: Implement RBAC, data masking, and encryption aligned with enterprise data policy. Ensure lineage and observability for regulatory reporting and audit. Technical Leadership: Act as a trusted advisor for architectural decisions and future-state roadmaps. Prepare technical specifications and design documentation. Innovation ...

AWS DevOps Engineer

Hiring Organisation
Sanderson Recruitment Plc
Location
Bristol, Somerset, United Kingdom
Employment Type
Permanent
Salary
GBP 70,000 - 75,000 Annual
enable efficient, reliable software delivery. Collaborate closely with developers to ensure performance, reliability, and security across the platform. Implement monitoring, alerting, and observability solutions to ensure system health and performance. Support continuous improvement in DevOps practices, automation, and tooling. Must Have Experience: Strong experience working within … GitLab, and familiarity with GitFlow. Knowledge of security best practice, including credential and secret management. Experience with monitoring, alerting, and observability tooling. Strong scripting skills, particularly Bash. Experience working in Agile software development environments. A problem-solving mindset with a collaborative approach to engineering challenges ...

Senior AI Product Engineer

Hiring Organisation
Jobleads-UK
Location
York and North Yorkshire, England, United Kingdom
awareness Use tools such as DSPy (or similar) for optimisation and evaluation Deploy and operate services using Azure (OpenAI, Web Apps/Functions) Implement observability (Application Insights) and CI/CD (Azure DevOps) Contribute to infrastructure via Terraform Build high‐quality, async Python services with strong testing (pytest) Collaborate with … similar) Strong API and data modelling skills Experience with async Python Experience with Azure environments and CI/CD pipelines Familiarity with Terraform and observability tooling Minimum Qualifications Degree or equivalent Right to work in the country of employment Integrity and Ethics All StarCompliance employees are expected to commit ...

Principal Full Stack Engineer & Architecture Lead

Hiring Organisation
BCT Resourcing
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £90,000 per annum
technical design decisions * Define scalable, secure, and maintainable engineering standards * Provide technical leadership across frontend, backend, APIs, infrastructure, and integrations * Drive platform scalability, resilience, observability, and performance * Partner with leadership teams to align technical strategy with business goals * Act as the senior technical authority for complex engineering decisions Hands-On Engineering … Lambda, API Gateway, EventBridge, SQS, Step Functions, S3, CloudWatch, RDS) Backend Node.js, TypeScript Frontend React, Next.js, Tailwind CSS Data & Architecture PostgreSQL, Serverless, Event-Driven Microservices DevOps & Observability Terraform/AWS CDK, CI/CD, Monitoring & Logging About You We are looking for a technically strong and commercially minded engineering leader with: * 10+ years of software ...

AI Engineering Product Manager

Hiring Organisation
Jobleads-UK
Location
Waterside, Scotland, United Kingdom
grade AI agents integrated with complex airline systems. Establish best practices for OpenAI, Anthropic, Azure OpenAI, LangGraph, AutoGen and other frameworks. Implement engineering discipline: observability, safety, automated evaluation, behavioural testing and continuous improvement. Matrix and Partner Leadership Operate effectively across Group, OpCos, cloud, data and security teams. Coordinate delivery streams … direct authority. Demonstrated integration of LLM‐based agents with enterprise systems, APIs, RPA, orchestration platforms and internal tools. Grounding in DevSecOps, cloud‐native architecture, observability and CI/CD. Strong communication skills; able to translate complex technical concepts to senior executives. Experience with high‐stakes, fast‐paced environments and ambiguous ...