376 to 400 of 496 Observability Jobs in England

Site Reliability Engineer II

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
design, build and evolve our infrastructure platform. You will develop Terraform modules, build CI/CD pipelines with GitHub Actions and deliver automation and observability improvements that keep our platform reliable, secure and easy for teams to adopt at scale. Responsibilities: - Designing and developing reusable Terraform modules that enable teams … deliver reliable and repeatable infrastructure deployments - Diagnosing and resolving complex infrastructure issues by identifying root causes across distributed cloud environments - Developing automation and observability tooling to improve how infrastructure is operated at scale - Implementing security and governance controls within modules and pipelines so teams inherit secure configurations by default - Collaborating ...

Data Platform Engineer

Hiring Organisation
PRISM DIGITAL LIMITED
Location
Milton Keynes, Buckinghamshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£75,000
availability Own incident resolution, root cause analysis, and continuous improvement Collaborate with engineers and third-party providers to mature the platform Contribute to monitoring, observability, and cost optimisation strategies Support projects and business initiatives through robust platform delivery What They're Looking For: Microsoft Fabric experience Terraform experience Cloud platform engineering … delivery environments What You'll Work With: Microsoft Fabric Terraform (Infrastructure as Code) Azure cloud technologies SQL Server GitHub/CI/CD tooling Monitoring & observability tools Platform design patterns (scalability, resilience, cost control) Nice to Haves: GitHub Actions/CI/CD pipelines Zero Trust architecture Cloud cost monitoring & reporting ...

Senior Data Engineer

Hiring Organisation
True North Group
Location
Newcastle Upon Tyne, Tyne and Wear, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 - £75,000 per annum
Asset Bundles for deployment and environment management • Support CI/CD and infrastructure automation across the data platform • Ensure data quality, governance, lineage, and observability best practices • Optimise cluster performance, orchestration, and cost efficiency • Collaborate with architects, analysts, and wider engineering teams • Contribute to platform standards, reusable frameworks, and engineering … ownership and autonomy Nice to Have • CI/CD and Infrastructure as Code experience • Experience with governance and metadata tooling • Exposure to data observability and monitoring frameworks • Previous experience working in fast-paced agile delivery environments This is an excellent opportunity to join a modern engineering environment where ...

Site Reliability Engineers

Hiring Organisation
F5 consultants
Location
Reading, Berkshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£70,000
support, shared ownership, and continuous improvement. You'll work hands-on in a modern cloud-native environment leveraging Kubernetes, OpenShift, GitOps, service mesh, and observability tooling. There is genuine investment in your development through training, certifications, and the expertise of those around you. You'll also be part … Ability to work within complex multi-cloud or hybrid environments with a solid foundation in distributed systems Expertise in observability tooling such as Prometheus, Grafana, Loki, and Tempo Proficiency in IaC tools such as Kustomize and Helm, with scripting skills in Bash/Python Experience managing GitOps pipelines using Tekton ...

Senior Solutions Consultant - Open Data Platform (ODP)

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Consultant to join our SNS team. As a Solutions Consultant, you will work with customers on short- to medium-term engagements to implement data observability needs using the Open Data Platform (ODP). In this role, you will leverage your expertise in Open-Source Foundation, big data systems, Kubernetes … migrations and deployments. Consult on design and architecture; implement strategic customer projects that lead to customers' successful understanding, evaluation, and adoption of Acceldata Data Observability Cloud. Good understanding of Data management concepts like Data Quality, Data Catalog and Data Governance. Hands-on experience with two or more common Cloud ecosystems ...

Senior Software Engineer - Electronic Trading Shared Services

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
trading workflows and products Design common frameworks and APIs that unify data exchange across applications and services Drive initiatives that enhance scalability, resilience, and observability across the platform Partner with engineering and product teams across asset classes to deliver shared solutions that power new trading capabilities Gain a deeper understanding … Linux environments Experience with streaming or messaging technologies, e.g., Kafka Knowledge of service‐oriented or microservices architectures Interest in performance optimization, reliability engineering, and observability Curiosity about financial markets and how technology drives trade automation and transparency If indicated, please note that years of experience are a guide; we will ...

Principal Full Stack Engineer & Architecture Lead

Hiring Organisation
Command Recruitment
Location
London, United Kingdom
Employment Type
Permanent
Salary
£80,000 - £90,000 per annum
technical design decisions Define scalable, secure, and maintainable engineering standards Provide technical leadership across frontend, backend, APIs, infrastructure, and integrations Drive platform scalability, resilience, observability, and performance Partner with leadership teams to align technical strategy with business goals Act as the senior technical authority for complex engineering decisions Hands … Gateway, EventBridge, SQS, Step Functions, S3, CloudWatch, RDS) Backend Node.js, TypeScript Frontend React, Next.js, Tailwind CSS Data & Architecture PostgreSQL, Serverless, Event-Driven Microservices DevOps & Observability Terraform/AWS CDK, CI/CD, Monitoring & Logging About You We are looking for a technically strong and commercially minded engineering leader with: 10+ ...

Senior Software Engineer - DevOps

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Implementing infrastructure as code and improving automation across environments Troubleshooting and resolving complex build, deployment and production issues across application and infrastructure layers Improving observability, reliability and performance of internal platforms and production systems Partnering with engineering teams to define best practices for deployment, release management and cloud architecture Contributing … principles Experience working with GitHub, including workflow automation and repository management Experience with infrastructure as code and automated environment management Strong understanding of reliability, observability and operational best practices Ability to debug complex systems and work effectively across multiple engineering teams Why Deliveroo Our mission is to transform ...

AI Engineer

Hiring Organisation
VIA MATCH LIMITED
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £110,000 per annum
Doing Designing and building production-grade AI systems that integrate LLMs, RAG pipelines, vector databases, and agentic frameworks Creating evaluation and observability frameworks to measure, monitor, and continuously improve system performance, accuracy, and reliability Implementing and maintaining retrieval systems, including ingestion pipelines, chunking strategies, and advanced techniques such as HyDE … with hands-on fine-tuning experience Familiarity with real-time streaming, multimodal models, or search technologies such as Elasticsearch Experience with model observability tools such as LangSmith or Weights & Biases Background in a regulated or specialised vertical (financial services, healthcare, energy, legal, retail), with an understanding of compliance, security ...

Platform Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
England, United Kingdom
provision of tooling for our support organisation Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement. Maintain and enhance Radiant’s observability stack: Prometheus, Grafana, and custom monitoring integrations Operate and support services in 24x7 production environments, including on-call rotation Contribute to Incident postmortem analyses, root cause … DHCP, VLANs, routing, switching Strong experience with API interrogation Strong experience with infrastructure scripting and automation (Bash, Python, Ansible) Deep understanding of observability principles and tools (Prometheus, Grafana preferred) Strong grasp of ITSM and service operation best practices Excellent communication and mentorship skills Comfortable interfacing with internal stakeholders and external ...

Principal Platform Engineer (Edge)

Hiring Organisation
Jobleads-UK
Location
Bristol, England, United Kingdom
peer coordination, and systems that can operate independently during disconnection. You are the sort of engineer who thinks carefully about failure modes, deployment risk, observability, workload lifecycle, service discovery, and operational simplicity. You care about building robust abstractions that allow application teams to securely deploy workloads without needing to understand … limited connectivity. Working on security baselines for edge nodes, including secure boot, hardware‐rooted identity, attestation, and the runtime isolation of workloads. Building observability, logging, and telemetry capabilities that work when bandwidth is scarce and devices are intermittently reachable. Designing zero‐touch onboarding and provisioning flows so devices come online ...

Platform engineer

Hiring Organisation
Beat My Salary
Location
Reading, Berkshire, United Kingdom
Employment Type
Permanent
Location: Reading. No visa sponsorship. Eligibility: ILR/Citizen/Dependent/Settled. Domain: Telecom. Job summary: Worked in large-scale, mission-critical environments in the Telecom domain. Implement service mesh architectures using Istio for traffic ...

Senior SRE – Electronic Trading Observability Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
global financial services firm in Greater London is seeking a Senior Software Engineer/SRE to focus on ensuring observability and resilience for its Electronic Trading systems. You will drive reliability initiatives, develop frameworks for tracking metrics, and collaborate on system health reports. The ideal candidate has a strong background ...

Site Reliability Engineer II: Infra, CI/CD & Observability

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
infrastructure platform. Located in London, you will design and develop Terraform modules, build CI/CD pipelines with GitHub Actions, and improve automation and observability for our systems. The ideal candidate possesses strong experience with Infrastructure as Code, especially Terraform, and enjoys resolving complex infrastructure challenges across cloud environments. ...

Platform Engineering Lead — AWS, Kubernetes, Observability

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
opportunities for professional growth, including leadership training. Ideal candidates should have key technical skills in AWS, Terraform, and Kubernetes, along with a passion for observability and Linux. An exciting chance for those looking to make an impact within a collaborative team environment. ...

Field CTO EMEA

Hiring Organisation
Jobleads-UK
Location
Maidenhead, England, United Kingdom
Engineering, platform teams, and business stakeholders. Translate customer business goals into compelling transformation strategies powered by Dynatrace. Lead high-impact technical discovery and executive conversations around observability, cloud modernization, AI adoption, security, automation, and business outcomes. Shape account strategy with Sales and Solution Engineering teams for complex, multi-stakeholder deals. Develop board-level … executive-level narratives that connect platform capabilities to risk reduction, operational excellence, digital experience, and growth. Guide customers on modern observability and security operating models, including platform engineering, SRE, DevSecOps, and AI-assisted operations. Support large opportunities by validating architecture direction, differentiation, value realization, and long-term platform vision. Influence go-to-market ...

Principal Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
across critical services Crafting and implementing a cloud infrastructure and tooling strategy Work across our Org to level up SRE practices Help implement robust observability metrics, logs & traces using our observability tool Guide the team in building automated, self-healing systems Own and evolve our incident response processes, including …/NLB, IAM, CloudWatch, etc.) Expert in Infrastructure as Code using tools such as Terraform, with knowledge of GitOps workflows Strong background in observability: metrics, visualization, logging, and tracing Understanding of automation, SDLC, CI/CD pipelines, deployment automation, and blue/green or canary releases Proven experience with incident ...

Principal AI Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Agentic ecosystem, responsible for the high‐level design choices that define how agents run at PhysicsX. You will cover topics such as: Agent Observability: Own the implementation to enforce deep tracing, granular cost tracking, and observability across the lifecycle. Agent Deployment: Deliver an intuitive deployment lifecycle which simplifies questions around … behalf of users in a regulated enterprise environment. The Tech Stack Core Platform: Python (Primary), Go or TypeScript (Secondary), Kubernetes, Docker, Terraform. Observability & Evals: OTel, LangSmith, Arize, Braintrust. Who You Are An Architect at Heart: You have strong, reasoned opinions on Durable Execution vs. Standard Async, Vector Search vs. Keyword ...

Agentic AI Data Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ModelOps - Azure AI Foundry (model hosting, versioning, monitoring); Evaluation frameworks (LLM-as-judge, test datasets); Prompt/version control, cost/latency monitoring DevOps & Observability - CI/CD pipelines (Azure DevOps/GitHub Actions); Logging, monitoring, observability (App Insights, etc.); Performance tuning and scalability As part of a leading global ...

Data Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You’ll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands‐on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

Senior Software Engineer / SRE - Electronic Trading

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Senior Software Engineer/SRE - Electronic Trading Location London Business Area Engineering and CTO Ref # 10050148 Description & Requirements About Observability Engineering Senior Software Engineers - SRE in Electronic Trading (ET) ensure our global enterprise products spanning fixed income, equities, and derivatives are resilient and observable. This role focuses on building … culture and platforms of observability and resilience to prevent market disruptions for global traders. We specialize in proactive anomaly detection, providing advanced performance insights and best practice guidance. Our team collaborates with application developers to define meaningful SLOs, implement chaos engineering, and build diagnostic tools that mitigate architectural risks ...

Senior AI Product Engineer

Hiring Organisation
Jobleads-UK
Location
York and North Yorkshire, England, United Kingdom
awareness Use tools such as DSPy (or similar) for optimisation and evaluation Deploy and operate services using Azure (OpenAI, Web Apps/Functions) Implement observability (Application Insights) and CI/CD (Azure DevOps) Contribute to infrastructure via Terraform Build high‐quality, async Python services with strong testing (pytest) Collaborate with … similar) Strong API and data modelling skills Experience with async Python Experience with Azure environments and CI/CD pipelines Familiarity with Terraform and observability tooling Minimum Qualifications Degree or equivalent Right to work in the country of employment Integrity and Ethics All StarCompliance employees are expected to commit ...

Platform Principal Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
self-service capabilities. Upskill and Mentor: Transition the in-house engineering team into a high-performing internal platform team throughout the platform build process. Observability: Design and implement enterprise-grade logging, metrics, and tracing for Kubernetes at scale. IaC Leadership: Implement and manage Infrastructure as Code to a senior standard … Terraform/OpenTofu module design. (MUST) Kubernetes Engineering: GitOps (Argo CD/Flux), secrets management, ingress/mesh, and OPA/Gatekeeper. (MUST) Observability: OpenTelemetry (MUST) Tooling: Spacelift, Atlantis, or Terraform Cloud (Desired) Governance: EPAC (Enterprise Policy as Code) (Desired) What You'll Bring To Us Recent, hands ...

Senior Software Engineer

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
flexibility, simplicity and delivery speed Build and maintain backend services and integrations that support our insurance journeys Work with infrastructure, CI/CD and observability to help the team ship safely and often Partner with product, design and data to turn ambiguous opportunities into concrete, measurable improvements Raise the technical … similar Testing: integration and end-to-end testing, component story testing, and visual regression testing CI/CD: Automated testing and deployment pipelines Observability: Analytics platforms, error monitoring and performance tracking Cloudflare experience, including Workers, CDN or load balancing Builder.io or other visual/content tooling experience ...

Principal Machine Learning Infrastructure Engineer London, United Kingdom

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
training pipelines for throughput, fault tolerance, and cost efficiency, including checkpointing strategies, gradient accumulation, and multi-node synchronization. Build and maintain experiment tracking and observability systems that give researchers clear visibility into training runs, hyperparameter sweeps, and model performance. Data I/O and Performance Solve data loading bottlenecks … workflows generate and consume data Experience building model serving infrastructure with latency and throughput requirements Familiarity with experiment tracking tools (Weights & Biases, MLflow) and observability stacks (Prometheus, Grafana) What we offer Equity options – share in our success and growth. 10% employer pension contribution – invest in your future. Free office lunches ...