726 to 750 of 1,233 Permanent Observability Jobs

Site Reliability Engineer

Hiring Organisation
Pertemps London
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 50,000 Annual
Jenkins, GitLab CI) Develop and maintain Terraform modules for infrastructure-as-code Build automation tools (CLI tools, scripts, GitHub Apps, self-service tooling) Own observability: dashboards, alerts, monitoring, and runbooks Continuously improve platform processes and reduce operational toil What We're Looking For Essential Skills & Experience 2-3 years … GitHub Actions, GitLab CI, Jenkins) Ability to write production-quality code in Python or Bash Solid networking fundamentals (DNS, load balancers, CDNs) Experience with observability tools (NewRelic, Datadog, Prometheus, Grafana) Comfortable participating in on-call rotations Experience using AI tools (e.g. ChatGPT, Copilot, Cursor) to enhance productivity Desirable Go, Ansible ...

AKS DevOps Engineer - Azure Kubernetes

Hiring Organisation
Reed
Location
London Gatwick Airport, Gatwick, West Sussex, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 per annum, inc. benefits
/CD pipelines using Azure DevOps with YAML. Implement and maintain secure networking patterns and apply cloud security best practices. Create and maintain platform observability using Azure Monitor, Analytics, and Application Insights. Collaborate with engineering teams to ensure service reliability on the platform. Promote best practice in cloud engineering … private endpoints, load balancing, etc. Scripting proficiency in Bash, PowerShell, or Python. Linux operating system knowledge and troubleshooting capability. Experience implementing monitoring, logging, and observability solutions in Azure. Ability to communicate platform issues such as risk, platform health, and cost to non-technical audiences. Desirable Skills: Experience contributing to architecture ...

Principal Artificial Intelligence (AI) Platform Engineer/Architect

Hiring Organisation
WTW
Location
Greater London, United Kingdom
Employment Type
Full Time
engagement—building credibility and driving adoption across the organization Provide escalation pathways for architecture questions and unblock teams on complex integration challenges Implement monitoring, observability, and governance systems that provide transparency without creating bottlenecks Collaborate with security, compliance, and data teams to embed safety guardrails into platform capabilities Participate … experience) Proven ability to design systems that abstract complexity and enable teams to self-serve at scale Strong software engineering fundamentals (system design, testing, observability, operational excellence, SDLC practices) Experience building or maintaining developer-facing platforms, SDKs, or internal tools Comfortable articulating technical architecture, vision, and strategy to both technical ...

Platform Engineer

Hiring Organisation
Accenture
Location
Glasgow, Scotland, United Kingdom
code generation, testing, documentation, and analysis, while understanding model limitations, protecting client data, and improving delivery quality and speed through pragmatic automation SRE & Observability You’ll bring a reliability mindset to delivery, designing services that are operable by default and measured through meaningful SLIs/SLOs. You’ll help teams … implement pragmatic observability—logging, metrics, and distributed tracing—with actionable alerting, and you’ll contribute to (or lead) incident response and post-incident reviews that drive learning and measurable improvements. We are looking for experience in the following skills: Strong experience with the AWS cloud platform and core services. Hands ...

ML Infrastructure Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
versioning, reproducibility, experimentation, feature management and release management Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency Build or optimise … pipelines, infrastructure-as-code and workflow orchestration Experience with tools such as Airflow or similar platform and orchestration technologies Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving Strong appreciation of reliability ...

Senior Software Engineer II - Data Engineering

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ensure technical consistency. Design, develop, and maintain generative AI services and reusable components using Python. Define and promote best practices in engineering, including scalability, observability, testing, and CI/CD. Contribute to system designs spanning multiple services and modules, aligning with architectural best practices. Collaborate with product, platform, and research … work collaboratively across functions in an Agile or Kanban environment. Nice to have: Experience operationalizing LLMs or building an internal AI platform. Familiarity with observability practices (metrics, logging, alerts). Exposure to knowledge graphs or semantic search systems. Join our team and contribute to a culture of innovation, collaboration, and excellence. ...

Lead DevOps Engineer

Hiring Organisation
Vaco LLC
Location
Dublin, Ohio, United States
Employment Type
Permanent
Salary
USD Annual
secure, and cost-effective infrastructure across production, development, and test environments. This is a deeply hands-on position responsible for executing and improving deployments, observability, and core operational practices to reduce risk caused by opaque processes, undocumented knowledge, and single points of failure. The Lead DevOps Engineer transforms deployment … application architecture, infrastructure, and deployment workflows. Proven ability to troubleshoot complex issues across infrastructure, CI/CD pipelines, and runtime environments. Solid understanding of observability, including metrics, logging, alerting, and root-cause analysis. Strong security mindset, including secrets management, access controls, encryption, patching, and vulnerability management. Deep understanding of network ...

Site Reliability Engineer

Hiring Organisation
Fuel Recruitment
Location
Farnborough, Hampshire, United Kingdom
Employment Type
Permanent
Salary
GBP 60,000 Annual
Site Reliability Engineer to help design, deploy and optimise secure, resilient platforms across internal and customer environments. The role is focused on automation, observability and taking new solutions from proof-of-concept through to full p ...

Database Reliability Engineer | Postgres, Kubernetes & Cloud

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
leading UK fintech company seeks data professionals to modernize its database systems. Roles involve enhancing PostgreSQL and Kubernetes setups, establishing observability through monitoring tools, and ensuring data integrity across multi-cloud frameworks. Working hybrid, the ideal candidates are innovative and collaborative, with strong backgrounds in backend development and infrastructure provisioning. ...

Site Reliability Engineer

Hiring Organisation
WTW
Location
Cambridgeshire, United Kingdom
Employment Type
Full Time
their technology. This role will have the opportunity to help the team and product deal with exciting, complex and large-scale client propositions where observability will be essential and help transform how the product is designed and deployed. You will join a cross-team guild of Site Reliability Engineers, which … enables you to not only influence direction within your product family, but to also help shape how we handle observability and monitoring across ICT. This role is open to flexible and hybrid working arrangements, with presence in the Cambridge office a minimum of two days per week. The Role Collaborate ...

Data Platform Solution Architect

Hiring Organisation
Jobleads-UK
Location
Basildon, England, United Kingdom
Design Documents (ADDs). Deep understanding of cloud-native design patterns. Experience in performance tuning across Snowflake, Airflow, and Iceberg. Focus on platform reliability, scalability, and observability. Experience designing and operating data platforms in production environments ...

Back End Developer

Hiring Organisation
Insight Global
Location
City of London, London, United Kingdom
development within an enterprise level organization Extensive experience coding and deploying features within AWS serverless environment Experience working with AWS Services Lambda, S3, DynamoDB Observability tools such as Datadog or NewRelic ...

Site Reliability Engineer - BACLJP00013172

Hiring Organisation
Jobleads-UK
Location
Bromley, England, United Kingdom
Lead role within a banking/payments environment that I thought might be of interest. You'd lead SRE strategy, driving automation, observability, and reliability by design, with a focus on reducing incidents and improving recovery. Looking for someone with 8+ years' experience in SRE, strong resilience engineering background ...

Head of Network Services

Hiring Organisation
G.R.E. Recruitment Limited
Location
Cirencester, Gloucestershire, South West, United Kingdom
Employment Type
Permanent
Salary
£75,000
WLAN, and SD-WAN; Network security and segmentation; OT/BMS/industrial networking environments; Routing, switching, and IP design; Monitoring, logging, and observability; High availability and resilience design. The company have been established for 17 years and have over 100 employees; they have added 25 new employees ...

Senior Network Architect, GPU Fabric and AI Infrastructure

Hiring Organisation
We Love Alfa
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP 180,000 - 240,000 Annual
directly impact customer training workloads. This person will own network architecture across GPU fabric, InfiniBand, RoCE v2, Ethernet leaf spine, edge connectivity, peering, observability, deployment standards and operational handover. We are looking for someone who has: Deep GPU cluster or HPC deployment experience Strong InfiniBand production experience RoCE v2 experience ...

Business Development Representative

Hiring Organisation
Apica
Location
United Kingdom
Apica is a leading provider of innovative software solutions in the Observability space, designed to revolutionize how businesses gain insights into their systems and applications. We are dedicated to delivering cutting-edge products that streamline processes and enhance user experiences. Our mission is to empower organizations to thrive ...

BDR Language Speaker

Hiring Organisation
Pareto
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£30,000 - £35,000 per annum
must speak Filipino fluently to qualify for this role. Our client is a global data platform that helps turn data into action for Observability, IT, Security and more. Leaders in their field, our client is growing at an exciting rate and as such is now looking for new bilingual ...

Senior Frontend Developer

Hiring Organisation
SEEKR
Location
London Area, United Kingdom
bridges so builders can wire their products into hundreds of third‐party tools without hand‐rolling every integration. It handles managed auth, real‐time observability and connector sprawl so product teams can focus on great agent experiences instead of glue code. Your job is to make the surface they ...

Frontend Engineer

Hiring Organisation
Loft Orbital Solutions
Location
Golden, Colorado, United States
Employment Type
Permanent
Salary
USD Annual
Frontend Engineering team, as well as interface with the wider Software team. About this role: Building and refining tools for satellite operations and autopilot observability Developing and maintaining our web applications and services Collaborating on UI/UX design to enhance user experience (no need to be a designer, just ...

Frontend Engineer

Hiring Organisation
Loft Orbital Solutions
Location
San Francisco, California, United States
Employment Type
Permanent
Salary
USD Annual
Frontend Engineering team, as well as interface with the wider Software team. About this role: Building and refining tools for satellite operations and autopilot observability Developing and maintaining our web applications and services Collaborating on UI/UX design to enhance user experience (no need to be a designer, just ...

Principal Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
cross‐team delivery of strategic outcomes; fewer blockers; consistent adherence to standards Developer experience and reliability Partner with Platform to uplift CI/CD, observability, SLOs and incident learning; advocate for fitness functions and paved roads Improved flow and stability metrics; faster, safer releases; measurable DX improvements People leadership Mentor ...

Python Engineer - up to £60,000 + Bonus - Hybrid

Hiring Organisation
Involved Solutions
Location
Ireland
Employment Type
Full-Time
Salary
£50,000 - £60,000 per annum
driving engineering best practice across the software delivery lifecycle. The Python Engineer role is suited to an engineer who enjoys clean coding principles, automation, observability and modern DevOps practices. Responsibilities for the Python Engineer: Design, develop, test and maintain backend services and microservices Build and enhance RESTful APIs aligned … ensure code quality and reliability Containerise applications using Docker and support CI/CD deployment pipelines Implement logging, monitoring and metrics to improve platform observability Collaborate with QA, DevOps and architecture teams across delivery initiatives Troubleshoot and resolve production and application issues Contribute towards continuous improvement of engineering standards ...

Senior DataOps Engineer – Databricks Platform

Hiring Organisation
Unisys
Location
City of London, London, United Kingdom
promotion strategies across development, testing, and production Reduce manual deployment activities and operational risk Integrate source control and modern software engineering practices Platform Reliability & Observability Develop monitoring, alerting, and operational dashboards Improve platform resilience, stability, and recoverability Design solutions for failure handling, rollback, and operational recovery Support platform performance optimisation … infrastructure automation Proven experience building CI/CD frameworks for complex cloud platforms Strong Python skills for automation and tooling Experience implementing monitoring, observability, and operational support capabilities Solid understanding of cloud security, access control, and governance principles Strong software engineering fundamentals and automation mindset Nice-to-have: Enterprise-scale ...

Site Reliability Engineer

Hiring Organisation
Oliver Bernard
Location
United Kingdom
hire a mid-level Site Reliability Engineer into a newly created role. This is a true SRE position with a strong focus on observability, incident management and production operations, working closely alongside development and platform teams to improve reliability and performance across a high-scale cloud environment. For this opportunity … with Terraform (building modules, not just consuming templates) CI/CD work with GitHub and/or GitLab Strong history of Monitoring and Observability (with Prometheus and Datadog) Solid understanding of incident management and response Experience operating within high-scale production environments The business is heavily investing ...

Principal Site Reliability Engineering Expert Director

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
shaping how reliability, automation, and operational excellence are engineered across the organisation. Operating across domains including traditional infrastructure, cloud engineering, network operations, identity, observability, security, AI-driven operations, and automated data workflows, the role focuses on designing scalable systems, reusable engineering patterns, and standardised controls that reduce operational toil, improve … first, measurable, and repeatable practices. A key part of the role is building and evolving reusable CI/CD and Terraform modules, engineering guardrails, observability patterns, and automation frameworks that can be adopted across multiple teams and domains without requiring each team to solve the same problems independently. The Principal ...