will involve designing robust software solutions that enhance system performance while ensuring high availability for critical applications. You will work hand-in-hand with product engineering teams to improve observability tools and telemetry systems, driving forward automation initiatives that reduce manual intervention. By participating in incident management processes (facilitating transparent communication with stakeholders and leading blameless post-mortems), you will … a focus on automating these activities wherever possible. * Provide on-call support during production incidents outside standard working hours as required by business needs. * Contribute to enhancing product observability and telemetry by supporting ongoing modernisation efforts within the infrastructure. * Collaborate closely with engineering teams to brainstorm ideas that simplify infrastructure management and streamline SRE practices. What you bring: * Proficiency …
South West London, London, England, United Kingdom
Oscar Technology
experienced Site Reliability Engineer (SRE) to join them on a 6-month contract (outside IR35). You'll be leading efforts across AWS and Azure Cloud environments, focusing on automation, observability, infrastructure as code and performance at scale. Stakeholder engagement and strong communication are essential in this role, so if you've been in a start-up/smaller team, this … scripting (Python, Bash, PowerShell), and cloud architecture. Comfortable with containerisation and orchestration (Docker, Kubernetes). Understanding of networking, DNS, IAM, and load balancing in cloud environments. Hands-on experience with observability tooling and production-level troubleshooting. If this sounds like you, it's a great opportunity, so apply now! Site Reliability Engineer - AWS/Azure | Outside IR35 | £450-500/day
Site Reliability/DevOps Engineer | London - 5 Days Onsite | Up to £550 per day (Umbrella, Inside IR35) | 12-Month Contract | Must hold live and transferable DV Clearance. Are you passionate about reliability, automation, and supporting mission-critical systems? Join this …
Solihull, West Midlands, England, United Kingdom Hybrid / WFH Options
Sanderson
as Code (IaC) using Terraform, Vagrant, and related tools. Build and maintain secure CI/CD pipelines using Jenkins, Groovy scripting, and other automation tools. Enable robust monitoring and observability through Grafana, Prometheus, Alertmanager, and related tools. Apply DevSecOps practices, integrating tools like SonarQube, ClamAV, and MS Defender into delivery pipelines. Essential Skills & Experience: 10+ years of hands-on …
Job Title: Senior SRE - Site Reliability Engineering for Observability | Location: London (Mostly Remote, 1 Day/Week in Office) | Pay Rate: £50 - £62 per hour (Inside IR35) | Contract Duration: Initial 12 Months | Working Hours: 11:00 AM - 7:00 PM. About the Role: We're looking for a Senior Site Reliability Engineer (SRE) to join a high-impact Observability team … monitoring and logging platforms that ensure service reliability, performance, and visibility. If you're passionate about distributed systems, high-throughput data pipelines, and enabling engineering teams with top-tier observability tooling, this is the role for you. What You'll Be Doing: Designing and operating observability platforms (logging, monitoring, alerting) at scale. Managing large, high-performance Elasticsearch clusters and Prometheus … deployments. Building scalable data pipelines using Kafka to process millions of events per second. Developing tools, APIs, and dashboards to enable self-service observability for engineering teams. Automating infrastructure using Terraform and configuration with Ansible. Participating in on-call rotations to ensure platform uptime and responsiveness. What We're Looking For: 5+ years of experience in SRE/DevOps …
this role you will assist in upgrading the Elastic DP estate to Kubernetes, thereby moving away from obsolete technology (Cloudera), uplifting to RHEL 8, contributing towards improving stability and observability of the platform and providing advanced analytics tooling and services for modelling analytics. Working across continuous integration, development, build and deployment using automation & cloud technologies to support the growth of …
AWS services at the DevOps Engineer level. Incident, change & problem management experience. This role is heavily operations-oriented, including on-call requirements. Strong background in setup & operation of enterprise observability tooling, specifically Prometheus, Grafana and Splunk, including usage of PromQL. Proficient in one or more of Python, Go, Bash, SQL. Familiar with GitHub/GitOps/container orchestration/…
and CI/CD pipelines. Experience supporting real-time trading applications and proficiency in scripting and automation (Python, Bash, PowerShell). Knowledge of messaging middleware (e.g., Solace, 29West) and observability platforms (e.g., ITRS Geneos, Prometheus). Excellent communication skills and comfort working with Linux systems and hybrid infrastructure. Benefits: Flexible working options between office and home. Exposure to global production …
Dynatrace Observability Monitoring AWS. Job Description: We are seeking a hands-on Dynatrace expert to join our observability team, tasked with rolling out the platform across the company. You will be instrumental in building a centre of excellence to support adoption across hundreds of teams, working closely with platform teams, application owners, and DevOps engineers to ensure successful adoption of … Dynatrace through best practices, hands-on guidance, and integration with existing monitoring ecosystems. Responsibilities: Provide technical consulting and enablement to internal teams on using Dynatrace effectively. Guide teams in building dashboards, alerts, and service flow mappings aligned to engineering needs. Help teams craft complex DQL queries to extract meaningful insights from telemetry data. Support observability design … on RBAC models and data access strategies based on team structure and security requirements. Assist in monitoring strategy for Kubernetes-based workloads, especially in hybrid environments. Promote adoption of observability-as-code using tools like Terraform and GitLab. Contribute to reusable patterns, documentation, and internal enablement materials for engineering teams. Essential Skills: Hands-on experience with …
London, South East, England, United Kingdom Hybrid / WFH Options
Morgan Hunt Recruitment
Postgres). Implement OCR, NLP, and ML for document analysis and automation risk assessment. Lead R&D spikes and validate system improvements through robust data analysis. Ensure code quality, testing, observability, and non-functional compliance (security, UX, performance). Coach team members and contribute to Agile delivery practices. Essential Skills: Strong commercial experience with Python, TypeScript, spaCy, and AWS (serverless). Background in …
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Lorien
growing to meet our business needs. What you'll lead: Shape and evolve the backend technical architecture to support product scale and complexity. Identify and drive improvements in performance, observability, and infrastructure. Lead the design of domain models aligned with evolving business needs. Be a go-to person for backend excellence, and improve code quality. Engineering-centric requirement definition (user …
in financial services, government, or critical infrastructure environments with high-assurance cryptography. * Familiarity with cryptographic libraries and APIs (OpenSSL, BouncyCastle, Libsodium, PKCS, JCE). * Experience implementing crypto monitoring and certificate observability (e.g., CertSpotter, CRL telemetry, misissuance detection). Preferred Qualifications: * Professional certifications such as CISSP-ISSAP, CCSP, CISM, GCLD, GCPN, or Certified Encryption Specialist (EC-Council). LA International is an HMG-approved …
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
Sanderson
front-end frameworks (ReactJS, TypeScript), hosted on AWS. Core Responsibilities: Provide architectural guidance and ensure alignment with programme-wide technical strategy. Champion DevOps practices and support CI/CD, observability, and cloud-native tooling adoption. Represent engineering within programme-level forums, planning sessions, and governance routines. Ensure delivery is technically robust, scalable, and aligns with evolving business goals. Mentor senior …
Your new company: This is a major global bank with an office in Central London. Your new role: You will be working in a team supporting AWS native databases, supporting other existing products and improving observability. As well as working …
City of London, London, United Kingdom Hybrid / WFH Options
Develop
data, integration layers, and authentication modules. Ensure secure, scalable deployment using Azure cloud-native tools. Build and support systems using PostgreSQL, Java, and Spring Boot. Integrate and monitor using observability tools like Datadog and BigPanda. Collaborate closely with architects, DevOps, and security teams across the full SDLC. Core Skills & Technologies: Strong backend development in Java with Spring Boot. Cloud migration … experience, particularly Azure Lift-and-Shift. Familiarity with cloud infrastructure and deployment pipelines. Exposure to PostgreSQL, authentication/security patterns. Monitoring/observability tooling: Datadog, BigPanda. Apply now to be considered.
and Python code for use in Databricks. Contributing to architectural decisions around pipeline scalability and performance. Supporting the integration of diverse data sources into the platform. Ensuring data quality, observability, and cost-efficiency. KEY SKILLS AND REQUIREMENTS: Strong experience with DBT, Airflow, and Databricks. Advanced SQL and solid Python scripting skills. Solid understanding of modern data engineering best practices. Ability …
specialism in vulnerability management. Self-starter, able to work in technical detail and motivate a diverse group of stakeholders to build sponsorship for significant and impactful change. Desired: Establishing observability platforms. Capabilities adjacent to exposure/vulnerability management (i.e. cyber security asset management, attack surface management, etc.). Pragmatic application of zero-trust philosophies. Cloud-based security (GCP, AWS and …
Data Operations Manager: We are seeking a dynamic and driven Data Operations Manager to lead a team of data engineers. You will oversee the daily operations of our data infrastructure and ensure the accuracy, availability, and security of data across …
and suggest improvements automatically. Work closely with teams across the UK, Europe, and India, including engineers, product managers, and leadership. Tools & Environment: Git, Confluence, Jira. Documentation feedback tools and observability dashboards. Enterprise documentation systems. Exposure to cloud platforms and DevOps environments. Distributed, cross-functional teams. Ideal Candidate: Proven experience with enterprise-level technical documentation systems. Strong understanding of documentation structure …
semantic search, and reasoning workflows. The ideal candidate is proficient in Python, experienced in building multi-step intelligent systems, and comfortable working across UI, APIs, cloud AI platforms, and observability tools. You'll work with cutting-edge frameworks like LangChain, LangGraph, LangFlow, CrewAI, and others to build advanced prototypes that integrate short-term and long-term memory, RAG pipelines, vector …