to integrate. Create a culture of sharing and collaboration through the portal. Technical Integration: Partner with engineering teams to integrate the portal with CI/CD pipelines, cloud resources, observability, and other key developer tools to make reuse seamless within the development lifecycle. Product Backlog Management: Translate the product vision into actionable stories and epics, maintaining a well-prioritized backlog
Code principles Design an agile release engineering strategy that delivers value incrementally and continuously Support a highly-available live production system, respond to alerts, diagnose problems using logs and observability tooling, triage and resolve incidents What we offer We make sure our team is well looked after with generous salaries and a great benefits package which includes: Enhanced pension with
scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Define service level objectives (SLOs) and key performance indicators (KPIs
Lead SRE/Observability Engineering Lead - (Outside IR35 Contract/Remote) Location: Bristol/London HQ – Largely Remote (Occasional Travel) Day Rate: Outside IR35 – £650 to £750 p/d Duration: 3-6 Months Initial – with intention to extend Payment Terms: Monthly Our client is a FTSE100 Wealth/Asset Management firm seeking to engage a Lead SRE Engineer (Observability … SME) to support the implementation and instrumentation of their new Observability solution. This role will be critical in delivering against our Digital OKRs by embedding observability best practices, frameworks, and tooling across digital platforms and engineering teams. Key Responsibilities: Strategy & Roadmap: Define and drive the observability roadmap in alignment with business priorities and digital platform objectives. Champion observability-by-design … manage SLIs, SLOs, and error budgets to track and improve system reliability. Support capacity and availability planning through real-time telemetry and predictive analytics. Instrumentation & Runbooks: Design and implement observability runbooks covering metrics, logs, traces, synthetics, and customer journey monitoring. Set standards for instrumentation, dashboards, alerting, and enable teams to self-serve their system metrics and traces. Implementation & Enablement: Assist
platform infrastructure, with opportunities to contribute to backend development when needed. Our systems are already live and in active use, so your focus will be on reliability, performance, automation, observability, and clean, maintainable Infrastructure-as-Code. You’ll also support Python and Node.js backend services. This role is centred on operational excellence, continuous improvement, and technical ownership. The position is … provider (GCP preferred; AWS or Azure acceptable) Experience working with relational databases in production environments (e.g., Postgres, MySQL), including basic performance troubleshooting, migrations, backups, and access control. Familiarity with observability tools such as Prometheus, Grafana, ELK stack, or OpenTelemetry Experience with container orchestration platforms, particularly Kubernetes Ability to systematically troubleshoot and debug distributed systems Comfortable reading, modifying, and writing code
/CD pipelines. Familiarity with cloud-native tooling: AWS (especially CloudWatch) Artifact Management (e.g., Artifactory, CodeArtifact) Infrastructure as Code with Terraform Monitor test metrics, troubleshoot failures, and improve system observability and debuggability.
lead the next phase of platform maturity. This is your opportunity to: 🔧 Build and scale high-performance, secure, cloud-native infrastructure 📦 Lead platform architecture, CI/CD, DevOps tooling, observability, and security 🌍 Impact product lines used across 9+ international markets 💥 Own platform reliability, developer experience, and operational excellence 🧠 Drive a forward-thinking engineering culture focused on velocity and resilience This
Leicester, Leicestershire, England, United Kingdom
Uniting Ambition
organisational standards. Support environment troubleshooting, incident resolution, and root cause analysis. Drive continuous improvement and advocate best practices across infrastructure automation and cloud governance. Implement and maintain monitoring and observability solutions using Dynatrace. Essential Experience Proven experience in a DevOps or Cloud Engineer role within an enterprise Azure environment. Strong hands-on experience with: Azure (IaaS, PaaS, networking, identity) Terraform
skills, with a mindset geared towards enabling internal engineering teams Platform Engineer nice to have Exposure to AWS and use of the Cloud Development Kit (CDK) Previous experience maintaining observability stacks (e.g. Datadog) Background in applying security-first approaches to cloud architecture Aimtech Recruitment is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment
City of London, London, United Kingdom Hybrid/Remote Options
Retelligence
data architectures in Google Cloud Platform (GCP) – Partner with analysts, product owners, and developers to ensure reliable data delivery – Champion best practices for governance, lineage, and performance – Enhance data observability and automate key workflows for efficiency You Will Need – Advanced hands-on experience with SQL, Python, and core GCP tools (BigQuery, Dataflow, Pub/Sub, Composer or similar) – Proven background
experience with LLMs/GenAI/ML in production Strong background in C#, .NET, REST APIs, and cloud platforms (Azure, AWS, or GCP) Agile mindset with focus on testing, observability, and secure delivery Excellent communication and cross-functional collaboration skills Nice to have Experience with vector databases, RAG systems, or multi-agent AI Python skills for AI/ML development
City of London, London, United Kingdom Hybrid/Remote Options
Arrows
Kafka CI/CD pipelines with fully automated deployments and testing TDD, BDD, and modern coding standards Microservices architecture – understanding both its power and its trade-offs Scaling, reliability, observability, and all things non-functional 🧠 We’re Looking For Someone Who: Has worked on OTT (Over-the-top) technologies Knows how to implement lean/agile practices like Scrum, Kanban
City of London, London, United Kingdom Hybrid/Remote Options
Oliver Bernard
scientists, bioinformaticians, and product teams to translate scientific needs into resilient software solutions Optimize data processing pipelines for performance, reliability, and compliance Drive adoption of best practices for security, observability, and data governance Mentor engineers and contribute to architectural decisions across teams What We’re Looking For 7+ years of software engineering experience, including extensive experience with Golang Deep understanding
Crawley, West Sussex, South East, United Kingdom Hybrid/Remote Options
Henderson Scott
to for guidance. Your day-to-day will include: Designing and delivering Linux-based infrastructures for enterprise clients. Driving automation with Terraform, Ansible, and Git. Enhancing monitoring/observability with Zabbix and other tools. Acting as a consultant and advisor - translating client needs into technical reality. Mentoring junior engineers and fostering a collaborative learning culture. Collaborating across cloud (AWS
Basingstoke, Hampshire, United Kingdom Hybrid/Remote Options
Spectrum IT Recruitment
Hands-on AWS experience (certified to Associate or Professional level). Knowledge of multi-account AWS environments and migration best practice. Desirable Azure exposure and cross-cloud understanding. Strong observability, monitoring, and pipeline modernisation experience. If you're a Cloud Engineer with a passion for DevOps, automation, and secure cloud transformation - this is an excellent opportunity to work on high
enjoy working in a product and client value driven environment. Collaborating closely with Product Managers and business stakeholders Strong understanding of full stack engineering best practices including CICD and Observability Acted as a strong partner to multiple business areas to deliver on key priorities with a one-team mindset Strong people focus - you prioritise your team's L&D, understand
Alto preferred), network access control (802.1x, RADIUS), or zero-trust security concepts. Exposure to infrastructure-as-code (Terraform, Ansible) and version control systems (Git). Experience with monitoring and observability tools (LogicMonitor, Grafana, Prometheus). Knowledge of hybrid cloud networking, including AWS Direct Connect or GCP Interconnect. Relevant certifications such as CCNP, AWS Advanced Networking Specialty, or Google Cloud Network
West Midlands (County), Birmingham, United Kingdom Hybrid/Remote Options
Sherborne Talent Solutions
automation, and optimisation of CI/CD pipelines to drive speed, reliability, and consistency. Manage and optimise Azure infrastructure for scalability, security, performance, and cost control. Champion modern monitoring, observability, and incident management practices to maintain high availability. Partner with engineering, architecture, and product leadership to accelerate delivery and reduce operational friction. Drive adoption of FinOps principles to balance technical