execution of disaster recovery tests & seek to automate these activities where possible. Cover the on-call schedule when Production support is required outside of working hours. Participate in enhancing product observability and telemetry, and support modernization. Brainstorm ideas to simplify and streamline infrastructure by working closely with infrastructure and SRE teams. Required qualifications, capabilities and skills: Knowledge of Python/Unix Shell
often, and embrace hands-on problem-solving; maturing projects as they become foundational parts of the company's infrastructure, whether that means writing resilient, test-driven code, designing for observability, or building systems that can scale and recover gracefully. You’ll have the space to experiment and the responsibility to stabilise when it counts. You’ll work across AWS and … and CI/CD pipelines in a cloud-native environment. -Database Familiarity: Skilled in both SQL and NoSQL (PostgreSQL, DynamoDB, OpenSearch, or equivalents), using ORMs like Django or SQLAlchemy. -Observability & Monitoring: Comfortable using tools like CloudWatch, X-Ray, and structured logging to keep systems running smoothly. -Mindset: Curious, Collaborative, and Proactive - you enjoy solving problems hands-on and aren’t
Fi authentication systems, CRMs and partnered PropTech tools. Continually hone and perfect our homegrown DevOps and CI/CD processes by further developing GitHub Actions pipelines, Terraform definitions and observability integrations. Ensure quality & reliability: establish testing best practices (unit, integration, end-to-end), conduct code reviews and demand high quality standards. Shape and refine our cloud-native platform to optimise
digital trends, challenges, solutions, market dynamics, competition, and peer group activities. Understanding and ability to articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps), and operations (e.g., observability, automated response, SRE etc.), and articulate a path toward a target operating model (people, process, and tools). Required Skills Leadership: Strong leadership skills are essential for guiding teams to
City of London, London, United Kingdom Hybrid / WFH Options
DGH Recruitment
managing cloud infrastructures, with expertise in Infrastructure as Code (IaC), particularly using Terraform, proficiency in designing and implementing CI/CD pipelines, and a deep understanding of monitoring and observability practices. Core responsibilities: - Architect, deploy, and manage Azure-based infrastructure to ensure high availability, scalability, and security. - Develop and maintain Infrastructure as Code (IaC) using Terraform for automated and consistent … Code (IaC) tools, especially Terraform. - Experience in designing and managing CI/CD pipelines using tools such as Azure DevOps, Jenkins, or AWS CodePipeline. - Strong understanding of monitoring and observability tools and practices, including experience with Azure Monitor, SCOM, SolarWinds or similar technologies. Senior Azure Infrastructure Engineer (Azure/Terraform/IaC/CI/CD/AWS) In accordance
City of London, London, United Kingdom Hybrid / WFH Options
Tec Partners
focus on security, resilience, and continuous improvement. Key Responsibilities: Manage and maintain Elastic Cloud Enterprise (ECE) environments, ensuring high availability and performance. Design and deploy scalable Elasticsearch solutions for Observability and Search use cases. Implement robust security, privacy, and compliance controls across Elasticsearch systems. Optimise system configurations and queries to enhance performance and reduce latency. Collaborate with cross-functional teams
Infrastructure Architect | Lead the Future of Cloud & Hybrid IT London - Hybrid 🚀 Ready to architect the future? As an Infrastructure Architect, you’ll be at the forefront of cloud transformation, leading high-impact projects that bridge the gap between cutting-edge
City of London, London, England, United Kingdom Hybrid / WFH Options
QA
is seeking a dedicated DevOps Engineer Apprentice to bolster their NHS project team. In this role, the chosen candidate will be instrumental in enhancing the incident management protocols, advancing observability and monitoring strategies, and refining CI/CD practices within the AWS ecosystem. Responsibilities: Collaborating with cross-functional teams to ensure smooth and reliable incident management using Jira and ServiceNow. Developing … and implement observability and monitoring solutions to ensure high system availability and performance. Contributing to maintaining and improving CI/CD pipelines, ensuring efficient code integration and deployment on AWS. Supporting the design and execution of automated test strategies to enhance the quality and security of cloud-based applications. The successful candidate must have: Experience with AWS cloud services and management tools. Familiarity with
teams to execute effectively. DataOps Enablement and Optimization: Drive the adoption of modern DataOps principles to streamline engineering workflows. Partner with platform teams to establish CI/CD pipelines and observability standards that improve operational efficiency, reliability, and speed across data pipelines. Data Governance and Quality Assurance: Embed governance, security, and data quality practices into engineering workflows. Define guardrails and reference
City of London, London, England, United Kingdom Hybrid / WFH Options
Client Server Ltd
real-time operations, resilience and extensibility. You'll collaborate with engineers across the full stack to integrate backend services with Identity and Access control frameworks such as Keycloak, apply observability best practices and contribute to system architecture and codebase quality through reviews and mentoring. Location/WFH: You can work from home most of the time, meeting up with colleagues … have strong Spring Boot, Kafka and event driven microservices experience with high throughput. You have leadership, mentoring and coaching skills. You have a strong understanding of system-level concerns: observability, availability, resilience. You have experience working within security-conscious, regulated, or mission-critical domains. You are proficient working with CI/CD pipelines, Infrastructure-as-Code principles and containerised environments
City Of Westminster, London, United Kingdom Hybrid / WFH Options
Track24 Limited
ISO and SOC compliance standards while collaborating with the InfoSec team to maintain security best practices. Containerisation & Orchestration: Deploy and manage containerised applications using Docker and other orchestration tools. Observability & Monitoring: Provision and maintain observability platforms such as DataDog, Splunk, or New Relic to gain monitoring and performance insights. Incident Management: Establish and oversee monitoring and incident management processes to
City of London, London, United Kingdom Hybrid / WFH Options
Develop
data, integration layers, and authentication modules Ensure secure, scalable deployment using Azure cloud-native tools Build and support systems using PostgreSQL, Java, and Spring Boot Integrate and monitor using observability tools like Datadog and BigPanda Collaborate closely with architects, DevOps, and security teams across the full SDLC Core Skills & Technologies Strong backend development in Java with Spring Boot Cloud migration … experience, particularly Azure Lift-and-Shift Familiarity with cloud infrastructure and deployment pipelines Exposure to PostgreSQL, authentication/security patterns Monitoring/observability tooling: Datadog, BigPanda Apply now to be considered.
specialism in vulnerability management. Self-starter, able to work in technical detail and motivate a diverse group of stakeholders to build sponsorship for significant and impactful change. Desired: Establishing observability platforms. Capabilities adjacent to exposure/vulnerability management (i.e. cyber security asset management, attack surface management, etc.). Pragmatic application of zero-trust philosophies. Cloud-based security (GCP, AWS and
Data Operations Manager. Duration: 6 months. We are seeking a dynamic and driven Data Operations Manager to lead a team of data engineers. You will oversee the daily operations of our data infrastructure and ensure the accuracy, availability, and security
the stack, but proficiency with Python and Django is necessary, and ideally some exposure to front-end engineering. Frontend solutions are built for both web and mobile platforms. For observability, DataDog is used for monitoring and alerting, and CI/CD pipelines are managed through GitLab to automate testing and deployment workflows. We're looking for someone with a strong product
City of London, London, United Kingdom Hybrid / WFH Options
Uniting Ambition
insight delivery, and robust reporting. Governance & Operations Implement best practices in data governance, quality, privacy, and compliance (e.g. GDPR, ISO 27001). Monitor product usage and platform performance using observability tools and analytics. Apply data driven insights to inform feature development and improve user experience. Skills & Experience Required 5+ years of experience in product management or technical product delivery, with