adoption of infrastructure-as-code and GitOps principles for consistent, automated delivery. Lead design forums and provide architectural governance across multiple projects. Develop cloud roadmaps covering network segmentation, identity, observability, and resilience. Embed security, compliance, and resilience into all architectural designs. Manage cost optimisation, including RI/SP planning and right-sizing. Mentor engineers and architects on AWS best practices ...
London, England, United Kingdom Hybrid/Remote Options
Client Server
crypto offering and split your time between hands-on development with people management (70/30). You'll set the technical direction, mentor engineers and ensure code quality, observability, scalability and security are embedded into high-quality, high-impact releases. You'll be working with a modern, cloud native tech stack using Java, Spring Boot, AWS, Kafka and CI ...
Your Impact: Be the technical lead for backend initiatives powering supply chain and real-time operations. Define engineering strategy and drive architecture decisions. Own and improve backend quality, performance, observability, and uptime. Collaborate directly with users in stores and warehouses to improve tooling that matters. Participate in planning and roadmap discussions as a core team leader. We have interview slots ...
Mansfield, England, United Kingdom Hybrid/Remote Options
develop
written. A collaborative team player who thrives in a hybrid work environment. Nice to Have: Experience working in retail or consumer-facing product teams. Knowledge of monitoring, analytics, or observability tools. Exposure to testing in microservices architectures ...
City of London, London, United Kingdom Hybrid/Remote Options
WeDo
requirements into elegant technical solutions. Participate in architectural discussions and shape engineering best practices. Troubleshoot and resolve production issues across services and systems. Contribute to CI/CD pipelines, observability, and automation initiatives. Interview Process: 1st stage – Introductory chat with the Hiring Manager to explore your experience and the team. 2nd stage – Technical pairing exercise with Staff Engineer. 3rd stage ...
City of London, London, United Kingdom Hybrid/Remote Options
Quantum Technology Solutions Inc
granular permission structures, RBAC, and least-privilege configurations across all resources. · Build and manage infrastructure using Terraform and Azure CLI, enabling consistency, traceability, and automated change control. · Implement strong observability and compliance frameworks (metrics, logging, tracing, and audits) to guarantee visibility, reliability, and adherence to high-regulation standards. · Support and automate development, staging/UAT, and production environments with robust ...
agent systems, integrating them with core enterprise systems like SAP, Salesforce, and the ECOLAB3D™ platform. Define and enforce architectural standards and governance frameworks for the agent lifecycle, data lineage, observability, and interoperability. Technology Evaluation and Selection: Evaluate and select AI platforms, tools, and protocols, such as LangChain, AutoGen, or similar frameworks, ensuring they meet scalability, security, and performance requirements within ...
Eastleigh, Hampshire, United Kingdom Hybrid/Remote Options
TalentTrade Recruitment
Explorer). Open to AWS or GCP candidates with strong cloud integration experience. Must understand cloud deployments, monitoring, and API integration. DevOps: CI/CD pipeline setup and maintenance. Monitoring, observability, automated deployments. Strong engineering hygiene (not a DevOps role, but good DevOps practices required). Security: Secure coding principles. Authentication/authorisation (OAuth2, Entra ID/B2C). Role-based access control ...
Manchester, England, United Kingdom Hybrid/Remote Options
Awaze
environment. Partner with Product to balance innovation with reliability, ensuring our core platforms can scale to support millions of bookings. Champion engineering best practices such as CI/CD, observability, automated testing, and platform reliability. Create an environment where teams can experiment, learn, and deliver value quickly and safely. Play a key role in shaping how we attract, develop, and ...
optimization, anomaly detection, and predictive analytics. Understanding of AI frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn) and their application in network automation and monitoring. Experience with telemetry and observability frameworks (e.g., Prometheus, Grafana) for real-time network monitoring and troubleshooting. Experience: Minimum of 7 years of experience in network engineering, operations, and support. Proven ability to work hands-on ...
Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
ByteHire
or communicating with robotic automation systems and integrating with physical devices. Desktop app development with Electron. CI/CD setup, rollback strategies, and deployment automation. Sentry, NewRelic, or other observability tooling implementation ...
will: Design and evolve the architecture of highly scalable, reliable, and secure distributed systems. Drive technical excellence across the engineering organization by setting standards for code quality, system design, observability, and operational best practices. Collaborate closely with Product, UX, and Application Engineering teams to deliver impactful features while ensuring architectural soundness and scalability. Mentor and guide senior and mid-level ...
Bexhill-On-Sea, East Sussex, South East, United Kingdom Hybrid/Remote Options
Hastings Direct
Job Details. Data Platform Operations: Managing production ETL/ELT pipelines and data orchestration frameworks. Ensuring data platform security, compliance, and access controls. Implementing and governing data quality, lineage, observability, and metadata management. Ensuring the sustainability and scalability of our data infrastructure and that it enables the company to achieve its business objectives. Data Platform Optimization: Ensuring that the team ...
City of London, London, United Kingdom Hybrid/Remote Options
RedTech Recruitment
other internal teams to fully understand client requirements and deliver tailored technical solutions. Design and implement scalable, future-proof architectures for new third-party connectors and integrations. Enhance system observability by improving diagnostics, logging, and tracing to aid technical support teams in resolving issues swiftly. Oversee the ongoing development and management of the public API, covering REST and event streaming ...
you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the ACRA platform for ...
Middlesex, South East England, United Kingdom Hybrid/Remote Options
Sky
announcements, and fast-paced breaking news coverage. What you'll do: Deploy and manage containerised, virtualised, and software-defined workloads. Implement monitoring, logging, and alerting solutions to ensure effective observability across core platforms. Maintain and operate CI/CD pipelines to automate build, test, and deployment processes. Diagnose and resolve complex technical issues in broadcast systems, ensuring minimal downtime and ...