support high-performance computing workloads and scalable services. Collaborate with R&D teams to provision and manage compute environments for model training and experimentation. Maintain and monitor systems, implement observability solutions (e.g., logging, metrics, tracing), and proactively resolve infrastructure issues. Manage CI/CD pipelines for rapid, reliable deployment of services and models. Ensure high availability, disaster recovery, and robust
reliability engineering teams (sourced from internal associates and preferred third-party vendors) in applying Site Reliability Engineering principles to in-house developed applications. Optimise and reduce operational overheads through observability and service automation. Identify growth opportunities for your manager-level reports to help them achieve their technical, business and personal goals. Work closely with peer senior manager people leader(s … Technical leadership coupled with a passion for software engineering and operational processes. Strong background in software/system engineering and architecture within the cloud. Strong background in, and appreciation of, observability principles, techniques and toolsets. Demonstrable knowledge of the software development lifecycle within a cloud-based environment. Demonstrable knowledge of developing and managing RESTful API services written within a modern OO
cloud services. Define and enforce best practices for Infrastructure as Code (IaC), CI/CD, monitoring, and cost management. Support and lead platform-wide initiatives to reduce toil, improve observability, and increase deployment velocity. Guide and own architectural decisions across services and systems. Build and mentor a high-performing team, fostering a culture of collaboration and continuous improvement. Partner with
Join Barclays as an Engineering Manager for Operational Support Systems and Tools, where you'll play a key role in supporting the growth and evolution of our OSS capabilities. As the company establishes six new functions to strengthen its operational
and on-premises infrastructure with security and reliability in mind. Write high-quality internal tools and automation scripts using Python. Collaborate with engineering and security teams to ensure compliance, observability, and incident readiness. Contribute to infrastructure as code and CI/CD pipeline improvements. Monitor system health and performance, proposing and implementing enhancements. Requirements: 5+ years of experience in platform
Northampton, Northamptonshire, UK Hybrid / WFH Options
Signify Technology
and on-premises infrastructure with security and reliability in mind. Write high-quality internal tools and automation scripts using Python. Collaborate with engineering and security teams to ensure compliance, observability, and incident readiness. Contribute to infrastructure as code and CI/CD pipeline improvements. Monitor system health and performance, proposing and implementing enhancements. Requirements: 5+ years of experience in platform
level backend and microservices in Python and TypeScript. Work with product and engineering teams to shape models, services, and system behaviour. Contribute to system architecture and infrastructure for scale, observability, and performance. Explore and implement LLMs, prompt engineering, and AI orchestration frameworks. Take ownership of features end-to-end, from design to deployment and monitoring. What They’re Looking For
Nottingham, England, United Kingdom Hybrid / WFH Options
Digital Waffle
level backend and microservices in Python and TypeScript. Work with product and engineering teams to shape models, services, and system behaviour. Contribute to system architecture and infrastructure for scale, observability, and performance. Explore and implement LLMs, prompt engineering, and AI orchestration frameworks. Take ownership of features end-to-end, from design to deployment and monitoring. What They’re Looking For
Other highly valued skills may include: a strong understanding of modern infrastructure architecture (containerization, virtualization, public cloud) and Site Reliability Engineering practices, including metrics and observability tools; experience working in a finance, banking, or fintech company with an internal customer base; Certified Product Owner. You
availability and performance across stores, corporate offices, supply chains, and data centres. You will proactively mitigate risks by reviewing analytics on network metrics using Meraki Dashboards and other network observability toolsets. Additionally, you will oversee the service performance of global network managed services, advocating for regional service teams on continuous improvement initiatives. You will manage risks, issues, and escalations, monitor