automation tools. Manage Windows and Linux systems, including patching, monitoring, and performance tuning. Drive DevOps adoption, managing CI/CD pipelines, Docker containers, and security-first deployment pipelines. Implement high-availability systems and disaster recovery for business continuity across time zones and territories. Maintain system observability and monitoring to proactively identify issues and optimize system health. Ensure compliance … Proven ability to collaborate with senior stakeholders, translating business needs into technical strategy. Strong scripting capabilities (e.g., PowerShell, Bash, Python). Excellent troubleshooting and problem-solving skills, especially in high-availability and mission-critical environments. Nice to Have: Experience supporting logistics, warehousing, or multi-region ecommerce platforms. Knowledge of container orchestration (e.g., Kubernetes). Certifications in cloud (AWS
Wymondham, Norfolk, England, United Kingdom Hybrid / WFH Options
DMR Personnel Ltd
of our cloud infrastructure on Microsoft Azure. The ideal candidate will have a deep understanding of Azure cloud services, web infrastructure management, and a proven track record of ensuring high availability, security, and scalability of cloud-based web applications. As a key member of our IT team, you will manage the cloud infrastructure lifecycle while leading a team … protect organizational data and resources. Monitor the health and performance of cloud infrastructure and applications, leveraging Azure monitoring tools and third-party solutions. Identify and resolve issues that affect availability, scalability, and cost. Optimize cloud resource usage and costs, working with finance and IT teams to ensure that cloud spending aligns with the budget. Provide regular cost analysis reports
NR18, Hethersett, Norfolk, United Kingdom Hybrid / WFH Options
DMR Personnel Ltd
of our cloud infrastructure on Microsoft Azure. The ideal candidate will have a deep understanding of Azure cloud services, web infrastructure management, and a proven track record of ensuring high availability, security, and scalability of cloud-based web applications. As a key member of our IT team, you will manage the cloud infrastructure lifecycle while leading a team … protect organizational data and resources. Monitor the health and performance of cloud infrastructure and applications, leveraging Azure monitoring tools and third-party solutions. Identify and resolve issues that affect availability, scalability, and cost. Optimize cloud resource usage and costs, working with finance and IT teams to ensure that cloud spending aligns with the budget. Provide regular cost analysis reports
Employment Type: Permanent
Salary: £50000 - £60000/annum Plus excellent benefits
Responsibilities Azure Cloud Infrastructure: Build, maintain and improve web infrastructure hosted on Microsoft Azure with a focus on performance, security, scalability and cost-effectiveness. Web Hosting & Load Balancing: Support high-availability hosting environments, including web servers, WAFs, load balancers and DNS. Automation & IaC: Use tools such as Terraform, ARM templates or Bicep to manage infrastructure as code. Security
Cambridge, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
Job Description Director of Engineering (High Performance Computing Team) Cambridge, 3 days/week in the office, up to £170,000 per annum + benefits We are looking for an experienced and innovative Director of Engineering to lead our client's global engineering team. This key leadership role is part of the Engineering IT Leadership team and will be responsible … for overseeing several critical technical areas, including High-Performance Computing (HPC), Engineering Platform Access, Engineering Collaboration and Linux Platforms. You will lead a global team to ensure seamless product development by maintaining and improving the infrastructure that supports engineering teams. Key Responsibilities: High-Performance Computing (HPC): Manage and lead a large-scale HPC environment (handling half a million … cores), using LSF (or similar schedulers) to ensure high availability, scalability, and operational efficiency. DevOps & Automation: Drive the implementation of DevOps best practices (CI/CD, Terraform, Ansible, GitLab) to automate infrastructure and improve the efficiency of development workflows. Engineering Collaboration Tools: Manage and optimize the Atlassian suite (Jira, Confluence) for enhanced engineering collaboration and compliance. Linux Platform
systems, support and optimize CI/CD pipelines, and determine optimal solutions for the company’s products. You’ll collaborate closely with development, DevOps, and other teams to maintain high uptime, security, and user experience standards for millions of endpoints. Experience and Education: Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or a related technical … Observability tools (New Relic, DataDog, Splunk) Scripting (Ansible, Bash, Python, Go) CI/CD Primary Job Responsibilities: Design and support EC2/ECS/EKS/Fargate environments for high availability and fault tolerance. Implement advanced AWS features (Route53, ALB/NLB, multi-region setups) to ensure global reliability. Maintain and optimize the existing CI/CD pipelines … real-time monitoring dashboards and alerting systems to track system health, capacity, and performance trends. Work closely with development teams to fine-tune infrastructure for cost efficiency while maintaining high performance. Ensure business continuity by designing and maintaining robust backup, failover, and disaster recovery solutions.
systems, support and optimize CI/CD pipelines, and determine optimal solutions for the company’s products. You’ll collaborate closely with development, DevOps, and other teams to maintain high uptime, security, and user experience standards for millions of endpoints. Experience and Education: Bachelor's or higher degree in Computer Science, Information Systems, Information Technology, or … Relic, DataDog, Splunk) Scripting (Ansible, Bash, Python, Go) CI/CD Primary Job Responsibilities: Design and support EC2/ECS/EKS/Fargate environments for high availability and fault tolerance. Implement advanced AWS features (Route53, ALB/NLB, multi-region setups) to ensure global reliability. Maintain and optimize the existing CI/CD pipelines … real-time monitoring dashboards and alerting systems to track system health, capacity, and performance trends. Work closely with development teams to fine-tune infrastructure for cost efficiency while maintaining high performance. Ensure business continuity by designing and maintaining robust backup, failover, and disaster recovery solutions. Please note that if you are NOT a passport holder of the country for
automation tools. Manage Windows and Linux systems, including patching, monitoring, and performance tuning. Drive DevOps adoption, managing CI/CD pipelines, Docker containers, and security-first deployment pipelines. Implement high-availability systems and disaster recovery for business continuity across time zones and territories. Maintain system observability and monitoring to proactively identify issues and optimize system health. Ensure compliance … platform security principles, with exposure to DevSecOps workflows. Proven ability to collaborate with senior stakeholders, translating business needs into technical strategy. Excellent troubleshooting and problem-solving skills, especially in high-availability and mission-critical environments. What We Offer: A visible leadership role with direct access to senior leadership Opportunity to shape infrastructure strategy for global operations A dynamic
on, cross-functional, and central to our product and research success. Key Responsibilities DevOps & Infrastructure Design, implement, and maintain infrastructure on AWS and Google Cloud Platform (GCP) to support high-performance computing workloads and scalable services. Collaborate with R&D teams to provision and manage compute environments for model training and experimentation. Maintain/monitor systems, implement observability solutions … e.g., logging, metrics, tracing), and proactively resolve infrastructure issues. Manage CI/CD pipelines for rapid, reliable deployment of services and models. Ensure high availability, disaster recovery, and robust security practices across environments. Data Engineering Build and maintain data processing pipelines for model training, experimentation, and analytics. Work closely with machine learning engineers and researchers to understand data … ingestion, transformation, and storage using tools such as Scrapy, Playwright, agentic workflows (e.g. crawl4ai) or equivalent. Optimize and benchmark AI training/inference/data workflows to ensure high performance, scalability, cost efficiency and an exceptional customer experience. Maintain data quality, lineage, and compliance across multiple environments. Key Requirements 5+ years of experience in DevOps, Site Reliability Engineering, or
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
to ensure project goals are met efficiently and effectively, fostering a collaborative and results-driven team environment. Monitor project performance and take proactive measures to ensure the delivery of high-quality solutions. Develop documentation, monitor and report project status, assess the effectiveness and accuracy of documentation. Foster clear and transparent communication channels with stakeholders, team members, and senior management …/or GCP) Competent experience in deploying, monitoring and maintaining Kubernetes clusters as well as experience deploying applications to Kubernetes using Helm charts. Experience in large-scale, secure, and high availability solutions in the AWS Cloud, using automation to support deployment, scaling, monitoring, and management (i.e. Terraform, etc.) Familiarity with the design and implementation of CI/CD … insurance. Flexible remuneration: hospitality and public transportation. Eligibility for educational budget according to internal policy. Hybrid opportunity. Flexible working hours. Language classes and discounted lunch options. Working in a high-paced environment, working on cutting-edge technologies. Career plan. Opportunity to learn and teach. Progressive Company. Happy people culture
Proactively monitor and report on system capacity and performance. Provide 2nd and 3rd line technical support for Linux and IBM-Power platforms. Lead and contribute to infrastructure projects, delivering high-quality solutions aligned to business needs. Ensure availability of mid-range platforms, resolving service-affecting issues as necessary. Implement best practices across Linux platforms to meet availability … sites and participate in out-of-hours support as part of a rota (37.5-hour week). IBM Power, AIX, VIO, NIM, CMC/HMC administration. Designing and supporting high availability architectures. Experience with public cloud environments (Azure and/or AWS). Job scheduling tools such as Redwood Cronacle/RunMyJobs. Understanding of project methodologies such as
Watford, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
includes: Designing and maintaining modern CI/CD pipelines (GitHub Actions experience is a big plus) Implementing Infrastructure as Code (Terraform) Supporting deployments of PHP/Laravel applications in high-concurrency environments Working with Docker, Kubernetes, ECS or EKS Automating development workflows and driving performance optimisations Building out monitoring solutions, cost management strategies, and SOC2-compliant processes Skills & Experience … a DevOps-focused role Strong cloud background (AWS, Azure, or GCP) Proficiency in Terraform, Docker, Python or Bash scripting Solid experience with infrastructure performance, security, and scaling Comfortable in high-availability, fast-paced environments Understanding of SOC2 compliance within DevOps workflows If you're passionate about automation, performance, and scalable systems and love solving problems with a proactive
based applications. Build and maintain deployment pipelines and configuration management for Windows workloads. Create tooling and automation around the deployment of a customer-specific Windows-based SaaS product. Ensure high availability, reliability, and scalability of Windows services. Integrate observability tooling (metrics, logs, traces) into IIS-hosted services. Harden Windows infrastructure for security, compliance, and operational best practices. Lead