understanding of system architecture, and experience in troubleshooting complex technical issues. Key Responsibilities: Design, develop, and implement IT infrastructure, including hardware, software, and networks. Monitor system performance and ensure high availability and reliability. Identify, diagnose, and resolve technical issues related to system operations. Implement security measures and best practices to protect systems and data. Collaborate with cross-functional …
environments. Collaboration & Communication: Work closely with development and operations teams to streamline processes, enhance productivity, and solve complex deployment challenges. Monitoring & Optimization: Proactively monitor and optimize pipeline performance, ensuring high availability, scalability, and security throughout the entire delivery pipeline. Automation & Efficiency: Continually seek out opportunities to automate manual processes, reduce friction in deployment, and improve operational efficiency. Security …
as Terraform, Ansible, and CloudFormation. Create, manage, and secure cloud-hosted OS images (e.g., AWS AMIs) across Linux and Windows Server environments. Configure and troubleshoot production-grade systems, ensuring high availability and performance in AWS and hybrid environments. Implement and monitor cloud-native security controls, ensuring compliance with DoD cybersecurity standards. Collaborate with cross-functional teams to support …
Wandsworth, Greater London, UK Hybrid / WFH Options
Experteer Italy
with a wide array of asset issuers. As a well-established market maker, our distinctive expertise led us to expand rapidly. Today, our services span market making, options trading, high-frequency trading, OTC, and DeFi trading desks. But we're more than a service provider. We're an initiator. We're pioneers in adopting the Rust development language for … and trading data platform systems at the core of our organisation. We are looking for a hands-on leader who is not only experienced in building scalable, resilient, and high-performance systems but also willing to roll up their sleeves and actively contribute to engineering efforts. The ideal candidate thrives in fast-paced environments, has a strong track record … managing and mentoring engineers, fosters a collaborative work culture, and drives product-centric initiatives while staying deeply engaged in technical challenges. Key Responsibilities * Architect, develop, and maintain large-scale, high-performance trading data platforms with a focus on low latency and high availability. * Apply data engineering principles to design efficient, scalable, and fault-tolerant data pipelines for trading …
london, south east england, united kingdom Hybrid / WFH Options
Experteer Italy
with a wide array of asset issuers. As a well-established market maker, our distinctive expertise led us to expand rapidly. Today, our services span market making, options trading, high-frequency trading, OTC, and DeFi trading desks. But we're more than a service provider. We're an initiator. We're pioneers in adopting the Rust development language for … and trading data platform systems at the core of our organisation. We are looking for a hands-on leader who is not only experienced in building scalable, resilient, and high-performance systems but also willing to roll up their sleeves and actively contribute to engineering efforts. The ideal candidate thrives in fast-paced environments, has a strong track record … managing and mentoring engineers, fosters a collaborative work culture, and drives product-centric initiatives while staying deeply engaged in technical challenges. Key Responsibilities * Architect, develop, and maintain large-scale, high-performance trading data platforms with a focus on low latency and high availability. * Apply data engineering principles to design efficient, scalable, and fault-tolerant data pipelines for trading …
Brussels - Hybrid Your role: Design, build, and manage scalable, secure cloud infrastructure using infrastructure-as-code tools (e.g., Terraform, Helm). Design and maintain OpenShift clusters to ensure scalability, high availability, and security. Develop and maintain CI/CD pipelines to automate testing, deployment, and infrastructure provisioning. Implement containerization and orchestration solutions (e.g., Docker, Kubernetes) to support microservices … and cloud-native applications. Monitor system performance, ensure high availability, and troubleshoot production issues across cloud environments. Your background: Bachelor's or higher degree in IT, Computer Science, or other related fields. 5+ years' experience in Cloud Computing (AWS, GCP, Azure, IBM) with relevant certifications. Experience in developing CI/CD pipelines, and knowledge of DevOps tools including …
s degree plus 3 years of job-related experience. Agile experience preferred. CLEARANCE REQUIREMENTS: Secret. Qualifications: We have an immediate opening for a talented Advanced Systems Engineer on the high-impact USWDSS program, where you'll be at the forefront of software integration, testing, and automation. As a vital member of our cross-functional team, you'll play a … a focus on networking, software compatibility, and virtualization in complex system environments. • Plan, execute, and validate software integrations across various stages of the DevOps lifecycle, ensuring seamless deployments and high system reliability. • Utilize DevOps tools and practices (such as CI/CD pipelines, automated testing, and containerization) to streamline software integration and reduce time to deployment. • Integrate software deliveries … Agile methodologies by actively participating in sprint planning, backlog refinement, daily stand-ups, and retrospective meetings, contributing to continuous improvement and timely delivery of features. • Design, implement, and maintain high availability and failover solutions using Red Hat High Availability tools (Pacemaker, PCS, DRBD) and other virtualization platforms. • Document and analyze test results, manage test artifacts in …
Proactively monitor and report on system capacity and performance. Provide 2nd and 3rd line technical support for Linux and IBM-Power platforms. Lead and contribute to infrastructure projects, delivering high-quality solutions aligned to business needs. Ensure availability of mid-range platforms, resolving service-affecting issues as necessary. Implement best practices across Linux platforms to meet availability … participate in out-of-hours support as part of a rota (37.5 hour week). Desirable Experience: IBM Power, AIX, VIO, NIM, CMC/HMC administration. Designing and supporting high availability architectures. Experience with public cloud environments (Azure and/or AWS). Job scheduling tools such as Redwood Cronacle/RunMyJobs. Understanding of project methodologies such as …
fixes; oversee regression testing to ensure compatibility across OS and hardware. Participate in Agile planning, daily stand-ups, and retrospectives to align work with team goals. Design and implement high-availability and failover systems using Red Hat High Availability tools such as Pacemaker, PCS, and DRBD. Maintain thorough documentation of test results and artifacts in compliance … of VLAN networking and Bash scripting. Familiarity with monitoring tools such as Prometheus, Grafana, and Sensu Go. Hands-on experience with Kubernetes, OpenShift, and KubeVirt. Preferred Experience: Red Hat High Availability solutions (Pacemaker, PCS, DRBD); virtualization platforms (Red Hat Virtualization, VMware); advanced scripting and automation with Ansible; military experience. Moseley Technical Services, Inc. is an AA/EEO …
and implement new services and features on the platform to meet the needs of our clients and internal teams. Collaborate with data engineers and other stakeholders to ensure data availability, reliability, and scalability. Build infrastructure and automation to support deployment, monitoring, and maintenance of the platform (using DevOps best practices). Write clean, maintainable, and efficient code to improve … platform functionality and performance. Take ownership of full-service lifecycle: design, development, deployment, and support. Ensure security and high availability of the data platform and services built upon it. Troubleshoot and resolve issues, and continuously work to improve system efficiency and reliability. Required Skills & Qualifications: Strong background in software engineering, with expertise in cloud computing and DevOps practices … AWS services). Proficiency in programming languages such as Python (preferred), Java, or Go. Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation). Experience building scalable, secure, and high-performance data services. Familiarity with CI/CD pipelines and automated testing practices. Ability to manage complex systems and troubleshoot production issues effectively. Experience working in an agile …
in automation and operations. As part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, Dynamo, Lambda, and Bedrock, exclusively for EU customers. For more information on ESC please check out our blog: Your responsibilities will encompass … of AWS services and technology. A typical day in this role involves collaborating with technology leaders, contributing to the enhancement of day-to-day operations, and ensuring improvements in availability, reliability, latency, performance, and efficiency of the ESC. You will be required to occasionally participate in "on-call" rotations to resolve incidents occurring out-of-hours. The overarching goal … is to deliver scalable services and ensure a high-availability experience for EU customers. If you are an experienced professional ready for a challenging and impactful opportunity, we invite you to join our efforts in building a best-in-class development engineering and operations team that aligns with AWS' commitment to customer satisfaction and continual innovation. Utility Computing …
with a primary focus on MongoDB. Your mission is to lead database administration efforts, define the MongoDB roadmap, and collaborate with IT Operations and other stakeholders to ensure the availability, performance, and security of our database systems. We are looking for a candidate who is passionate about database technologies and values collaboration, innovation, and continuous learning. You should have … you to: Lead the design, implementation, and maintenance of MongoDB database systems. Develop and enforce database security measures, policies, and best practices. Monitor and optimize database performance to ensure high availability, scalability, and efficient resource utilization. Collaborate with development teams on database-related activities, including schema changes, data migrations, and performance tuning. Troubleshoot and resolve complex database issues … maintain robust backup and recovery strategies to ensure data integrity and recoverability. Plan and execute database upgrades, patches, and migrations. Implement and maintain database replication and clustering technologies for high availability and disaster recovery. Document database configurations, procedures, and troubleshooting steps. Stay current with the latest database technologies, industry trends, and best practices. Mentor and provide guidance to …
Job Description: Be responsible for designing, implementing, developing, debugging, and troubleshooting high-quality, high-performance, high-availability applications using both Waterfall and Agile development best practices. Execute responsibilities on various technical platforms including COBOL, JCL, Azure, AWS, .Net Core, .Net Framework, REST APIs, Microservices, Cosmos DB, NoSQL, SQL, JavaScript, HTML, CSS, Salesforce, Azure DevOps …
opportunity for an experienced Senior Developer to play a pivotal role in the technical delivery of a flagship digital platform within UK Central Government. This role sits within a high-performing, cross-functional team responsible for designing and building scalable, secure, and modern services. You'll operate with a high level of autonomy and authority, taking ownership of … AWS (plus some exposure to Azure). Extensive knowledge of Terraform and infrastructure-as-code best practices. Containerisation using Docker, orchestration with ECS, and familiarity with load balancing and high-availability patterns. Experience working with MongoDB or similar NoSQL databases. Proficient in Git, GitHub Actions, and YAML-based pipeline design for CI/CD. Deep understanding of version … role is ideal for an experienced Senior Software Engineer who enjoys solving complex technical problems, thrives in delivery-led environments, and is motivated by the opportunity to work on high-profile government platforms. Apply now to join a high-performing team and make a tangible impact on a critical national platform.
responsible for blending Site Reliability Engineering (SRE), DevOps, and traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and optimization of global IT infrastructure, cloud operations, and service management while … environments across all BCG business units. * Manage network reliability, compute platforms, and cloud-native services across AWS, Azure, and GCP. * Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. * Drive edge computing, containerized workloads, and high-performance computing strategies. * Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: * Mandate and assure the adoption of IT Service Management (ITSM) processes across … ensuring standardized, efficient, and effective service delivery. * Establish SRE-based operational metrics, including SLOs, SLIs, and error budgets. * Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. * Ensure high availability, performance, and security compliance for all enterprise services. * Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across all of BCG. * Optimize incident, change, and capacity management, ensuring alignment …
the reliability and scalability of our production systems. Key Responsibilities: Design, implement, and manage AWS cloud infrastructure. Develop and maintain automation scripts and tooling. Support production systems and ensure high availability and performance. Implement observability and monitoring solutions. Collaborate closely with the PBS (Platform/Backend Services) team. Contribute to infrastructure as code (IaC) and DevOps best practices.
Engineer to join our team on a contract basis, with a focus on AWS infrastructure, observability tooling, and CI/CD automation. This is a hands-on role supporting high-availability systems, rapid deployments, and production incident response. Key Responsibilities - Manage and monitor AWS infrastructure for performance and security - Respond to production incidents, perform root cause analysis, and …
Northampton, West Northamptonshire, Northamptonshire, United Kingdom Hybrid / WFH Options
Howdens Joinery
support and guidance in delivering solutions to meet business needs. - Provide detailed 2nd/3rd level design and documentation in support of Linux and IBM-Power platforms - Ensure system availability, resolve any service-affecting issues and escalate as appropriate - Ensure projects and the introduction of new Mid-Range platforms are delivered in a timely and cost-effective manner - Ensure adoption … of best practice for Linux platforms and any underlying hardware to satisfy capacity, performance, availability and security requirements - Build and maintain relationships with key internal and external stakeholders and partners What we need from you - Demonstrable experience designing, implementing and supporting Linux systems (including but not limited to RHEL and SLES) - Red Hat Satellite Server experience - Shell scripting (Bash and … support rota for “Out of Hours” cover. Experience in the following is highly desirable - IBM Power, AIX, VIO, NIM and CMC/HMC administration - Delivering and supporting high availability architecture and solutions - Virtualisation and containerisation technologies - Delivery and support of cloud services in Azure and/or AWS - Redwood Cronacle/RunMyJobs - Project Management Methodologies (SAFe …