role in delivering a mission-critical software suite for a large-scale SatComs enterprise. Your work will involve implementing complex algorithmic solutions that drive high-availability and high-reliability systems. You will be responsible for developing and maintaining high-quality software solutions that meet our clients … any arising issues. Engaging directly with client stakeholders, you will demonstrate new software features, ensuring alignment with operational needs and expectations. Key responsibilities: • Implement high-quality, high-availability, and highly reliable algorithmic code. • Manage and maintain an automated subsystem test suite. • Produce detailed designs for future software … Additionally, hands-on experience with SatComs is mandatory, as the role involves working on satellite communications-related solutions. Desirable qualifications include experience in implementing high-availability and near real-time software solutions, as well as proficiency in Linux environments, Python scripting, and Robot Framework. Familiarity with containerisation technologies
Space division, you will be working on a mission-critical software suite for a large-scale SatComs enterprise. You will be responsible for developing high-quality, high-availability, and highly reliable software solutions. This includes implementing complex algorithmic solutions that meet stringent performance and reliability standards. Your … software functionality. Your ability to communicate complex technical concepts to both technical and non-technical stakeholders will be highly valued. Key responsibilities include: • Implementing high-quality, high-availability, and highly reliable algorithmic code. • Managing and maintaining an automated subsystem test suite. • Producing software designs and estimating required … background in Mathematics, Physics, Astrophysics, or a similar field. • Evidence of working on complex mathematical development projects. Desirable qualifications include: • Experience developing software with high availability and near real-time responsiveness. • Proficiency in Linux, Python, and Robot Framework. • Experience with containerisation technologies such as Docker or Kubernetes. • Knowledge
recovery strategies are fundamental for businesses implementing hybrid cloud solutions in London. Windows Server 2025 elevates these capabilities through advanced cloud-based backup and high-availability configurations, offering a comprehensive framework that supports the resilience of business operations. Here's how Windows Server 2025 enhances these critical areas … redundancy, where data can be replicated in multiple locations across the cloud. This is especially crucial for businesses in London that must maintain data availability and business operations despite local disruptions. Enhanced Business Continuity Plans: High-Availability Configurations: Windows Server 2025 supports high-availability setups
to respond quickly to changing business needs. You'll be working in a highly skilled engineering team to drive automation, scalability, performance, reliability, and high availability for our complex, large-scale, revenue-critical systems handling: Deployment to GCP and AWS across multiple regions Highly scalable systems capable of … secure infrastructure solutions Develop and maintain CI/CD pipelines, Infrastructure as Code, and automation frameworks tailored to our systems Drive disaster recovery planning, high availability architecture, and 24/7 SLO adherence for critical ad-serving solutions Build and maintain custom, complex deployment pipelines using Jenkins and … team members and evangelize technical innovation We're excited if you have 6+ years of experience designing and building DevOps/SRE solutions for high-scale, distributed systems Proven expertise with GCP and AWS, including multi-region deployments Proficiency with Infrastructure-as-Code (IaC) tools such as Terraform (preferred
part of the AWS Managed Operations team, you will play a pivotal role in building and leading operations and development teams dedicated to delivering high-availability AWS services, including EC2, S3, Dynamo, Lambda, and Bedrock, exclusively for EU customers. For more information on ESC please check out our … A typical day in this role involves collaborating with technology leaders, contributing to the enhancement of day-to-day operations, and ensuring improvements in availability, reliability, latency, performance, and efficiency of the ESC. You will be required to occasionally participate in "on-call" rotations to resolve incidents occurring out … of-hours. The overarching goal is to deliver scalable services and ensure a high-availability experience for EU customers. If you are an experienced professional ready for a challenging and impactful opportunity, we invite you to join our efforts in building a best-in-class development engineering and
Sunderland, Tyne and Wear, Tyne & Wear, United Kingdom Hybrid / WFH Options
Randstad Technologies Recruitment
and take a lead role in the ongoing development, optimisation, and resilience of the organisation's database environment. You'll be responsible for maintaining high availability, supporting integrations across platforms, and ensuring the reliability and performance of systems critical to business operations. Key Responsibilities: Configure and manage high availability and disaster recovery solutions including Always On Availability Groups, mirroring, and clustering. Implement and test backup and recovery procedures to safeguard data. Monitor performance metrics and carry out tuning and optimisation as required. Support development and integration efforts across cloud and on-prem environments. Use version … line support and investigate root causes of system issues. What We're Looking For: Solid experience in SQL Server database administration. Strong understanding of high availability, backup, and recovery strategies. Proficient in writing and troubleshooting T-SQL. Experience with ETL tools (e.g. SSIS, Azure Data Factory, Informatica, Talend
Engineer (SRE) to join our dedicated team. As an SRE at Paymentology, you'll be the superhero responsible for maintaining, improving, and ensuring the high availability, scalability, and performance of our platform. Tasks Platform Reliability and Scalability: Build software that enhances Paymentology services' scalability and reliability. Ensure platform … will allow you to clearly convey your ideas and recommendations. As a key member of our technical team, you will be expected to maintain high availability and be ready to address critical incidents, ensuring the continuous performance of our systems. This includes being part of an on-call … and development. Ready to Join Us? If you're a gadget guru who thrives on optimizing infrastructure, automating all the things, and delivering sky-high availability and performance, we want to hear from you! Apply now and be part of a company that values your skills and fosters
traditional operations models to build a next-generation Reliability Engineering function. This role ensures end-to-end automation at scale, 24x7 operational excellence, and high availability across all of BCG, including BCG Core, BCG X, and Consulting Team (CT) worldwide. The leader will drive strategic planning, execution, and … native services across AWS, Azure, and GCP. * Scale Infrastructure as Code (IaC), automated provisioning, and cloud workload optimization. * Drive edge computing, containerized workloads, and high-performance computing strategies. * Implement AI-driven monitoring, self-healing automation, and full-stack observability. IT Service Management & Operational Excellence: * Mandate and assure the adoption … SRE-based operational metrics, including SLOs, SLIs, and error budgets. * Oversee incident response, problem resolution, and root cause analysis with AI-driven remediation. * Ensure high availability, performance, and security compliance for all enterprise services. * Develop a follow-the-sun operational support model, ensuring 24x7 resilience and uptime across
Purpose Responsible for the proactive support of products so that there is high product performance that is continuously improved. Responsible for identifying and resolving the root causes of operational incidents, implementing solutions to improve stability and prevent recurrence. Manages the creation and maintenance of the event catalog to trigger … enhance efficiency and collaboration between development and operations within service operations. Key Responsibilities Site Reliability Engineer Define, build, and maintain support systems to ensure high availability and performance. Handle complex cases for the Operations team. Build events to add to the event catalog for the relevant product or … external stakeholders for feedback for continual service improvement for in-scope products & drive plan till successful closure Accountable for the in-scope product to ensure high availability performance. Problem Management Conduct thorough problem investigations and root cause analyses (RCA) to diagnose recurring incidents and service disruptions Coordinate with incident
City, Edinburgh, United Kingdom Hybrid / WFH Options
ENGINEERINGUK
team is on a transformational journey from a mature set of applications to an integrated persona-based platform with streamlined user workflows and a high degree of automation and scale. The team works closely with BlackRock portfolio managers, traders, and investment compliance officers and delivers to external clients. They … also partner closely with world-class AI research and engineering teams, product managers, UX designers, quality assurance engineers, and client support teams to deliver high quality, scalable, and resilient capabilities. Being a member of investment and trading engineering you will be: • Tenacious: Work in a fast-paced and highly … Quick learner: Pick up new concepts and apply them quickly. Responsibilities • Take ownership of individual project priorities, deadlines, and deliverables using AGILE methodologies. • Deliver high efficiency, high availability, concurrent and fault-tolerant software systems. • Contribute to the development of Aladdin's global, multi-asset trading platform. • Provide
City, Edinburgh, United Kingdom Hybrid / WFH Options
ENGINEERINGUK
asset management process. This includes ensuring that clients follow guidelines established internally, by themselves, and by the market regulators. The system accomplishes this through high-throughput rule-based compliance engines, which sit at the heart of the investment system and process millions of calculations per day. Being a member … apply them quickly. Responsibilities Lead a team of 5-10 highly qualified software engineers. Own project priorities, deadlines and deliverables using AGILE methodologies. Deliver high efficiency, high availability, concurrent and fault-tolerant software systems. Significantly contribute to development of Aladdin's global, multi-asset trading platform. Provide …/Web enterprise-grade software engineering. Prior experience leading a team of 5-15 software engineers. In-depth understanding of concurrent programming and designing high throughput, high availability, fault-tolerant distributed applications. Expertise in building distributed applications using SQL and/or NoSQL technologies such as MSSQL
growth and are expanding their Infrastructure team. In this varied role, you will: Install and maintain SUSE Linux Enterprise servers Design, deploy, and manage high-availability clusters on SUSE Linux Configure and manage NFS servers in a SUSE Linux Environment The ideal candidate will have: 3+ years SUSE … Linux experience Specialisation in managing high-availability (HA) clusters Experience configuring Network File Systems (NFS) in Azure cloud environments for SAP Azure Cloud Integration Familiarity with automation & scripting Knowledge of virtualisation and containerisation This is a fantastic opportunity for a Linux Infrastructure Specialist to work with an industry … SAP experience/understanding Azure Cloud Integration 5 Days On site in Wantage The Person: 5+ Years of Experience SUSE Linux expert Experience with high-availability (HA) clusters in Linux environments Familiarity with automation, scripting, virtualisation and containerisation Reference Number: BBBH245645 To apply for this role or to
Oxford, Oxfordshire, United Kingdom Hybrid / WFH Options
Ellison Institute of Technology
most challenging problems. EIT Oxford will ensure scientific discoveries and pioneering science are turned into products for the benefit of society that can have high impact worldwide and, over time, be commercialised to ensure long-term sustainability. Led by a faculty of world experts, EIT Oxford seeks to solve … the world's most challenging problems across four high-risk, high-reward, high-impact humane endeavours: health and medical science; food security and sustainable agriculture; climate change and clean energy; and government innovation in an era of artificial intelligence. EIT Oxford is investing significant resources in a … the seamless connectivity for this, and other, UK-based locations. This role requires a seasoned Network expert with many years operating within a complex, high-performance environment, who possesses expertise in enterprise networking, network security, cloud networking, and IT infrastructure management. You will work closely with internal stakeholders, technology
Design, implement, and test key features in the Conversation Engine component of Clarity. Contribute to the overall architecture of Clarity, ensuring excellent conversation quality, high availability, strong observability, and system efficiency. Assist with and troubleshoot across all stages of the software lifecycle, including design, deployment, and operations. Collaborate … e.g. MongoDB, DynamoDB) and in-memory Data Stores (e.g. Redis). Familiarity with asynchronous programming and event-driven architecture. Familiarity with building low latency, high availability, and high throughput systems. Familiarity with Docker, CI/CD pipelines, and GCP. Ability to work collaboratively within a remote team
features in the Product Search Engine and Product Catalog processing components of Clarity. Contribute to the overall architecture of Clarity, ensuring excellent conversation quality, high availability, strong observability, and system efficiency. Assist with and troubleshoot across all stages of the software lifecycle, including design, deployment, and operations. Collaborate … in-memory Data Stores (e.g., Redis). Experience with Vector DBs (e.g., Qdrant, FAISS, Pinecone) is a strong plus. Familiarity with building low latency, high availability, and high throughput systems. Familiarity with Docker, CI/CD pipelines, and GCP. Ability to work collaboratively within a remote team
be very “hands on” with a focus on the improvement, automation and engineering of their trading platforms – a cutting edge, low latency and very high availability infrastructure. You’ll take real ownership of release pipelines, performance engineering and work how you want a huge variety of tools and … of tools such as Kubernetes, Terraform, Ansible etc - Good knowledge of DevOps best practices, Cloud and tools (AWS is ideal) - Knowledge of working in high availability/high volume infrastructure environments - Excellent communications skills