to manage global teams, optimize operational efficiency, and ensure 99.9% uptime for critical services. In this role, you'll own the observability platform, manage high-impact incidents, and drive automation to reduce manual tasks. If you’re passionate about enhancing application resiliency and have a strong background in modern DevOps practices, this opportunity is for you! Key Responsibilities Incident … Management: Lead global teams in managing critical incidents, working closely with subject matter experts to identify root causes and implement lasting solutions. Operational Efficiency: Drive automation and tooling improvements to eliminate manual tasks, optimize system performance, and enhance service stability. Resiliency Planning: Enhance the resiliency of applications and infrastructure through self-healing systems, automated failovers, and proactive capacity management. … that drives high performance. Skills and Qualifications Skills Previous experience in front office trading support is an advantage (FX/Rates/capital markets technology front office support). Automation & Tooling: Expertise in automating manual tasks and improving technical processes using tools such as Chef, Ansible, Puppet, and Rundeck. Technical Leadership: Strong problem-solving skills, with the ability to More ❯
Skills Include Developing and enforcing SRE best practices and principles. Collaborating with development teams to build scalable and resilient systems. Aligning cross-functional teams on priorities and deliverables. Driving automation to enhance operational efficiency. You may be assessed on key skills such as risk and controls, change and transformation, business acumen, strategic thinking, digital and technology skills, as well … supporting applications and data systems, using hardware, software, networks, and cloud platforms to ensure reliability, scalability, and security. Ensure the systems' reliability, availability, and scalability through software engineering techniques, automation, and incident response best practices. Accountabilities Build Engineering: Develop, deliver, and maintain high-quality infrastructure solutions to meet business requirements, ensuring reliability, performance, and ease of use. Incident Management … Monitor infrastructure and system performance, identify and resolve issues, and use data to reduce mean time to resolution. Automation: Develop and implement automated tasks and processes to improve efficiency and reduce manual work. Security: Implement secure configurations and measures to protect infrastructure against cyber threats, vulnerabilities, and unauthorized access. Teamwork: Collaborate with product managers, architects, and engineers to define More ❯
Site Reliability Engineering Lead role at Aviva, posted 1 week ago. Are you passionate about infrastructure automation and engineering? Do you have experience of building reliable, scalable systems, or previous knowledge of Service Reliability Engineering? Do you have experience working hands-on with automation approaches … and tools in an infrastructure engineering or operations capacity? Do you want to be deeply involved in exciting strategic automation and contribute to Aviva's cloud transformation? We are looking for a passionate Cloud engineering professional to join our diverse and growing team to shape and actively contribute to the future of Cloud Service Desk and Service … of the sourcing partner, taking action to improve service delivery for end consumers. Drive continuous improvements across the Cloud Platform services by identifying, raising, and managing enhancement opportunities. Lead automation of recurring operational tasks to improve efficiency and reduce manual intervention. Manage the operational acceptance of new services transitioned from Platform Engineering and Cloud Enablement teams into BAU, and
build robust, highly scalable solutions that will power the future of how Bloomberg automates network infrastructure. You'll be trusted to design and work on tooling that builds on automation best practices and principles. We'll trust you to Develop and maintain software tools to manage a large-scale, multi-vendor network with an emphasis on automation, telemetry … as a Software, Network Production, or System Reliability Engineer. Experience with building, maintaining and continuously enhancing automations needed for scalability & efficiency in running the Network Infrastructure. Experience in infrastructure Automation and orchestration Frameworks e.g. Ansible, Airflow, Terraform, Chef, Salt. Proven experience with object-oriented programming languages preferably in Python. A bachelor's or master's degree in computer science More ❯
all connectivity on site to be used to identify business areas/stakeholders and used as a basis to produce a runbook for migration tasks. Develop and maintain automation scripts to streamline network operations and ensure consistency across environments. Collaborate with cross-functional teams to understand application requirements and translate them into network solutions. Ensure network security by … with a strong portfolio of successful network infrastructure projects. Expertise in network architecture and engineering, including TCP/IP, DNS, VPN, LAN/WAN, and QoS. Proficient in network automation tools and scripting languages such as Python, Ansible, or Terraform. Strong knowledge of network security protocols and best practices. Excellent problem-solving skills and the ability to work under
A proven track record of implementing and leading SRE practices across large organizations or complex teams. Extensive hands-on experience on Containers and Kubernetes. In-depth experience with DevOps automation tools such as code versioning (git), JIRA, Ansible, database CI/CD tools and their implementation. Some other highly valued skills may include: Expertise with scripting languages (e.g. PowerShell … Python, Bash) for automation/migration tasks. Experience of working on data migration tools and software. Expertise in system configuration management tools such as Chef, Ansible for database server configurations. You may be assessed on the key critical skills relevant for success in the role, such as risk and controls, change and transformation, business acumen, strategic thinking, and digital … and technology, as well as job-specific technical skills. This role can be based in our Knutsford or Glasgow locations. Purpose of the role To apply software engineering techniques, automation, and best practices in incident response, to ensure the reliability, availability, and scalability of the systems, platforms, and technology. Accountabilities Availability, performance, and scalability of systems and services through More ❯
A proven track record of implementing and leading SRE practices across large organizations or complex teams. Extensive hands-on experience with containers and Kubernetes. In-depth experience with DevOps automation tools such as code versioning (git), JIRA, Ansible, and database CI/CD tools and their implementation. Some other highly valued skills may include: Expertise with scripting languages (e.g. … PowerShell, Python, Bash) for automation/migration tasks. Experience of working with data migration tools and software. Expertise in system configuration management tools such as Chef and Ansible for database server configurations. This role can be based in our Knutsford or Glasgow locations. Purpose of the role: To apply software engineering techniques, automation, and best practices in incident
critical services on PostgreSQL. In-depth knowledge of high availability approaches, backup and recovery procedures, and database performance tuning. Experience with PowerShell or other scripting languages for automation. Experience building and debugging database stored procedures, triggers, and functions. Experience of change management procedures and documentation of technical solutions. Working knowledge of Python and Spark is highly desirable.
distributed systems challenges at unprecedented scale, you could help shape our next generation of cloud infrastructure. We are actively seeking experienced Systems Development Engineers with a strong background in automation and operations. Must hold or be able to attain an Australian Government Security Vetting Agency clearance. Key job responsibilities: - Support the refinement of system requirements, participate in … the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation - Develop or further existing application and system management tools and processes that reduce manual efforts and increase overall efficiency - Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic - Participate in … s why you'll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. - 1+ years of experience contributing to automation for new and current systems - Experience programming with at least one modern language such as Python, Ruby, Golang, Java, C++, C#, Rust - Experience with Linux/Unix - Experience
disaster recovery strategies. Flex Appliance Management - Experience in deploying, configuring, and maintaining NetBackup Flex Appliances. Access Appliance Management - Experience in deploying, configuring, and maintaining NetBackup Access Appliances. Bash Scripting & Automation - Proficiency in writing Bash, PowerShell, or Python scripts for automating backup tasks, log analysis, and system monitoring. Containerization & Linux Administration - Familiarity with Docker-based NetBackup instances and Linux-based system
to industry and regulatory security requirements. Utilize System Center Configuration Manager for software deployments, updates, and compliance in private and public cloud setups. Ensure robust backup management and monitoring. Automation and Scripting: Develop and optimize PowerShell scripts to automate administration, monitoring, and incident response across hybrid environments. Streamline processes to reduce manual efforts and improve operational efficiency. Documentation and … network troubleshooting, advanced troubleshooting, and system hardening. Hands-on experience with System Center Configuration Manager (SCCM) for multi-tenant and cloud-hosted setups. Advanced skills in PowerShell scripting for automation in hybrid infrastructures. Familiarity with monitoring tools like Zabbix, PRTG, or other hybrid cloud monitoring platforms. Proven track record of maintaining uptime and security in private and public cloud … to-end, our hard-working process engines deliver exceptional functionality and embed workflows that drive efficiency and best practice with a long-term focus for regulated environments. Through the automation of tasks, the simplification of complex operations, finding scalability as operations evolve, and more effective management of information, we help our customers harness the power of Digital, so they
Tooling & Integration: Evaluate, select, and onboard security solutions (e.g., endpoint protection, SIEM, vulnerability scanners). Integrate security tools with existing systems and workflows, ensuring effective threat detection and response. Automation & Scripting: Develop and maintain scripts and automation tools to streamline IT operations and enhance security. Automate security tasks, such as patch management, vulnerability scanning or secure configuration enforcement. More ❯
Possible extension ) Are you a skilled Site Reliability Engineer (SRE) with experience in maintaining scalable and reliable infrastructure? We're looking for a proactive leader with a passion for automation, incident management, and system optimization. Key Skills Required: 5+ years of SRE or similar experience Expertise in Cloud Platforms (SIEM technologies preferred) Proficiency in Python or Bash scripting Hands More ❯
London, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
to industry and regulatory security requirements. Utilize System Center Configuration Manager for software deployments, updates, and compliance in private and public cloud setups. Ensure robust backup management and monitoring. Automation and Scripting: Develop and optimize PowerShell scripts to automate administration, monitoring, and incident response across hybrid environments. Streamline processes to reduce manual efforts and improve operational efficiency. Documentation and … network troubleshooting, advanced troubleshooting, and system hardening. Hands-on experience with System Center Configuration Manager (SCCM) for multi-tenant and cloud-hosted setups. Advanced skills in PowerShell scripting for automation in hybrid infrastructures. Familiarity with monitoring tools like Zabbix, PRTG, or other hybrid cloud monitoring platforms. Proven track record of maintaining uptime and security in private and public cloud … to-end, our hard-working process engines deliver exceptional functionality and embed workflows that drive efficiency and best practice with a long-term focus for regulated environments. Through the automation of tasks, the simplification of complex operations, finding scalability as operations evolve, and more effective management of information, we help our customers harness the power of Digital, so they
London, England, United Kingdom Hybrid / WFH Options
amber labs
and secure before promotion to production. Requirements: Proficiency in AWS services, including EC2, S3, Lambda, IAM, VPC, and more. Strong hands-on experience with Terraform for infrastructure-as-code automation and management. Expertise in configuring and managing NGINX and reverse proxy solutions; Lua scripting experience is a plus. Strong experience with Python for writing automation scripts and creating More ❯
London, England, United Kingdom Hybrid / WFH Options
Idox plc
to industry and regulatory security requirements. Utilize System Center Configuration Manager for software deployments, updates, and compliance in private and public cloud setups. Ensure robust backup management and monitoring. Automation and Scripting: Develop and optimize PowerShell scripts to automate administration, monitoring, and incident response across hybrid environments. Streamline processes to reduce manual efforts and improve operational efficiency. Documentation and … network troubleshooting, advanced troubleshooting, and system hardening. Hands-on experience with System Center Configuration Manager (SCCM) for multi-tenant and cloud-hosted setups. Advanced skills in PowerShell scripting for automation in hybrid infrastructures. Familiarity with monitoring tools like Zabbix, PRTG, or other hybrid cloud monitoring platforms. Proven track record of maintaining uptime and security in private and public cloud … to-end, our hard-working process engines deliver exceptional functionality and embed workflows that drive efficiency and best practice with a long-term focus for regulated environments. Through the automation of tasks, the simplification of complex operations, finding scalability as operations evolve, and more effective management of information, we help our customers harness the power of Digital, so they
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
we combine smart technology with real human empathy to simplify borrowing. As we continue to innovate, we’re looking for a Senior DevOps Engineer who’s passionate about infrastructure, automation, and AI to help us keep things running smoothly – and evolve what’s possible. This is your opportunity to lead and shape the infrastructure that powers a faster, fairer … to improve efficiency and reduce errors. Designing, developing, and implementing services to support product development activities Managing and improving robust CI/CD pipelines across all internal projects Championing automation — using AI tools and smart scripts to eliminate repetitive tasks Troubleshooting and resolving issues across environments, improving uptime and resilience Monitoring system security and performance and proactively managing updates More ❯
solutions that transform raw data into actionable insights. As part of a collaborative and learning-focused culture, you will follow modern data engineering best practices, including software engineering principles, automation, and cloud technologies, to deliver high-quality solutions. This is an exciting opportunity to influence Pret’s data transformation journey, enabling the business to innovate and thrive. Responsibilities Translate … experience Essential: 3+ years’ experience as a Data Engineer, working with big data solutions and following software engineering best practices. Proficient in Python and Java for data engineering and automation tasks. Strong knowledge of SQL for querying, transforming, and managing datasets. Experience building and optimising cloud-based data pipelines using platforms such as Snowflake. Proficient in version control tools More ❯
knowledge, skills, and experience across our organisation, and our Global Mobility Programme provides the gateway to a whole world of opportunities. This role will focus on the design, deployment, and automation of technical infrastructure services developed and hosted by BDO Global IT. The emphasis of the role will be on designing, deploying, and automating infrastructure as code. The cloud engineer … position: Administration of core cloud platform services across a variety of Microsoft cloud (Azure) technologies. Working with the architects to design the cloud infrastructure needed to run our services. Automation of recurring tasks and processes related to infrastructure. Design of deployment pipelines and creation of infrastructure-as-code templates to support them. Working to resolve tickets as they are
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
eTeam Workforce Limited
disaster recovery strategies. Flex Appliance Management - Experience in deploying, configuring, and maintaining NetBackup Flex Appliances. Access Appliance Management - Experience in deploying, configuring, and maintaining NetBackup Access Appliances. Bash Scripting & Automation - Proficiency in writing Bash, PowerShell, or Python scripts for automating backup tasks, log analysis, and system monitoring. Containerization & Linux Administration - Familiarity with Docker-based NetBackup instances and Linux-based system