Apply strong networking knowledge to optimise performance, security, and reliability. Ensure compliance with financial services regulations and internal security policies. Contribute to CI/CD pipelines and cloud-native observability solutions. Explore and integrate emerging technologies including AI/LLM-based solutions to enhance automation and operational efficiency. Key Skills & Experience Essential: Strong hands-on experience with AWS (EC2, VPC … or working with AI/LLM solutions Familiarity with Terraform, Ansible, GitLab CI/CD, or similar tools Exposure to financial services or other highly regulated industries Experience with observability stacks (Prometheus, Grafana, ELK, etc. …
based microservices. Troubleshoot production issues, ensuring uptime and documenting processes on the internal wiki. Automate deployments, testing processes, and infrastructure provisioning (Terraform, Ansible, GitHub Actions). Implement monitoring and observability solutions for proactive issue detection. Provide occasional support for internal IT infrastructure (e.g., laptops, printers, office networking). Occasionally maintain and support CMS platforms (Magento, Joomla, WordPress). Experience Required … management) Docker containerization Python scripting for automation Git version control Desirable (Future-Facing Skills): Infrastructure as Code (Terraform, Pulumi, Ansible) Container orchestration (Kubernetes) Go development for microservice utilities Modern observability tools (Prometheus, Grafana, Datadog) CI/CD pipeline management (GitHub Actions, GitLab CI, Jenkins) Firewall-as-a-Service solutions (e.g., Cloudflare) Endpoint/device management (e.g., Intune, NinjaOne) Exposure to …
performance, scalability, failover, DR, resilience, alerting, and monitoring. Design and execute load, stress, endurance, and failover tests using industry-standard tools such as JMeter, LoadRunner, or ADS. Set up observability dashboards (Grafana, Splunk, Dynatrace, Kibana, or Datadog) to monitor test execution and system performance. Analyse results to identify performance bottlenecks, system vulnerabilities, and areas for optimisation. Report findings and recommendations … business teams. Experience working in Agile delivery environments with cross-functional teams. Nice to Have: Background in financial services or experience supporting legacy-to-modernisation migrations. Understanding of infrastructure observability, cloud platforms, and microservice orchestration. Exposure to automation frameworks and scripting for performance testing. This is a key role within a global transformation programme, offering the chance to shape how …
City of London, London, United Kingdom Hybrid/Remote Options
Harnham
the following: Technical tasks Architecting and scaling cloud infrastructure (GCP preferred) and high-performance computing environments Leading the design and implementation of DevOps platforms, CI/CD pipelines, and observability tools (Terraform, Docker, Kubernetes, Jenkins) Partnering with engineering and R&D to define technical roadmaps for compute and infrastructure products Other key responsibilities Managing and mentoring a team, fostering a … GitHub Actions; Terraform or CloudFormation; Prometheus, Grafana, Datadog, or New Relic; Slurm, Torque, LSF; MPI; Hadoop or Spark; Experience with high-performance computing, distributed systems, and observability tools Strong communication and executive presence, with the ability to translate complex technical concepts for diverse audiences Familiarity with AI/ML operations is a plus BENEFITS The successful Director …
experience designing and working with relational database schemas Excellent problem-solving and communication skills, with a collaborative mindset Proficient in incremental software delivery leveraging agile processes Experience with software observability practices (distributed tracing, OpenTelemetry, etc.) Basic understanding of artificial intelligence concepts, with curiosity and enthusiasm for learning how AI tools can be used to improve processes and drive efficiency. Interest … systems Collaborate with cross-functional teams including Product, QA, and DevOps Mentor junior engineers and promote engineering best practices Ensure code quality, security, and performance across all deliverables Champion observability and ensure software is observable, maintainable and resilient About the team Our Corp & Gov Technology team is responsible for delivering innovative software solutions that support Moody's public and private …
Edinburgh, Midlothian, United Kingdom Hybrid/Remote Options
Lloyds Bank plc
a key role in ensuring the reliability, scalability, and security of our cloud-native data platforms. This is a hands-on engineering role with a strong focus on automation, observability, incident response, and cross-team collaboration. Job Description JOB TITLE: Senior Site Reliability Engineer SALARY: £70,929 - £78,810 LOCATION: Edinburgh or Leeds WORKING PATTERN: Hybrid, 40% (or two days … Cloud Engineering roles. Strong knowledge of Cloud platforms: GCP (preferred), AWS or Azure. Proficiency in Terraform, Docker, Kubernetes, and CI/CD tools (e.g., Jenkins, Harness). Experience with observability tools and distributed tracing. Solid understanding of cloud security principles and vulnerability management. Excellent communication and documentation skills. A collaborative mindset and a bias for action. You'll help shape …
Bristol, Avon, South West, United Kingdom Hybrid/Remote Options
Hargreaves Lansdown
HL version control set) with quality gates, automated testing, security scanning, and progressive delivery. Introduce and run GitOps for Kubernetes (AKS preferred), patterns and multi-environment promotions. Own platform observability: metrics, logs and traces using Azure Monitor/Log Analytics/Application Insights, plus Datadog/Grafana where appropriate. Embed security by design: Azure Policy, Defender for Cloud, secrets management … cluster operations, node pools, networking (CNI), ingress, secrets, RBAC and workload identity. Experience with GitOps, and container build pipelines (e.g., ACR, OPA policies, image scanning). Working knowledge of observability tooling (Azure Monitor, Log Analytics, Application Insights, Datadog/Grafana) and alerting/response workflows. Understanding of the Microsoft Cloud Adoption Framework, Azure Landing Zones and the Well-Architected Framework.
Employment Type: Permanent, Part Time, Work From Home
looking for an experienced Data Engineer to support on an initial 6-month contract engagement. You will own their data platform end to end, from ingestion & modelling to orchestration, observability & governance. You'll be responsible for designing & building robust, reliable pipelines, evolving their lakehouse/warehouse layers & enabling fast, trustworthy analytics for multiple teams. Tech you'll be working with …
distributed systems Contribute to ongoing improvements in reliability, latency, and scalability Qualifications: Linux expertise with a solid understanding of networking and containerisation Proficiency in at least Python Experience with observability tooling Proven track record in designing and maintaining highly distributed systems Apply now for a confidential chat.
Azure Container Instances), and ACA (Azure Container Apps). Create and maintain comprehensive documentation on newly implemented features suitable for an enterprise environment. Design and implement robust monitoring and observability tools to track container performance and health. Automate testing processes by utilizing public cloud elasticity and ephemeral resources, ensuring streamlined operations and reduced manual efforts. Contribute to the software development …
how to manage workloads at scale. Proficient with Infrastructure as Code tools and practices. Comfortable writing automation, configuration, and tooling to simplify operations and reduce manual effort. Knowledgeable about observability tools & best practices. Ability to collaborate across teams with excellent written and verbal communication skills. Nice to Have Qualifications: Experience with multi-cloud and/or hybrid deployments. Knowledge of …
Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
Robert Walters
to improve performance Develop strategies to improve performance across group technology DevOps Lead: Experience Technical depth across, but not limited to: Java, UNIX, Linux, Middleware, WebLogic, Cloud Platforms, Observability tools Designing/Developing/Implementing technology advancements Experience of improving resilience of complex production environments The permanent opportunity for a DevOps Lead will pay a salary range of …
optimise BI dashboards and data products using Tableau, translating business needs into visual insights. Orchestrate and monitor data pipelines, ensuring data quality and timely delivery. Implement data quality checks, observability, and maintain data cataloging and lineage. Drive CI/CD practices using GitHub Actions or similar tools. Collaborate with cross-functional teams to improve platform capabilities and analytics maturity. Requirements …
Design schemas and pipelines across Postgres and MongoDB Run CI and CD, improve build times, handle deployments and rollbacks Collaborate with data and ML to productionise models Instrument for observability and own incidents end to end What you will bring 1+ year engineering with strong Python in production Hands-on Elasticsearch experience Solid SQL plus practical MongoDB CI and CD …
ETL/ELT workflows, and reporting environments Ensure system stability, uptime, and SLA compliance through proactive monitoring Lead incident management, root cause analysis, and production deployments Implement automation and observability to improve performance and reduce manual effort Manage L2 support issues and coordinate fixes with Engineering and DevOps teams Drive improvements in data quality, governance, and workflow efficiency Collaborate with …
scalable APIs in C#/.NET + Azure - Shape API standards, gateway strategy, versioning & authentication - Drive event-driven integrations and seamless third-party connectivity - Lead API performance, reliability, and observability improvements - Mentor engineers and influence architecture across multiple teams What you bring: - Deep C#/.NET experience in production systems - Strong REST API design + OpenAPI/Swagger knowledge - SQL …