Bradley Stoke, South West England, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown Asset Management Limited
Excited to grow your career? Our purpose is to empower people to save and invest with confidence. We are looking for great people to join us, so please come and invest in YOUR future at HL. We know that sometimes More ❯
on experience) + Equity + Private Healthcare Start: Immediate start and interview process Requirements: Extensive experience with Ruby Strong background in platform engineering (DevOps, Observability, infrastructure) Familiarity with Rails FE is a nice to have (Hotwire, Stimulus, Turbo - nice to have) Collaborative engineering culture More ❯
wrangles Kubernetes like a pro, and makes DevOps teams breathe a sigh of relief. Your day might include onboarding a new customer, crafting smart observability strategies, and somehow explaining complex concepts to a C-level exec without using interpretive dance (unless that’s your thing, then we support it). More ❯
London (City of London), South East England, United Kingdom
Selby Jennings
services that power a research platform used by quants and engineers across the business Designing orchestration systems to manage job scheduling, compute clusters, and observability across multiple environments Collaborating with cross-functional teams to deliver a secure and seamless platform experience Helping shape the direction of a platform that supports More ❯
host on AWS and include services such as Lambda, S3, RDS, Cloudfront, Cognito, Appsync, CFT and CDK. We use 3rd party tooling for messaging, observability, product metrics & insights. More ❯
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Harrington Starr
using Git and Python. Implement Infrastructure as Code practices using Terraform. Manage containerised environments with Docker or Kubernetes. Collaborate with dev teams to improve observability, deployment processes, and platform reliability. Build observability and monitoring solutions using Grafana, integrating key metrics to support proactive platform operations. Create and enforce internal standards More ❯
ML infrastructure: model deployment, training pipelines, inference tooling. Diagnose and optimise performance of large-scale ML models. Build and maintain experiment tracking, monitoring, and observability systems. Collaborate with SWE and infra colleagues to build tooling for data access, cleaning, and delivery. Contribute to the internal “toolbox” enabling repeatable, scalable ML More ❯
Vitals for optimal performance. Integrate third-party software into the platform, including tag management using Google Tag Manager (GTM). Improve and maintain platform observability tools and systems. Manage and enhance automated CI/CD pipelines for efficient and reliable deployments. Ensure sites are accessible to all users, meeting WCAG More ❯
blockers to keep teams moving forward Champion engineering best practices including clean code, secure design, CI/CD, and test automation Oversee infrastructure resilience, observability, and incident response processes Act as the bridge between Product and Engineering teams. Provide transparency on engineering progress, challenges, and decisions to senior stakeholders. Key More ❯
value Integrate AI models into operational workflows Ensure reliability through fail-safes, self-healing, and fallback mechanisms Monitor & improve AI performance with feedback loops and observability tools Collaborate with Data Engineers to ensure AI has accurate, real-time data Implement human-in-the-loop systems where needed Skills and Qualifications Experience More ❯
Data Science, Platform, and Product teams to ensure AI and ML models are deployed, monitored, and maintained effectively. Promote best practices in model deployment, observability, CI/CD for ML, and responsible AI principles, helping engineers embed these into their workflows. Oversee the development and implementation of scalable AI and More ❯
software deployment and scalability. CI/CD Expertise: Automate software build, test, and deployment pipelines following agile methodologies. Terraform Exposure: Beneficial experience with Terraform. Observability Tools: Experience with Grafana and Splunk is beneficial, particularly in developing and applying an observability strategy across a large organisation. Learn More For more information More ❯
Key responsibilities include integrating external supplier APIs, implementing Software Reliability Engineering (SRE) best practices, and ensuring seamless collaboration across teams. The team enhances resilience, observability, incident management, and disaster recovery (DR) practices while working closely with Peri Pantry, Stock Management, and Accounting, Banking, and Property (ABP) teams. Key Responsibilities Technical … to align technology decisions with business needs. Solution Design: Ensure the right technologies and architectures are used to enhance system performance, maintainability, and security. Observability & Resilience: Establish best practices for monitoring, incident response, and disaster recovery. Best Practices & Governance: Define engineering standards and drive their adoption across teams. Vendor & API … ability to drive initiatives with a strategic mindset. Ability to communicate effectively with technical and non-technical stakeholders, ensuring alignment across teams. Experience improving observability, monitoring, and incident response processes. Security-first mindset, focusing on least privilege access, automated secrets management, and compliance automation. Why Join Us? As a More ❯
Sheffield, Burngreave, South Yorkshire, United Kingdom Hybrid / WFH Options
Ada Meher
projects simultaneously using Agile practices. The ideal candidate will also have knowledge around or an interest in learning other key DevOps areas such as observability, CI/CD pipeline development and config management. The company have a personal development budget available to all staff for such courses and accreditations, to … services and architecture Strong experience working with Terraform (or other IaC technology) Proven team leadership experience Experience working with CI/CD pipelines (Jenkins), Observability (Grafana) & Configuration Management (Ansible, Chef, Puppet) Excellent communication skills are a must Along with an excellent work/life balance, this company also offer a More ❯
Employment Type: Permanent
Salary: £70,000 - £75,000/annum Flexible/Remote Working | AWS, Terra
ABOUT ORGANOX: OrganOx is an innovative, fast-paced, global medical device company with a mission to save lives by making every donated organ count. We are a commercial stage organ technology company, spun out of the University of Oxford in More ❯
in both support and engineering. What You’ll Do: Troubleshoot and resolve issues in live trading and analytics systems Monitor production systems and develop observability tools Build and enhance features in Python and C++ Manage configuration and deployment processes Support onboarding of new teams and systems What We’re Looking More ❯
reporting and security leads to ensure data platforms are meeting product needs to service client expectations. Guide teams to ensure a high degree of observability of data platform reliability and performance, working alongside the Head of Platform to enhance visibility of these metrics throughout the business. Drive innovation in related More ❯
by 100x to open bigger enterprise opportunities ensure fast queries, high availability, and low-latency data processing across the platform drive best practices around observability, ensuring data integrity and service uptime 💥 They're seeking someone with: a track record designing and building complex, cloud-native SaaS platforms confidence developing large More ❯
software deployment and scalability. CI/CD Expertise: Automate software build, test, and deployment pipelines following agile methodologies. Terraform Exposure: Beneficial experience with Terraform. Observability Tools: Experience with Grafana and Splunk is beneficial, particularly in developing and applying an observability strategy across a large organization. Learn More For more information More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Harrington Starr
and help build the next generation of scalable, cloud-native infrastructure. This role sits in a high-impact platform engineering team focused on automation, observability, and empowering development teams to ship faster and more securely. Why You Should Apply: Work with a forward-thinking, global financial firm Hybrid setup … and maintaining AWS-based infrastructure using Terraform Improving CI/CD pipelines with Python and Git workflows Supporting containerised environments (Docker/K8s) Driving observability with Grafana and proactive monitoring tools Enhancing developer experience through smart automation and tooling What We’re Looking For: 3 years of experience in Platform More ❯
Reigate, Surrey, United Kingdom Hybrid / WFH Options
Willis Towers Watson
in a product team to develop and support operationally resilient cloud infrastructure. The ideal candidate will have a track record in Microsoft Azure and Observability platforms in complex SaaS environments and have excellent communication skills. You will be joining our growing engineering organization building a wide range of market-leading … on high cadence and cost effectiveness Implement infrastructure as code with Pulumi Support the team in infrastructure and networking related issues Maintain and configure observability platforms such as Datadog Proactively monitor production and other environments to ensure stability, availability, security and integrity Participate in incident response, troubleshooting, and root cause … skills (PowerShell, Terraform, ARM, Pulumi, Bicep etc.) Experience of Microsoft Azure in areas such as networking, storage, integration, compute and analytics Experience of cloud observability concerns (logging, tracing, metrics, monitoring & alerting) Experience of Windows & Linux containers and orchestration platforms (Docker, Kubernetes) Strong interpersonal skills, with the ability to work effectively More ❯
ensuring the platform is stable To drive and own the Monitoring strategy, defining clear goals, objectives, and deliverables. Optimise and reduce operational overheads through observability and service automation. Lead the definition and tracking of Service Level Objectives (SLOs) to measure service availability in combination with service, product and engineering communities. Collaborate … to prioritize and manage multiple tasks in a fast-paced environment. Experience in software development, infrastructure, or operations roles Strong background/appreciation in observability principles, techniques and toolsets. Demonstrable knowledge of developing and managing RESTful API services written within a modern OO language such as Java or Python Knowledge … C# Understanding of, or experience working within, an Incident Management Process (ITSM) Desirable Requirements: AWS Linux - Debian, CentOS, Alpine and AWS Linux Terraform, Docker, Kubernetes, Git Observability/APM Platforms Jenkins, Nginx, MySQL Benefits We are actively committed to promoting a fully diverse and inclusive workforce and we welcome applications for this More ❯