CD Pipeline Development: Develop and maintain robust CI/CD pipelines for continuous integration and deployment of ML models and related infrastructure. Monitoring and Observability: Build and maintain comprehensive monitoring and alerting systems for our ML infrastructure and models, leveraging tools like Datadog to ensure system health and performance. Collaboration …
framework to support the development and operations of web applications. Desirable Skills: Serverless & Microservices: Experience with AWS Lambda, Azure Functions, and event-driven architectures. Observability & Monitoring: Familiarity with monitoring tools like Splunk, Datadog, or New Relic for enhanced visibility and observability. Networking: Knowledge of VPCs, VPNs, and load balancing in …
Develop a baseline monitoring and tooling concept for cloud to address the need for compliance infrastructure reporting within agile deliveries as part of our Observability strategy. Develop concepts and tools for chargeback and showback (Financial Instrumentation) in a multicloud context. Implement and mature a cloud forecasting and capacity management solution …
London, South East England, United Kingdom Hybrid / WFH Options
Premier Group
GitLab CI/Jenkins). Automate deployments and monitoring for multiple environments. Implement Infrastructure as Code using Terraform. Manage containerised environments with Docker & Kubernetes. Enhance observability with tools like Prometheus, Grafana, and Datadog. Collaborate closely with developers, testers, and platform teams. 🧰 Tech Stack You'll Use: Cloud: AWS (core services: EC2 …
London, South East England, United Kingdom Hybrid / WFH Options
LHH
or CloudFormation. Implement CI/CD pipelines, enabling continuous integration and continuous deployment for mission-critical applications. Monitor system performance, availability, and security, implementing observability best practices. Work in an Agile environment, engaging with stakeholders to understand requirements and deliver iterative improvements. Your skills and experience Essential: Experience deploying and …
London, South East England, United Kingdom Hybrid / WFH Options
Digital Skills ltd
level experience in AWS Networking/TCP/Firewalls/Certs. Advanced proficiency with containers and container orchestration tools such as Docker and Kubernetes. Observability champion, experience in designing and building monitoring and logging tools such as CloudWatch, ELK, and Grafana. Strong scripting skills in Bash, JavaScript or similar. Knowledge …
high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). …
Altrincham, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Leigh, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Bolton, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Bury, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
London, South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
London (West End), South East England, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Ashton-Under-Lyne, Greater Manchester, United Kingdom Hybrid / WFH Options
Future Talent Group
Promote DevOps culture by leading knowledge-sharing sessions and supporting issue resolution. Skills: Strong grounding in SRE principles and operational best practices. Proficient with observability tools (Prometheus, Grafana, OTEL, CloudWatch) and telemetry pipelines. Solid programming skills in Python and/or Go; Java experience a plus. Hands-on AWS expertise …
Greater Bristol Area, United Kingdom Hybrid / WFH Options
Searchability NS&D
CD pipelines (Jenkins, GitHub Actions, GitLab CI/CD) and automation tools like Terraform and Ansible. Programming: Proficiency in Python, Go, or Ruby. Monitoring & Observability: Hands-on experience with Prometheus, Grafana, ELK Stack, or similar technologies. Core Attributes: A passion for solving complex technical challenges in high-availability production environments …
enhance internal DevOps culture, tooling, and CI/CD processes. Collaborate cross-functionally to continuously innovate and improve development workflows and system operations. Foster observability and reliability across live systems through best-in-class monitoring and automation. Day to Day: Collaborate with engineers and architects to define and implement cloud …
Networks, ExpressRoute, VPNs, NSGs, and Azure Firewall for secure connectivity. Integrate hybrid cloud solutions using Azure Arc and hybrid connectivity strategies. Monitoring & Resilience: Implement observability using Azure Monitor, Log Analytics, App Insights, and Prometheus/Grafana. Design for high availability (HA), disaster recovery (DR), and business continuity (BCP). …
secure or regulated environments (e.g. Defence, Government, Critical National Infrastructure). Desirable: Familiarity with cloud platforms such as AWS, Azure, or OpenStack. Experience with observability tooling (e.g. Prometheus, Grafana, ELK stack). Exposure to infrastructure security principles and compliance frameworks. What's in It for You: Salary from £80,000+ …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Modix International
Actions). Strong troubleshooting skills for cloud infrastructure and application performance. Knowledge of cloud security, compliance, and identity management. Experience with monitoring and observability tools (New Relic, Splunk). A continuous improvement mindset and a desire to optimize systems for security, performance, and cost. AWS Certifications (e.g., AWS Certified …
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown Asset Management Limited
end to end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Understanding of Microservices & principles of RESTful API development, including structuring, documenting …
Bash, or Python. Solid understanding of Linux systems, networking, routing, and firewall configurations. A deep grasp of AWS operational best practices, particularly in monitoring, observability, and FinOps. Expertise in Infrastructure as Code (IaC) tools such as CloudFormation, CDK, and Terraform. Additionally, it would be advantageous to have experience with: AWS …
environments (e.g. Docker), and IaC tools like Terraform and Ansible for infrastructure performance and cost efficiency. • Implement best practices in DevOps and DevSecOps, including observability, security, networking, API integration, and disaster recovery. • Mentor junior engineers and contribute technical leadership, ideally with experience in broadcast workflows, audio/video streaming, and …