Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Arm Limited
infrastructure "Nice To Have" Skills and Experience: Experience in a GitOps solution such as ArgoCD, Flux or Fleet Implementation of the Security Development Lifecycle (SDL) in infrastructure Monitoring and observability using Prometheus and Grafana, ELK stack or equivalent Use of Kubernetes management systems such as Rancher Familiarity with open source project development cycles and contribution processes, particularly around CI/ More ❯
multiple stakeholders including development teams to implement and maintain reliable and scalable systems while adhering to industry best practices and security standards. Responsibilities and Impact: Design, implement, and maintain observability solutions to track system health and performance. Analyze observability data to identify and troubleshoot potential issues proactively. Develop and implement alerts and notifications for critical events. Collaborate with development teams … in Computer Science, Information Technology, or a related field. 5+ years of experience as a Site Reliability Engineer or equivalent in a similar role. Proficient in application and infrastructure observability, Splunk OpenTelemetry preferred Experienced in production environments running in AWS Comfortable with Infrastructure as Code, Terraform is preferred Comfortable with CI/CD pipelines such as GitHub Actions, Azure DevOps More ❯
or DevOps Expertise in microservices and API design Docker, and container runtime platforms such as Kubernetes, EKS, ECS etc Strong understanding of operational concepts on AWS, particularly monitoring and observability, FinOps Utilising CI/CD tools, such as Bamboo, Jenkins, TeamCity, Bitbucket, in order to streamline delivery of new features and fixes Continual testing of code using Automated Testing Frameworks More ❯
or DevOps Expertise in microservices and API design Docker, and container runtime platforms such as Kubernetes, EKS, ECS etc Strong understanding of operational concepts on AWS, particularly monitoring and observability, FinOps Utilising CI/CD tools, such as Bamboo, Jenkins, TeamCity, Bitbucket, in order to streamline delivery of new features and fixes Continual testing of code using Automated Testing Frameworks A More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
BAE Systems (New)
or DevOps Expertise in microservices and API design Docker, and container runtime platforms such as Kubernetes, EKS, ECS etc Strong understanding of operational concepts on AWS, particularly monitoring and observability, FinOps Utilising CI/CD tools, such as Bamboo, Jenkins, TeamCity, Bitbucket, in order to streamline delivery of new features and fixes Continual testing of code using Automated Testing Frameworks A More ❯
ll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices like GitOps, Infrastructure as Code, DevSecOps automation, and self-service enablement, to help development teams ship faster, safer, and more cost-efficiently. What you … ll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through platform tools, reusable Terraform modules, and self-service infrastructure Enhancing CI/CD pipelines (Azure DevOps, YAML-based) with security … knowledge (AKS, Functions, SQL, Cosmos DB, etc.) Strong Infrastructure as Code skills with Terraform (v1.7+) Experience with CI/CD pipelines, GitOps, and automation tools (PowerShell, Bash) Familiarity with observability and incident tools like Datadog, ELK, and synthetic monitoring Solid understanding of networking (TCP/IP, Load Balancing, DNS, Routing) Good knowledge of DevSecOps practices - including security scanning, IAM, and More ❯
Manchester Area, United Kingdom Hybrid / WFH Options
Revolent Group
related processes like data migrations and environment setup. ✅ Preferred (Nice to Have): Banking/Financial Services knowledge, especially around wholesale lending and Loan IQ. Experience with monitoring and observability tools such as APPD, ELK Stack, or Grafana. Understanding of DevSecOps principles, including vulnerability scanning, secrets management, and compliance automation. Further experience with CI/CD integration and pipeline automation More ❯
position will align to a discipline where you will be expected to build and support solutions aligned with SDLC principles, providing technical excellence with a focus on scripting and observability coupled with a security mindset. What will you be doing day-to-day? Automation and Orchestration: Streamline the delivery and support processes by leveraging automation and IaC principles. Support and More ❯
Salford, Manchester, United Kingdom Hybrid / WFH Options
BBC Group and Public Services
/CD pipelines using GitHub Actions, AWS CodePipeline, Jenkins, and other tools, with an emphasis on reliability, reusability, and performance. Contribute to the design and integration of monitoring and observability solutions (CloudWatch, Prometheus, Grafana) to ensure infrastructure and model health. Champion software engineering excellence through Test-Driven Development (TDD), rigorous test automation, and continuous quality assurance practices. Support architectural decisions More ❯
Manchester, North West, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
WorksHub
the infrastructure and deployment of those applications. We are actively expanding our Manchester-born SRE function, which aims to advance our knowledge and innovation globally in areas such as Observability, Reliability and Availability. We have the autonomy to choose the technologies and processes that help us achieve our objectives, so each team leverages the technology that fits their needs best. More ❯
across multiple squads to ensure our platform is scalable, secure, and designed for rapid deployment and operational excellence. You'll contribute to the development and automation of cloud infrastructure, observability systems, CI/CD pipelines, and event-based services that power key parts of our product ecosystem. About Suits Me Suits Me is a multi-award-winning, ethical fintech dedicated … pipelines (e.g. GitHub Actions) to enable rapid and reliable delivery of services Contributing to the design of scalable and secure platform components that enable developer productivity Building and improving observability tooling (e.g. CloudWatch, Grafana) to support rapid detection and resolution of issues Collaborating with developers and stakeholders across squads to understand infrastructure needs and ensure best practices are applied Writing More ❯
stakeholders to define solutions Mentor junior developers and promote engineering best practices Drive improvements in development processes, CI/CD pipelines, and tooling Investigate and resolve production issues Ensure observability through logging, metrics, and diagnostics Contribute to event-driven architecture and distributed systems design What You Bring 5+ years of backend development experience Expertise in C#, .NET (preferably .NET 6+ More ❯
of practices (e.g., Cloud, Platforms, AI, Strategy, Custom Application Development, Network & Edge, Security, Resiliency, etc.) Articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps) and operations (e.g., observability, automated response, SRE, etc.) and able to articulate a path toward a target operating model (people, process, and tools) SoftServe is an Equal Opportunity Employer. All qualified applicants will receive More ❯
Manchester, North West, United Kingdom Hybrid / WFH Options
We Are Dcoded Limited
Tech Snapshot: Languages & Frameworks: C#, .NET Core Cloud: AWS (cloud-native, not lift-and-shift) Containerisation: Kubernetes IaC & Pipelines: Terraform, CI/CD APIs: GraphQL, REST Practices: TDD, Monitoring, Observability What You'll Be Doing: Building and maintaining distributed services in a high-traffic, cloud-native environment Leading the development of new features and contributing to architectural decisions Collaborating with Product More ❯
Own and evolve incident management, SLAs, and service performance Diagnose complex system issues including networking, deployment, and application-level debugging Drive automation, self-service, and tooling improvements (ticket workflows, observability, etc.) Liaise closely with Engineering and Product teams, feeding technical insights into the roadmap What You’ll Bring 5+ years in technical support leadership, ideally in a SaaS, cloud-native More ❯
Manchester, North West, United Kingdom Hybrid / WFH Options
Daniel James Resourcing Ltd
Design and implement robust, scalable, and secure backend services Contribute to strategic technical decisions around architecture and platform direction Embed engineering best practices across code quality, testing, deployment and observability Mentor and support the growth of team members, promoting a culture of continuous learning Tech Stack The role will suit someone who is confident in modern cloud-native development and More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
djr
and implement robust, scalable, and secure Back End services Contribute to strategic technical decisions around architecture and platform direction Embed engineering best practices across code quality, testing, deployment and observability Mentor and support the growth of team members, promoting a culture of continuous learning Tech Stack The role will suit someone who is confident in modern cloud-native development and More ❯
Manchester, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering … practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your … of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Excellent knowledge of programming languages including Python, Golang and JavaScript. Knowledge and experience of modern software development More ❯
Manchester, North West, United Kingdom Hybrid / WFH Options
InterQuest Group (UK) Limited
all aspects of the product they work on, from ideation through to development, testing and deployment, so you should expect to champion and mentor on best practice like TDD, Observability and IaC. Skills: C#, .NET Core, APIs AWS, Docker, Kubernetes, Terraform CI/CD, TDD, SOLID The money is good too - up to £90k plus benefits including hybrid working More ❯
Bolton, Greater Manchester, United Kingdom Hybrid / WFH Options
Superstars
+ Firehose: Traceability and telemetry routing Cloud Storage (S3, GCP buckets): Configuration, logs, traceability data Envoy Proxy + ALB + WAF: TLS termination, authentication, traffic routing CloudWatch/Dynatrace: Observability and alerting Terraform + GitLab CI/CD: Infrastructure as Code and deployment pipelines Custom UI for simulations/experiments (planned to support with GenAI features in the future) for More ❯
Leigh, Greater Manchester, United Kingdom Hybrid / WFH Options
Superstars
+ Firehose: Traceability and telemetry routing Cloud Storage (S3, GCP buckets): Configuration, logs, traceability data Envoy Proxy + ALB + WAF: TLS termination, authentication, traffic routing CloudWatch/Dynatrace: Observability and alerting Terraform + GitLab CI/CD: Infrastructure as Code and deployment pipelines Custom UI for simulations/experiments (planned to support with GenAI features in the future) for More ❯
Altrincham, Greater Manchester, United Kingdom Hybrid / WFH Options
Superstars
+ Firehose: Traceability and telemetry routing Cloud Storage (S3, GCP buckets): Configuration, logs, traceability data Envoy Proxy + ALB + WAF: TLS termination, authentication, traffic routing CloudWatch/Dynatrace: Observability and alerting Terraform + GitLab CI/CD: Infrastructure as Code and deployment pipelines Custom UI for simulations/experiments (planned to support with GenAI features in the future) for More ❯
Bury, Greater Manchester, United Kingdom Hybrid / WFH Options
Superstars
+ Firehose: Traceability and telemetry routing Cloud Storage (S3, GCP buckets): Configuration, logs, traceability data Envoy Proxy + ALB + WAF: TLS termination, authentication, traffic routing CloudWatch/Dynatrace: Observability and alerting Terraform + GitLab CI/CD: Infrastructure as Code and deployment pipelines Custom UI for simulations/experiments (planned to support with GenAI features in the future) for More ❯
Ashton-Under-Lyne, Greater Manchester, United Kingdom Hybrid / WFH Options
Superstars
+ Firehose: Traceability and telemetry routing Cloud Storage (S3, GCP buckets): Configuration, logs, traceability data Envoy Proxy + ALB + WAF: TLS termination, authentication, traffic routing CloudWatch/Dynatrace: Observability and alerting Terraform + GitLab CI/CD: Infrastructure as Code and deployment pipelines Custom UI for simulations/experiments (planned to support with GenAI features in the future) for More ❯