etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, Filebeat etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or similar languages The following More ❯
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
Robert Walters
of cloud infrastructure and applications on Google Cloud Platform. You will work collaboratively with engineering and infrastructure teams to implement site reliability engineering (SRE) principles, focusing on system reliability, observability, automation, and operational excellence. This role follows a hybrid working model, requiring attendance at the Bristol office for at least two days per week or 40% of the working time. … objectives (SLOs), indicators (SLIs), and monitoring practices Hands-on experience with infrastructure as code (e.g., Terraform) and CI/CD tools (e.g., Jenkins, Azure DevOps) Desirable Knowledge Familiarity with observability and performance tools such as Dynatrace, Stackdriver, Cloud Monitoring, or similar Exposure to cost monitoring, logging frameworks, and cloud consumption analytics Personal Attributes Ability to mentor and support engineers in More ❯
ensure code quality and reliability; Experience of working with Docker for containerisation and application packaging; Experience of implementing and managing monitoring solutions, with experience in Prometheus and Grafana for observability and alerting; Experience of implementing and managing robust security practices, including Encryption (TLS) and Secret Management in the Cloud; Experience of leveraging the GitLab API for advanced automation, integration, and reporting More ❯
Whitehall, Bristol, United Kingdom Hybrid / WFH Options
dotdigital
in multiple regions Comfortable working with versioning tools (Git, Azure DevOps, GitHub) Excellent CI/CD pipeline authoring. Automation is key after all. We use Azure DevOps. Experience in Observability and Reporting tools and their components (think Elasticsearch, Grafana, Prometheus, Thanos, Raygun) Know your way around Linux and Windows OS. If you come from a sysadmin background this is cool More ❯
teams to build cost-effective solutions on GCP while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a FinOps culture. What you'll do Partner with engineering, finance and product teams to drive cost-efficiency across GCP Design and implement automation to boost cost optimisation … had GCP certifications (e.g. Professional Cloud DevOps Engineer, Professional Cloud Architect) FinOps Foundation certifications (e.g. Practitioner, Engineer) Familiarity with security tools e.g. HashiCorp Vault, Aquasec, Nexus IQ. Knowledge of observability tools e.g. Dynatrace. Experience in cost management tools e.g. Cloudability. About working for us Our focus is to ensure we're inclusive every day, building an organisation that reflects modern More ❯
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Curo Resourcing Ltd
domain adjacent technologies/services, such as: Docker, OpenShift, Kubernetes etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Excellent knowledge of YAML or similar languages The following Technical Skills & Experience would be desirable More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end-to-end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Understanding of Microservices & principles of RESTful API development, including structuring, documenting, versioning, testing and stubbing/ More ❯
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Leidos
and managing backup, recovery, and disaster recovery strategies to ensure data protection and business continuity Ability to implement robust monitoring and logging solutions e.g., CloudWatch, to ensure system reliability, observability, and proactive incident response Comfortable working in Agile development teams, translating business requirements into technical solutions, and actively participating in sprint planning, retrospectives, and daily stand-ups Capability to design More ❯
South West London, London, England, United Kingdom
Oscar Technology
experienced Site Reliability Engineer (SRE) to join them on a 6-month contract (outside IR35). You'll be leading efforts across AWS and Azure Cloud environments, focusing on automation, observability, infrastructure as code and performance at scale. Stakeholder engagement and strong communication are essential in this role, so if you've been in a start-up/smaller team, this … scripting (Python, Bash, PowerShell), and cloud architecture Comfortable with containerisation and orchestration (Docker, Kubernetes) Understanding of networking, DNS, IAM, and load balancing in cloud environments Hands-on experience with observability tooling and production-level troubleshooting If this sounds like you, it's a great opportunity so apply now! Site Reliability Engineer - AWS/Azure | Outside IR35 | £450-500/day More ❯
collaboration skills, with the ability to influence and align diverse teams on a shared vision. Knowledge of DevOps practices and tools CI/CD pipelines. Knowledge of Monitoring and Observability tooling. In addition, any experience of these would be useful: Familiarity with data mesh concepts (such as ownership based on specific areas, and thinking about data products). Expertise in More ❯
Site Reliability/DevOps Engineer London - 5 Days Onsite Up to £550 per day (Umbrella, Inside IR35) 12-Month Contract Must hold live and transferable DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join this More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Twinstream Limited
Socials & Events Cycle to Work Scheme & Life Assurance Key Responsibilities of the Site Reliability Engineer: Work closely with engineers and sysadmins to increase performance and reduce toil Advance system observability, monitoring and alerting Automate, troubleshoot, and proactively resolve issues before they escalate Improve development environments to meet delivery and quality targets Research and evaluate tools and platforms to support scale More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Excited to grow your career? Our purpose is to empower people to save and invest with confidence. We are looking for great people to join us, so please come and invest in YOUR future at HL. We know that sometimes More ❯
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
services. You will be working with multiple feature development teams and the BAU/Support team to define and evolve our cloud & on-prem infrastructure & delivery pipelines, improving system observability, demonstrating performance and capacity improvements and proactively identifying and mitigating reliability risks. Key Responsibilities of the Site Reliability Engineer: Collaborate with Software Engineers to improve reliability and performance in their … subsystems Partner with System Administrators in automating toil and eliminating alerts Evolve observability and monitoring capabilities to identify and solve problems before they impact the business Support development environments to help us achieve our delivery and quality goals Research and evaluate technologies, tools and services to influence buy-vs-build decisions Develop expertise in diverse technical and business domains Expand … in one of our platform languages (Java, Go, Python or similar) Knowledge of cross domain principles & technologies Experience of working in a service management environment Practical applications of using observability patterns in previous systems Creating and monitoring system availability metrics and using those to drive work that reduces downtime There are many great reasons to join our team! Pension Plan More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end-to-end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Experience with integration and onboarding third-party vendors, meeting with vendor engineering contacts, defining integration patterns More ❯
area of the product component or the system in aggregate and at scale. Specific domains include Workload Management (Kubernetes, Ray, and so on); Cloud Development (Cloud Infrastructure Automation); Management & Observability (open source and commercial monitoring, observability and DCIM solutions) Skills and Experience Essential Strong relevant programming experience Python/Go/C infrastructure-as-code scripting or related to the … of the products under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, DataDog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such More ❯
is seeking a dedicated DevOps Engineer Apprentice to bolster their NHS project team. In this role, the chosen candidate will be instrumental in enhancing the incident management protocols, advancing observability and monitoring strategies, and refining CI/CD practices within the AWS ecosystem. Role Collaborating with cross-functional teams to ensure smooth and reliable incident management using Jira and Service … Now. Developing and implementing observability and monitoring solutions to ensure high system availability and performance. Contributing to maintaining and improving CI/CD pipelines, ensuring efficient code integration and deployment on AWS. Supporting the design and execution of automated test strategies to enhance the quality and security of cloud-based applications. Training Why choose our DevOps Engineer Level 4 apprenticeship More ❯
Bristol, Avon, England, United Kingdom Hybrid / WFH Options
interAct Consulting Limited
as-Code (IaC). Experience of Configuration-as-Code, Containerisation and Orchestration, CI/CD. Proficiency with Kubernetes, Docker and AKS. Familiarity with Azure cloud-native services. Knowledge of observability and site-reliability engineering principles. Proficiency in SQL and experience working with relational databases. This is a fully remote (UK only) position within a fabulous team. Lots of flexibility, opportunity More ❯
learning, knowledge sharing and continuous improvement. You have a passion for DevOps and Platform as a Service. Understanding of security and compliance requirements related to platform infrastructure. Experience with observability practices and tooling, incident management processes and driving operational excellence. Diversity, Equity and Inclusion If you're excited about this role but your experience doesn't align perfectly, we encourage More ❯
practices (Agile, Scrum, Kanban) Proficiency in CI/CD pipelines, infrastructure as code, and cloud data tooling Familiarity with data governance, privacy, and security principles Experience using metrics and observability tools to monitor data platform health and team performance Experience in performance management and setting measurable goals for team members This role isn't for you if. You rely on More ❯
domain. Experience in a strongly/statically typed language. Have a strong understanding of designing, building, and running high-quality, standards-compliant workflow APIs, with a focus on testing, observability, and performance. Have worked with a cloud provider (AWS/Azure/GCP). Have worked with distributed systems and are comfortable debugging through tracing and observability. Willing to be More ❯
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays Up to 5% employer pension More ❯
a live service for users Experience with understanding network architectures and troubleshooting network-related issues using Linux tools In-depth expertise in at least one of: Kubernetes, Terraform, Networking, Observability Flexibility and mobility are required to deliver this role as there may be requirements to spend time onsite with our clients and partners to enable delivery of the first-class More ❯