Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
London, England, United Kingdom Hybrid / WFH Options
Experis
AWS and Azure. Build and optimize CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins. Automate everything with Terraform, Bicep, and scripting (PowerShell, Bash, Python). Drive observability with tools like Datadog, LogicMonitor, CloudWatch, and Grafana. Champion cloud security, IAM, RBAC, and compliance best practices. Collaborate across teams, mentor peers, and contribute to a culture of continuous improvement. More ❯
development for enterprise solutions. Experience with multiprocessing, async I/O, and performance profiling. Unit testing, performance testing, and BDD. Understanding of OAuth 2.0 and secure authorization. Proficiency with observability tools (Grafana, Prometheus, etc.). DevOps and CI/CD (Jenkins, GitOps). Strong communication and collaboration skills. Understanding of deep learning and ML frameworks (TensorFlow, PyTorch). Secure coding More ❯
Education: Bachelor's degree or equivalent and/or appropriate experience Experience: 8+ years of experience in virtualization, containerization, build, and deployment Extensive experience with SCM, CI/CD, instrumentation, and observability tools Proficiency with Git, GitHub, Azure DevOps Proficiency with major IaC technologies such as Terraform, Pulumi, or Bicep Proficiency with programming and scripting languages such as C#, PowerShell, Python, and More ❯
the evolution of our platform's microservices ecosystem. What You'll Do Architect, build, and maintain scalable Python microservices deployed in cloud environments Lead architectural decisions focusing on performance, observability, fault tolerance, and scalability Own complex backend features end-to-end: design, implement, test, deploy, and monitor Mentor and guide engineers through code reviews, design discussions, and best practices Collaborate More ❯
integrate with CI/CD pipelines. Infrastructure as Code (IaC): Hands-on experience using Terraform for provisioning and managing cloud infrastructure. Proficient in version control, particularly with GitHub. Monitoring & Observability: Proficient with monitoring and alerting tools (e.g., Prometheus, Grafana, CloudWatch) to track pipeline and infrastructure health. Strong troubleshooting skills for resolving CI/CD pipeline issues and optimizing pipeline performance. More ❯
AWS, Azure, or Google Cloud. Familiarity with database systems, data modelling, and SQL/NoSQL technologies. Comfortable working with a range of open-source tools and frameworks. Experience with observability tools like DataDog, Prometheus, or StackDriver. Knowledge of test automation frameworks and practices. Why This Role Stands Out No sales responsibilities or forced management track—grow deeply in your technical More ❯
or Bash Familiar with ML lifecycle tools, model monitoring, and versioning Exposure to tools like KServe, Ray Serve, Triton, or vLLM is a big plus Bonus Points Experience with observability frameworks like Prometheus or OpenTelemetry Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace Exposure to Azure or GCP Passion for financial services Qualifications Degree in Computer Science, Engineering, Data Science, or More ❯
with development and platform teams to deliver scalable, secure solutions. Automating infrastructure using tools like Terraform, ARM/BICEP, and scripting languages. Monitoring system performance and troubleshooting issues using observability tools. Supporting cloud services (e.g., EC2, S3, RDS, Azure VMs, Azure Functions). Maintaining and improving system documentation and operational procedures. Mentor team members and contribute to a culture of More ❯
Architectures (Kafka). Collaborate with DevOps teams to implement CI/CD pipelines and infrastructure as code using tools like Terraform, CloudFormation, and Ansible. Implement and manage monitoring and observability tools such as Datadog. Ensure real-time logging, alerting, and troubleshooting capabilities. Collaboration & Stakeholder Management: Work closely with business units, developers, and IT teams to understand requirements and translate them More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
Java or Python. Deep understanding of AWS or other cloud providers (e.g. GCP, Azure). Strong understanding of key security technologies and protocols such as TLS, OAuth and SPIFFE. Observability, alerting, metrics collection and visualisation (e.g. Prometheus, Grafana, Elasticsearch, Dynatrace). "Nice To Have" Skills and Experience: We would be even more impressed if you are passionate about the following More ❯
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
end tests. Ability to write and understand design documentation using C4, sequence diagrams and workflows. Excellent problem-solving skills and attention to detail. Solid understanding of logging, monitoring and observability to understand if software is functioning as required. Strong communication and teamwork skills. Preferred Skills: Experience with cloud platforms (e.g., AWS, Azure, Google Cloud). Knowledge of DevOps practices and CI More ❯
management best practices Shows the ability to break down large technical concepts into effective communication with stakeholders from across the organization Extensive knowledge of networking best practices, tools, and observability Experience developing and deploying automated service configuration at the edge (e.g. CDN configuration, certificate renewal) Work consulting with a team being able to advise on their technology, workflows, dev tooling More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
WorksHub
the infrastructure and deployment of those applications. We are actively expanding our Manchester-born SRE function, which aims to advance our knowledge and innovation globally in areas such as Observability, Reliability and Availability. We have the autonomy to choose the technologies and processes that help us achieve our objectives, so each team leverages the technology that fits their needs best. More ❯
Build evaluation pipelines to benchmark LLM performance and continuously monitor production accuracy and relevance. Work across the ML stack, from data preparation and model training to serving and observability, either independently or in collaboration with other specialists. Optimize model pipelines for latency, scalability, and cost-efficiency, and support real-time and batch inference needs. Collaborate with MLOps, DevOps, and More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end-to-end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Experience with integration and onboarding third-party vendors, meeting with vendor engineering contacts, defining integration patterns More ❯
Platform Engineer - Kubernetes - Openshift - Python - AI/ML - 12 month contract - Glasgow Hybrid (3 days onsite) I am looking for an experienced Platform Engineer to support the design and rollout of a next-generation AI development environment for our international More ❯
creation by building world-class audio infrastructure for our customers. As a Site Reliability Engineer, you'll play a key role in improving our platform's developer operations, including observability, monitoring, and overall reliability. You will be part of a cross-functional team dedicated to implementing robust DevOps practices and enhancing infrastructure and site reliability engineering (SRE). A customer … focused mindset is essential, as the team collaborates closely with stakeholders to ensure solutions meet business and user needs. In addition to a focus on observability, you will contribute hands-on by developing features, automating workflows, and supporting the deployment of advanced machine-learning models. Strong communication skills are vital for working effectively with engineers, product teams, and stakeholders across … about CI/CD to these engineers Identifying and resolving security issues Automating tests and supporting our engineers on building great software Minimum qualifications: Strong experience with monitoring/observability tools (Grafana, Prometheus, or similar) Proficiency in Python, Docker, Kubernetes, and CI/CD pipelines Hands-on cloud experience (AWS or similar) A passion for designing and implementing scalable observability More ❯
area of the product component or the system in aggregate and at scale. Specific domains include Workload Management (Kubernetes, Ray, and so on); Cloud Development (Cloud Infrastructure Automation); Management & Observability (open source and commercial monitoring, observability and DCIM solutions) Skills and Experience Essential Strong relevant programming experience Python/Go/C infrastructure-as-code scripting or related to the … of the products under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, DataDog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such More ❯