Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team Contribute to solution architecture and strategic technical direction Build, integrate, and maintain REST APIs and backend services Champion best practices in software quality, CI/CD, observability, and DevOps Collaborate with cross-functional teams including Product, QA, and DevOps Optionally take on people management responsibilities for engineers Stay updated with emerging backend and cloud technologies Key Skills More ❯
AWS and Azure. Build and optimize CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins. Automate everything with Terraform, Bicep, and scripting (PowerShell, Bash, Python). Drive observability with tools like Datadog, LogicMonitor, CloudWatch, and Grafana. Champion cloud security, IAM, RBAC, and compliance best practices. Collaborate across teams, mentor peers, and contribute to a culture of continuous improvement. More ❯
development for enterprise solutions. Experience with multiprocessing, async I/O, and performance profiling. Unit testing, performance testing, and BDD. Understanding of OAuth 2.0 and secure authorization. Proficiency with observability tools (Grafana, Prometheus, etc.). DevOps and CI/CD (Jenkins, GitOps). Strong communication and collaboration skills. Understanding of deep learning and ML frameworks (TensorFlow, PyTorch). Secure coding More ❯
Education: Bachelor's degree or equivalent and/or appropriate experience Experience: 8+ years of experience in virtualization, containerization, build, and deployment Extensive experience with SCM, CI/CD, instrumentation, and observability tools Proficiency with Git, GitHub, Azure DevOps Proficiency with major IaC technologies such as Terraform, Pulumi, or Bicep Proficiency with programming and scripting languages such as C#, PowerShell, Python, and More ❯
the evolution of our platform's microservices ecosystem. What You'll Do Architect, build, and maintain scalable Python microservices deployed in cloud environments Lead architectural decisions focusing on performance, observability, fault tolerance, and scalability Own complex backend features end-to-end: design, implement, test, deploy, and monitor Mentor and guide engineers through code reviews, design discussions, and best practices Collaborate More ❯
integrate with CI/CD pipelines. Infrastructure as Code (IaC): Hands-on experience using Terraform for provisioning and managing cloud infrastructure. Proficient in version control, particularly with GitHub. Monitoring & Observability: Proficient with monitoring and alerting tools (e.g., Prometheus, Grafana, CloudWatch) to track pipeline and infrastructure health. Strong troubleshooting skills for resolving CI/CD pipeline issues and optimizing pipeline performance. More ❯
AWS, Azure, or Google Cloud. Familiarity with database systems, data modelling, and SQL/NoSQL technologies. Comfortable working with a range of open-source tools and frameworks. Experience with observability tools like Datadog, Prometheus, or Stackdriver. Knowledge of test automation frameworks and practices. Why This Role Stands Out No sales responsibilities or forced management track: grow deeply in your technical More ❯
or Bash Familiar with ML lifecycle tools, model monitoring, and versioning Exposure to tools like KServe, Ray Serve, Triton, or vLLM is a big plus Bonus Points Experience with observability frameworks like Prometheus or OpenTelemetry Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace Exposure to Azure or GCP Passion for financial services Qualifications Degree in Computer Science, Engineering, Data Science, or More ❯
Architectures (Kafka). Collaborate with DevOps teams to implement CI/CD pipelines and infrastructure as code using tools like Terraform, CloudFormation, and Ansible. Implement and manage monitoring and observability tools such as Datadog. Ensure real-time logging, alerting, and troubleshooting capabilities. Collaboration & Stakeholder Management: Work closely with business units, developers, and IT teams to understand requirements and translate them More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
Java or Python. Deep understanding of AWS or other cloud providers (e.g. GCP, Azure). Strong understanding of key security technologies and protocols such as TLS, OAuth and SPIFFE. Observability, alerting, metrics collection and visualisation (e.g. Prometheus, Grafana, Elasticsearch, Dynatrace). "Nice To Have" Skills and Experience: We would be even more impressed if you are passionate about the following More ❯
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
end tests. Ability to write and understand design documentation using C4, sequence diagrams and workflows. Excellent problem-solving skills and attention to detail. Solid understanding of logging, monitoring and observability to understand if software is functioning as required. Strong communication and teamwork skills. Preferred Skills: Experience with cloud platforms e.g., AWS, Azure, Google Cloud. Knowledge of DevOps practices and CI More ❯
Apache NiFi in a containerized environment Experience creating data partitioning strategies and monitoring topics for performance Experience deploying and upgrading Kafka clusters in high availability containerized environments Experience utilizing observability platforms, including Prometheus, Grafana, or Elastic to configure monitoring for data pipelines to ensure high availability and throughput, low latency, and alerting Knowledge of stream processing pipelines and analytics Secret More ❯
with development and platform teams to deliver scalable, secure solutions. Automating infrastructure using tools like Terraform, ARM/Bicep, and scripting languages. Monitoring system performance and troubleshooting issues using observability tools. Supporting cloud services (e.g., EC2, S3, RDS, Azure VMs, Azure Functions). Maintaining and improving system documentation and operational procedures. Mentoring team members and contributing to a culture of More ❯
Watford, Hertfordshire, United Kingdom Hybrid / WFH Options
Wickes
continuous improvement and team velocity. You'll have a deep understanding of modern cloud ecosystems, with extensive hands-on experience in Amazon Web Services (AWS). Familiarity with modern observability concepts and tools, including Datadog, and proven experience with the "platform as a product" model and driving adoption of internal tools. Strong familiarity with CI/CD principles and pipelines More ❯
management best practices Shows the ability to break down large technical concepts into effective communication with stakeholders from across the organization Extensive knowledge of networking best practices, tools, and observability Experience developing and deploying automated service configuration at the edge (e.g. CDN configuration, certificate renewal) Work consulting with a team being able to advise on their technology, workflows, dev tooling More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
WorksHub
the infrastructure and deployment of those applications. We are actively expanding our Manchester-born SRE function, which aims to advance our knowledge and innovation globally in areas such as Observability, Reliability and Availability. We have the autonomy to choose the technologies and processes that help us achieve our objectives, so each team leverages the technology that fits their needs best. More ❯
Build evaluation pipelines to benchmark LLM performance and continuously monitor production accuracy and relevance. Work across the ML stack, from data preparation and model training to serving and observability, either independently or in collaboration with other specialists. Optimize model pipelines for latency, scalability, and cost-efficiency, and support real-time and batch inference needs. Collaborate with MLOps, DevOps, and More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end-to-end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Experience with integration and onboarding third-party vendors, meeting with vendor engineering contacts, defining integration patterns More ❯
infrastructure environments, including coding, testing, and certifying technology platforms, software, and applications, as well as coach and mentor team members. THE IMPACT YOU WILL MAKE The Lead AWS Monitoring & Observability Engineer - AWS & APM Tools role will offer you the flexibility to make each day your own, while working alongside people who care so that you can deliver on the following … the team in addressing identified issues. Qualifications THE EXPERIENCE YOU BRING TO THE TEAM Minimum Required Experiences 6 years 4 years of hands-on experience managing the Monitoring and Observability platform using Splunk/Dynatrace/OpenTelemetry/AWS CloudWatch in large-scale Linux and Windows Server environments. Experience in generating and using complex database queries. Skilled in … Python programming. Desired Experiences Bachelor's degree or equivalent 4+ years of hands-on experience managing the Monitoring and Observability platform using Splunk/Dynatrace/OpenTelemetry/AWS CloudWatch in large-scale Linux and Windows Server environments on-premises and AWS. Experience supporting mission-critical platforms in an on-call setting. AWS/Linux/Windows/Other More ❯
creation by building world-class audio infrastructure for our customers. As a Site Reliability Engineer, you'll play a key role in improving our platform's developer operations, including observability, monitoring, and overall reliability. You will be part of a cross-functional team dedicated to implementing robust DevOps practices and enhancing infrastructure and site reliability engineering (SRE). A customer … focused mindset is essential, as the team collaborates closely with stakeholders to ensure solutions meet business and user needs. In addition to a focus on observability, you will contribute hands-on by developing features, automating workflows, and supporting the deployment of advanced machine-learning models. Strong communication skills are vital for working effectively with engineers, product teams, and stakeholders across … about CI/CD to these engineers Identifying and resolving security issues Automating tests and supporting our engineers on building great software Minimum qualifications: Strong experience with monitoring/observability tools (Grafana, Prometheus, or similar) Proficiency in Python, Docker, Kubernetes, and CI/CD pipelines Hands-on cloud experience (AWS or similar) A passion for designing and implementing scalable observability More ❯