You may occasionally be required to travel for business. Looking for details about our benefits? You can learn more about them by clicking HERE. Description and Requirements: "At BMC trust is not just a word - it's a way of …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
Azure DevOps, YAML-based) with security scanning and progressive delivery Supporting AKS clusters and Azure services (SQL, Cosmos DB, ADF, Functions, Logic Apps, etc.) Improving monitoring and alerting with Datadog, Grafana, ELK, and proactive failure detection Participating in the on-call rota and leading incident response workflows and blameless postmortems Coaching engineers, upskilling teams, and contributing to a culture of … DB, etc.) Strong Infrastructure as Code skills with Terraform (v1.7+) Experience with CI/CD pipelines, GitOps, and automation tools (PowerShell, Bash) Familiarity with observability and incident tools like Datadog, ELK, and synthetic monitoring Solid understanding of networking (TCP/IP, Load Balancing, DNS, Routing) Good knowledge of DevSecOps practices - including security scanning, IAM, and RBAC Experience with FinOps - tagging … Familiarity with security scanning tools (Trivy, tfsec) integrated into pipelines A proactive approach to problem-solving, documentation, and coaching Additional bonus skills include experience with Azure governance tools, advanced Datadog capabilities, Kubernetes autoscaling solutions, GitOps workflows, automated cost dashboards, compliance frameworks, and internal platform development. What You Can Expect: Competitive salary: £70,000 - £80,000 depending on experience 25 days …
GCP, or Azure). Expert in CI/CD processes, containerization (Docker, Kubernetes), and a deep understanding of networking, distributed systems, and databases. Expert in monitoring and troubleshooting utilities (Datadog, Prometheus, Grafana, ELK stack, Splunk, Humio, etc.). Exceptional problem-solving skills and a detail-oriented mindset, coupled with outstanding communication abilities. Desirable: Experience with Azure, a background in autonomous …
test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, Datadog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such as Jenkins; GitHub or …
Watford, Hertfordshire, United Kingdom Hybrid / WFH Options
Wickes
You'll have a deep understanding of modern cloud ecosystems, with extensive hands-on experience in Amazon Web Services (AWS). Familiarity with modern observability concepts and tools, including Datadog, and proven experience with the "platform as a product" model and driving adoption of internal tools. Strong familiarity with CI/CD principles and pipelines (e.g., Jenkins, GitLab CI, CircleCI …
GCP, or Azure). Expert in CI/CD processes, containerization (Docker, Kubernetes), and a deep understanding of networking, distributed systems, and databases. Expert in monitoring and troubleshooting utilities (Datadog, Prometheus, Grafana, ELK stack, Splunk, Humio, etc.). Exceptional problem-solving skills and a detail-oriented mindset, coupled with outstanding communication abilities. Experience with Azure, a background in autonomous vehicles …
London, South East England, United Kingdom Hybrid / WFH Options
Wayve
GCP, or Azure). Expert in CI/CD processes, containerization (Docker, Kubernetes), and a deep understanding of networking, distributed systems, and databases. Expert in monitoring and troubleshooting utilities (Datadog, Prometheus, Grafana, ELK stack, Splunk, Humio, etc.). Exceptional problem-solving skills and a detail-oriented mindset, coupled with outstanding communication abilities. Experience with Azure, a background in autonomous vehicles …
TypeScript for Frontend. Our backend services are written in TypeScript and Kotlin. Frameworks and Libraries: We use React/Redux and WebAssembly. Monitoring and Logging: We are currently using Datadog for monitoring and logging. Metrics are collected across our agents, taken from the logs using metric filters, and updated directly from the Lambda function or the application. Infrastructure-as-Code: Most …
building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries, and scaling. Strong understanding …
and optimize CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins. Automate everything with Terraform, Bicep, and scripting (PowerShell, Bash, Python). Drive observability with tools like Datadog, LogicMonitor, CloudWatch, and Grafana. Champion cloud security, IAM, RBAC, and compliance best practices. Collaborate across teams, mentor peers, and contribute to a culture of continuous improvement. What You Bring: Proven …
Leeds, England, United Kingdom Hybrid / WFH Options
Anson McCade
frameworks Desirable Experience: Delivery of secure software in government, defence, or other regulated sectors; hands-on cloud-native development and deployment; knowledge of logging and monitoring tools such as Datadog, Prometheus, or StackDriver; experience working with product lifecycle tooling and engineering in complex domains. If you’re looking to focus on real engineering work that drives meaningful outcomes and want …
Leeds, West Yorkshire, England, United Kingdom Hybrid / WFH Options
Anson McCade Ltd - IT and Finance Recruitment
Nice to Have (But Not Essential): Cloud experience (AWS, Azure or GCP); solid grasp of databases and data modelling; familiarity with open-source tools and monitoring platforms (e.g., Prometheus, Datadog); experience with test automation frameworks and performance tools
test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, Datadog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such as Jenkins; GitHub or …
London, South East England, United Kingdom Hybrid / WFH Options
PeopleCheck
present past case studies and guide stakeholders Preferred Qualifications: Background in compliance or background-screening services; experience with microservices design and orchestration (Kubernetes, ECS); knowledge of advanced observability tools (Datadog, New Relic, ELK). Why Join Us? Impact: Help define, together with our tech lead, the technical roadmap of a mission-critical compliance platform. Ownership: Lead key initiatives end-to-end …
Wandsworth, Greater London, UK Hybrid / WFH Options
PeopleCheck
present past case studies and guide stakeholders Preferred Qualifications: Background in compliance or background-screening services; experience with microservices design and orchestration (Kubernetes, ECS); knowledge of advanced observability tools (Datadog, New Relic, ELK). Why Join Us? Impact: Help define, together with our tech lead, the technical roadmap of a mission-critical compliance platform. Ownership: Lead key initiatives end-to-end …
TypeScript for Frontend. Our backend services are written in TypeScript and Kotlin. Frameworks and Libraries: We use React/Redux and WebAssembly. Monitoring and Logging: We are currently using Datadog for monitoring and logging. Metrics are collected across our agents, taken from the logs using metric filters, and updated directly from the Lambda function or the application. Infrastructure-as-Code: Most …
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
Job description: RemoteStar is looking to hire a Senior Site Reliability Engineering Manager on behalf of our client based in the UK with a fully remote work policy. About Client: The client is building the B2B marketplace for diamonds. It's …
Portsmouth, Hampshire, United Kingdom Hybrid / WFH Options
Checkatrade
Senior Platform Engineer Experience in Cloud Native technologies? Come join us! Are you looking for a new role? We have an exciting opportunity at Checkatrade for a Senior Platform Engineer to join our mission of making home improvements easy by …
Docker, Helm, Python and Bash scripting. Supporting developers and other engineers with any pipeline issues. General management and operations of GitLab source control system. Monitoring and management of our Datadog instance, including log management and APM