Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8S, Terraform and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics, logs, traces and APM. Leadership & Global Operations: Proven success leading multi-regional or global technical teams with direct management of managers. Demonstrated ability to build …
Java 17 or Java 21. Strong understanding of Spring Boot and Spring Batch frameworks. Familiarity with test automation using Cucumber or similar frameworks. Exposure to observability tools such as Datadog. Hands-on experience with relational databases such as PostgreSQL and Oracle. Practical knowledge of AWS Cloud services and infrastructure. Experience with CI/CD pipelines, especially Jenkins. Why join us …
building and running cloud platforms and leading teams that sit at the intersection of infrastructure and product. Great expertise in AWS best practices, infrastructure-as-code (Terraform), and monitoring (Datadog). Strong experience in AWS utilizing Lambda, ECS, SQS, API Gateway etc. Any programming language experience such as Python, Golang, TypeScript, Node.js etc. If this sounds like an interesting opportunity to …
London (City of London), South East England, United Kingdom
Harvey Nash
building and running cloud platforms and leading teams that sit at the intersection of infrastructure and product. Great expertise in AWS best practices, infrastructure-as-code (Terraform), and monitoring (Datadog). Strong experience in AWS utilizing Lambda, ECS, SQS, API Gateway etc. Any programming language experience such as Python, Golang, TypeScript, Node.js etc. If this sounds like an interesting opportunity to …
Reigate, Surrey, England, United Kingdom Hybrid / WFH Options
Client Server Ltd
in Azure (will also consider AWS or GCP experience). You have a deep understanding of cloud infrastructure and services including best practices around monitoring, scaling and security tools e.g. Datadog. You have strong scripting skills with PowerShell (or Python). You have a good knowledge of basic networking, TCP/IP. You have a good understanding of IaC; they use Pulumi …
A track record in mentoring other engineers, leading cross-team projects without authority, and driving design and technology decisions. Technologies we use (nice to have experience): Monitoring and alerting: Datadog, Falcon LogScale (formerly Humio); Database management systems: PostgreSQL, ClickHouse; Deployment tools: Flux, Helm, Kustomize; Frontend frameworks: React, Angular; Infrastructure as code: Terraform, Terragrunt; Cloud provider: AWS; Event streaming platform: Kafka …
Master's or PhD in Computer Science, Physics, Engineering or Math. Knowledge of IP networking, VPNs, DNS, load balancing and firewalls. Experience with monitoring and log aggregation frameworks like CloudWatch, Datadog, Splunk, OpenTracing, AWS X-Ray, and APM tools. Experience with revision control source code repositories. Experience with development and automated testing. Understanding of microservices and distributed application architecture. Strong verbal …
ClaimCenter and other systems, including PAS, document management systems, and external data providers. Platform Monitoring: Determine requirements for specific alerts, set up alerts for various events and thresholds, utilise Datadog logs and dashboards for error analysis, and track DXC downtime while communicating updates to users. Platform Updates: Conduct a 3-way merge of updated code, validate new versions, and implement …
North West London, London, United Kingdom Hybrid / WFH Options
ByteHire
of infrastructure setup and management. Exposure to designing or building distributed systems, preferably in a cloud environment. Company Tech Stack: PHP, Laravel, ReactJS, TypeScript, Inertia, WordPress, MySQL, Redis, Elasticsearch, Datadog, AWS, Terraform, Docker. Benefits: Hybrid working 1-2 days per week in the London office. Collaborate directly with the founding team and take ownership of product features. Be part of …
as-code: Terraform, Pulumi; Data Management and Orchestration: Airflow, dbt; Databases and Data Warehouses: SQL Server, PostgreSQL, MongoDB, Qdrant, Pinecone; GenAI: OpenAI APIs, HuggingFace, LangChain, Talk-to-data; Monitoring: Datadog. About You: We are looking for someone who can wear two hats - the data architect and the strategic business consultant - so you'll need to show both advanced technical acumen …
reliable, secure, and easy to use. You've led or contributed to modern cloud-native environments and are fluent in AWS best practices, infrastructure-as-code (Terraform), and monitoring (Datadog). You thrive in an environment where you're empowered to define direction, drive delivery, and represent your team's work to the wider business. You want to be part …
London, South East, England, United Kingdom Hybrid / WFH Options
Eligo Recruitment
and capacity planning for mission-critical systems. Develop secure backup, recovery, and disaster recovery procedures. Explore multi-tenant and sharded architectures to support growth. Implement monitoring strategies using Grafana, Datadog, and CI/CD integrations. Champion database best practices, mentor teams, and standardize tooling and automation. What You’ll Bring: Extensive experience managing cloud-hosted PostgreSQL at scale. Proficiency in …
React Testing Library, Cypress, Turborepo. Backend/Infrastructure: PHP 8 (Symfony), Kotlin, MySQL, RabbitMQ, AWS, Docker, Kubernetes. Tooling: ArgoCD and GitHub Actions; GitHub, Jira & Confluence, for our code & product management processes; Datadog & Sentry, for debugging and reporting; Figma, for our design process. What we're looking for: You have experience working with product owners to break down business requirements into deliverable tasks and …
of building and operating systems at scale. Advanced knowledge of configuration management systems, such as: Puppet, Chef, Ansible, or related systems. Significant experience of monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar). Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your …
as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, Datadog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. Collaborate closely with Salesforce DevOps teams to ensure … scripting in PowerShell, Python, or Bash. Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards. Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, Datadog, or Dynatrace. Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments. Exposure to AI/ML platforms, real-time data pipelines, and basic networking …
London, South East, England, United Kingdom Hybrid / WFH Options
Eligo Recruitment
Bring: Strong experience with GCP, Terraform, and Infrastructure-as-Code. Deep knowledge of cloud networking, security automation, and compliance standards. Proficiency in CI/CD pipelines, monitoring tools (Grafana, Datadog), and scripting. A collaborative mindset with excellent communication and mentoring skills. Why Join? Shape a next-gen AI infrastructure with autonomy and purpose. Hybrid working with regular meetups in our …
Lincolnshire, England, United Kingdom Hybrid / WFH Options
Akkodis
be helping to design and manage ETL/ELT pipelines, making sure data flows smoothly and reliably across the business. You'll also get hands-on with tools like Datadog or CloudWatch to monitor performance and keep things secure and efficient. If you enjoy writing clean Python code, working with SQL, and collaborating with analysts and engineers, this could be …
East Midlands, United Kingdom Hybrid / WFH Options
Akkodis
be helping to design and manage ETL/ELT pipelines, making sure data flows smoothly and reliably across the business. You'll also get hands-on with tools like Datadog or CloudWatch to monitor performance and keep things secure and efficient. If you enjoy writing clean Python code, working with SQL, and collaborating with analysts and engineers, this could be …
customer focused and continuously suggest how the backend can provide the best Customer Experience. A passion for crypto and the transformations it enables. We use Kotlin, PostgreSQL, Kafka, Redis, Datadog, Amplitude, Grafana, BigQuery, Apache Spark and more. COMPENSATION & PERKS: Unlimited vacation policy; work hard and take time when you need it. Unlimited learning policy; order the technical resources you need or …
yourself on consistent high levels of test coverage, strong technical documentation and effective monitoring. Preferably exposure to technologies such as Kafka, PostgreSQL, Redis. We use Kotlin, PostgreSQL, Kafka, Redis, Datadog, Amplitude, Grafana, BigQuery, Apache Spark and more. A passion for crypto and the transformations it enables. COMPENSATION & PERKS: Full-time salary based on experience and meaningful equity in an industry-leading …
levels, from application to network to host. PREFERRED QUALIFICATIONS: Exposure to cloud computing concepts and design considerations. Experience in a production environment. Experience of monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar). Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your …