London, Bloomsbury, United Kingdom Hybrid / WFH Options
IntaPeople
or AWS CodePipeline. Support and train technical staff in the upskilling necessary for ongoing operations. Monitor and ensure system reliability, availability, and performance using tools like CloudWatch, Prometheus, Icinga2, Grafana, and Datadog. Automate deployment, scaling, and management of containerized applications using Docker and Kubernetes. Desirable skills: Travis CI; monitoring (Grafana, Icinga, Prometheus); RabbitMQ/AMQP; working knowledge of security best practices
as GitLab, GitHub Actions, or CircleCI. Strong testing capabilities using JUnit, RestAssured, or similar frameworks. Proactive with monitoring, observability, and system health. Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty. Familiarity with Python scripting. Experience with Kubernetes and deployment tools such as Helm. Why Join H&B Tech? Help define the future of digital health & wellness
codebase, currently in Java (11+), and ideally Spring Boot. You will be working with SQL and large SQL databases, Docker, Kubernetes, OpenAPI specifications, and distributed system observability tooling (e.g., Datadog APM). Infrastructure automation is primarily owned by the infrastructure team, but you will be a consumer of their work; familiarity with AWS, Terraform and Docker is beneficial. Testing approaches
configuration management tools (e.g., Ansible, Puppet, Chef). Knowledge of infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation). Experience with monitoring and logging tools (e.g., Prometheus, ELK Stack, Datadog). Passion for continuous learning and professional development. IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive
London, South East, England, United Kingdom Hybrid / WFH Options
Noir
and maintaining CI/CD pipelines, and be confident scripting in Python, C# or similar scripting languages. You'll also be comfortable working with monitoring and performance tools like Datadog or Prometheus, and ideally, you'll have worked in a fast-moving SaaS or product-led business before. Bonus points if you've helped shape DevOps roadmaps, mentored others, or
are JVM based with the majority running on Java 21. We're in the process of moving our backend services to Spring Boot. We've invested heavily in our Datadog integration to bring world-class observability and monitoring to our systems. We've recently moved to GitLab and are currently building out our next generation of automated deployment pipelines. We
Mesh (e.g., Istio) and GitOps (e.g., ArgoCD), with a focus on streamlined deployments and managing complex service-oriented architectures. Experienced in leveraging observability tools, such as Honeycomb (OpenTelemetry) and Datadog, to support data-driven decisions across the wider engineering team. Comprehensive understanding of networking in cloud environments, including VPN solutions, efficient network configuration, load balancing, and troubleshooting. Extensive experience designing
building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries, and scaling. Strong understanding
Experience of using Git or similar to track changes. Experience of both the full .NET Framework and .NET Core. Experience of using observability systems such as Elastic APM or Datadog to track and diagnose issues in production. A solid understanding of security principles and secure coding, including the OWASP Top 10. Nice to haves: Experience in VoIP (SIP and RTP
tuning. Lead technical triage and root cause analysis for infrastructure-related issues. Develop and deploy applications using Docker and AWS Fargate. Use CloudWatch, CloudTrail, and third-party tools like Datadog for performance and cost efficiency. Configure AWS networking (VPCs, TGWs) and enforce governance via AWS Config and tagging policies. Maintain architecture diagrams and SOPs, and collaborate across engineering and product teams. Should
Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8s, Terraform and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics, logs, traces and APM. Leadership & Global Operations: Proven success leading multi-regional or global technical teams with direct management of managers. Demonstrated ability to build
building and running cloud platforms and leading teams that sit at the intersection of infrastructure and product. Great expertise in AWS best practices, infrastructure-as-code (Terraform), and monitoring (Datadog). Strong experience in AWS utilizing Lambda, ECS, SQS, API Gateway, etc. Any programming language experience, such as Python, Golang, TypeScript, Node.js, etc. If this sounds like an interesting opportunity to
London (City of London), South East England, United Kingdom
Harvey Nash
building and running cloud platforms and leading teams that sit at the intersection of infrastructure and product. Great expertise in AWS best practices, infrastructure-as-code (Terraform), and monitoring (Datadog). Strong experience in AWS utilizing Lambda, ECS, SQS, API Gateway, etc. Any programming language experience, such as Python, Golang, TypeScript, Node.js, etc. If this sounds like an interesting opportunity to
A track record in mentoring other engineers, leading cross-team projects without authority, and driving design and technology decisions. Technologies we use (nice to have experience): Monitoring and alerting: Datadog, Falcon LogScale (formerly Humio). Database management systems: PostgreSQL, ClickHouse. Deployment tools: Flux, Helm, Kustomize. Frontend frameworks: React, Angular. Infrastructure as code: Terraform, Terragrunt. Cloud provider: AWS. Event streaming platform: Kafka
Master's or PhD in Computer Science, Physics, Engineering or Math. Knowledge of IP networking, VPNs, DNS, load balancing and firewalls. Experience with monitoring and log aggregation frameworks like CloudWatch, Datadog, Splunk, OpenTracing, AWS X-Ray, and APM tools. Experience with revision control source code repositories. Experience with development and automated testing. Understanding of microservices and distributed application architecture. Strong verbal
ClaimCenter and other systems, including PAS, document management systems, and external data providers. Platform Monitoring: Determine requirements for specific alerts, set up alerts for various events and thresholds, utilise Datadog logs and dashboards for error analysis, and track DXC downtime while communicating updates to users. Platform Updates: Conduct a 3-way merge of updated code, validate new versions, and implement