standards and conventions Deep desire and practice maintaining uniformity and cleanliness in large codebases and infrastructure projects Desirable Skills & Experience Hands-on experience monitoring large production infrastructure using DataDog and CloudWatch Previously owned end-to-end responsibility for a service, including development and production support Experience using configuration management tools such as Chef, Ansible or Puppet Proficient writing code More ❯
. Preferred Qualifications Experience in hybrid cloud environments and integration with on-premise systems. Background in DevOps, SRE, or Infrastructure Engineering. Knowledge of monitoring/logging tools (e.g., CloudWatch, Datadog, Prometheus, ELK). Experience with enterprise security and compliance frameworks (e.g., ISO 27001, SOC 2, GDPR). Familiarity with cost modeling and optimization strategies in AWS. More ❯
and other relevant tools. Security Best Practices: IAM, MFA, data encryption, firewall configurations. Programming/Scripting: Python, Terraform, or similar languages. Event-Driven Architectures: Kafka. Monitoring and Logging: Datadog, ELK Stack, Prometheus, etc. Experience in agile methodologies and DevOps practices. Location: Hybrid. Office located in London. (Hayes area). Office presence required: Yes. Frequency: 2-3 times a week at More ❯
Proficiency in scripting and automation using Python, Bash, or Go. Experience with Infrastructure as Code (Terraform, CloudFormation, or Ansible). Familiarity with monitoring, logging, and observability tools (Prometheus, Grafana, Datadog, ELK, etc.). Strong understanding of networking concepts (VPC, Load Balancers, DNS, Firewalls). Experience with DevOps methodologies, CI/CD pipelines, and GitOps practices. Experience with high-performance and More ❯
ARM templates) Proficiency with container technologies like Docker and orchestration (Kubernetes, ECS, AKS, etc.) Strong scripting skills in Python, Bash, or PowerShell Experience with monitoring and logging tools (CloudWatch, Datadog, Prometheus, ELK stack, etc.) Familiarity with CI/CD tools (GitLab CI, Jenkins, GitHub Actions, etc.) The successful candidate must hold and maintain a high level of Security Clearance. Preferred More ❯
as GitLab, GitHub Actions, or CircleCI Strong testing capabilities using JUnit, RestAssured, or similar frameworks Proactive with monitoring, observability, and system health Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty Familiarity with Python scripting Experience with Kubernetes and deployment tools such as Helm Why Join H&B Tech? Help define the future of digital health & wellness More ❯
needed About You 5+ years' experience in Site Reliability Engineer roles Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog) Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc) Extensive experience with cloud automation and infrastructure-as-code More ❯
/CD tools such as GitLab CI, CircleCI, GitHub Actions, and GitOps using ArgoCD, FluxCD Troubleshooting and debugging applications using Observability tooling across microservices and serverless applications such as Splunk, DataDog Managing ephemeral secrets and credentials using HashiCorp Vault Managing least privileged access to cloud resources using TPAM solutions such as HashiCorp Boundary Bonus Points for experience with: Production experience architecting More ❯
CircleCI also welcome Proficiency in testing frameworks like JUnit and RestAssured A passion for monitoring, observability, and maintaining resilient systems Desirable Skills: Experience with monitoring and alerting tools like Datadog, Prometheus, Grafana, or PagerDuty Exposure to Python scripting Familiarity with deployment platforms such as Kubernetes and tools like Helm Why Join H&B Tech? Be part of a fast-moving More ❯
in financial services with an understanding of financial market data would be helpful. Level 3 production support. ISTQB Foundation in Software Testing Familiarity with technologies such as (Zephyr, JIRA, Datadog, JavaScript/Typescript, Selenium, JMeter/Gatling or other non-functional testing tools) What We Value We love solving problems, communicating clearly and turning business challenges into technical triumphs! LSEG More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
optimisation Nice to Have Experience with ML tooling (MLflow, Kubeflow) Knowledge of FastAPI, Databricks, or Snowflake Exposure to SRE practices or cloud security certifications Familiarity with Prometheus, Grafana, or Datadog Interested? If you want to be part of a world-class AI team at an early stage, where your infrastructure decisions will directly shape the company's success, apply today More ❯
roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools (New Relic, Splunk, DataDog) Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc) Extensive experience with cloud automation and infrastructure-as-code More ❯
Actions, GitLab, Argo CD, Azure DevOps). Experience with DevOps processes and practices, using tools such as ELK, Grafana, Prometheus, Datadog or AWS CloudWatch to monitor systems in production. Strong problem-solving skills and the ability to debug and optimize code. Clear, concise technical documentation; creating and maintaining runbooks and end user documentation. Comfortable working in More ❯
expand documentation for system behavior, runbooks, and escalation flows. Tech Stack & Tooling Languages: Python (primary), Bash, T-SQL OS/Infrastructure: Linux, Windows, Docker, AWS Cloud services Monitoring & Alerting: DataDog, Grafana, custom tooling Automation/CI/CD: Git, TeamCity, Ansible, Terraform (optional) Databases: MS SQL Server, Snowflake General Any other duties commensurate with the post holder's position and More ❯
frontend architecture (e.g., Module Federation or Single-SPA). Experience with cloud-native DevOps tooling: Docker, Kubernetes, AWS/GCP deployments. Proficiency in analytics and observability tools like Sentry, Datadog, or LogRocket. Soft Skills Strategic thinker with strong problem-solving and decision-making skills. Ability to work in fast-paced, agile environments with cross-functional teams. Clear communication and documentation More ❯
factor principles and fit into our microservices architecture Cloud-related tools, services, and distributed system observability to support these applications, such as Docker, Kubernetes, ElasticSearch, log management systems, and Datadog APM, to name but a few API specifications, conforming to the OpenAPI (Swagger) standard, provide a clean boundary both externally between our customers and our product, and internally between our More ❯
or similar GitHub Actions, CircleCI) Understands the importance of monitoring and is proactive in resolving critical issues. Fluent in testing frameworks JUnit, RestAssured Desirable: Exposure to monitoring and alerting platforms: Datadog, PagerDuty, Grafana, Prometheus Exposure to Python scripting Exposure to deployment platforms like Kubernetes and tools like Helm. Ready to shape the future of health and wellness through tech? Apply now More ❯
Desirable Technical Skills Operating Systems: Ubuntu (18-22) Middleware: Apache Tomcat Databases: Microsoft SQL Server (T-SQL) Scripting: Bash Cloud Platforms: Amazon Web Services (AWS) Containers: Docker Monitoring & Logging: Datadog General Skills & Attributes Strong problem-solving abilities with a strategic mindset Self-starter who works independently with minimal guidance Effective communicator able to simplify complex information for diverse audiences Proven More ❯
/ESI). Previous usage of workflow tools such as JIRA and/or Trello. Performance and Load Testing (JMeter/BlazeMeter). Maven Build tool. Swagger. Monitoring & Alerting (Datadog, New Relic, Elasticsearch, CloudWatch). Caching (Akamai, Fastly, CloudFront). Exposure to Adobe Experience Manager and/or Next.js. Cypress. The nature of our industry means life at the Telegraph More ❯
Bash, or PowerShell. o Contribute to Infrastructure-as-Code (Terraform, CloudFormation, ARM Templates) for database provisioning. Monitoring & Troubleshooting: o Set up alerts and dashboards using Observability tools like NewRelic, Datadog etc o Handle incidents, perform root cause analysis, and contribute to ongoing operational improvements. Security & Compliance Support: o Enforce access controls, encryption, and secure connections. o Contribute to audits and More ❯
building robust and efficient backend solutions. Strong hands-on experience with Terraform for infrastructure as code, enabling scalable and reliable systems. Experience with monitoring and observability tools, such as Datadog or Prometheus. Familiarity with event-driven systems, particularly Kafka and/or RabbitMQ. Deep understanding of messaging and queuing systems, including design patterns for reliability, retries, and scaling. Strong understanding More ❯