Lead Observability Engineer

Senior / Lead Observability & Cloud Infrastructure Engineer

We are seeking an experienced Senior / Lead Observability & Cloud Infrastructure Engineer to join a large-scale digital transformation programme. The successful candidate will play a key role in designing, implementing and enhancing observability capabilities across modern cloud-native platforms, with a particular focus on Dynatrace.

This position requires a strong blend of hands-on observability expertise, AWS infrastructure knowledge, and experience supporting distributed microservice-based applications running in containerised environments.

Key Responsibilities

  • Lead the design, implementation and optimisation of Dynatrace monitoring solutions across complex cloud environments.
  • Configure and maintain dashboards, alerting frameworks and end-to-end observability for customer-facing digital services.
  • Implement Dynatrace instrumentation and monitoring across cloud infrastructure, APIs, microservices, containers and databases.
  • Work closely with engineering, platform and operations teams to improve service visibility and operational resilience.
  • Analyse and troubleshoot performance, availability and reliability issues across distributed systems.
  • Support the adoption of observability best practices and drive continuous improvement initiatives.
  • Design and implement proactive alerting strategies to reduce incident impact and improve service reliability.
  • Document monitoring architectures, operational procedures and technical solutions.

Required Experience

  • Strong hands-on experience implementing and administering Dynatrace in enterprise-scale environments.
  • Experience deploying and configuring Dynatrace monitoring, dashboarding, alerting and integrations.
  • Strong AWS cloud experience, including services such as EC2, ECS, EKS, Lambda, S3, RDS, IAM, VPC and CloudFormation.
  • Strong understanding of cloud-native and microservice-based architectures.
  • Experience working with container technologies including Docker, ECS and/or Kubernetes.
  • Strong troubleshooting and root cause analysis skills within distributed environments.
  • Experience with monitoring and observability tooling such as Dynatrace, CloudWatch and related platforms.
  • Knowledge of Infrastructure as Code and automation tooling including CloudFormation and/or Terraform.
  • Experience working within DevOps, Platform Engineering or Site Reliability Engineering environments.

Desirable Experience

  • Experience within large-scale enterprise or consultancy-led environments.
  • Knowledge of CI/CD pipelines and deployment automation.
  • Experience defining service-level objectives (SLOs), KPIs and operational metrics.
  • Exposure to additional observability or APM platforms such as Datadog, AppDynamics, New Relic or Splunk.

Job Details

Company
TechNET IT Recruitment Ltd
Location
Basingstoke, England, United Kingdom
Posted