Oldham, Greater Manchester, North West, United Kingdom
Innovative Technology
CI/CD systems (GitHub Actions, GitLab CI, Jenkins, etc.) Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation) Knowledge of observability tools (Prometheus, Grafana, ELK stack, Datadog, etc.). Solid grasp of Linux systems and networking fundamentals Strong problem-solving and debugging skills Your Package & Perks: A competitive salary Flexible working hours 32 days holiday (pro rata More ❯
such as Azure, AWS or GCP Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as Datadog, Prometheus, Grafana, or similar. Proven track record of maintaining highly available and performant production environments. Ability to identify and implement effective mitigation strategies and operational playbooks. Useful/Bonus Skills More ❯
Edinburgh, Midlothian, United Kingdom Hybrid / WFH Options
Aberdeen
tech talks to share knowledge and promote adoption of tools and practices. About the Candidate The ideal candidate will possess the following: Experience with observability tools (e.g., Grafana, Prometheus, Datadog). Background in DevOps, SRE, or platform engineering with a security-first mindset. Strong programming skills in languages such as .NET, JavaScript, Python or similar. Experience with CI/CD More ❯
as ECS, Kubernetes, and Docker. Experience in observability such as white-box and black-box monitoring, service level objective alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others. Preferred qualifications, capabilities, and skills Knowledge of using GenAI tools such as Copilot or Windsurf and how to use them as Code Assistants Ability to expand and More ❯
configuration management tools (e.g., Ansible, Puppet, Chef). Knowledge of infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation). Experience with monitoring and logging tools (e.g., Prometheus, ELK Stack, Datadog). Passion for continuous learning and professional development. ABOUT BUSINESS UNIT IBM Consulting is IBM's consulting and global professional services business, with market leading capabilities in business and technology More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
optimisation Nice to Have Experience with ML tooling (MLflow, Kubeflow) Knowledge of FastAPI, Databricks, or Snowflake Exposure to SRE practices or cloud security certifications Familiarity with Prometheus, Grafana, or Datadog Interested? If you want to be part of a world-class AI team at an early stage (where your infrastructure decisions will directly shape the company's success), apply today More ❯
with modern CI/CD systems and automation pipelines Hands-on experience with infrastructure-as-code frameworks, e.g. Terraform An understanding of monitoring, logging, and alerting practices, using tools like Datadog A curious mindset, always looking for better ways to solve problems and improve developer experience The confidence to lead technical direction and the humility to learn and adapt along the More ❯
Oldham, Greater Manchester, North West, United Kingdom
Cathcart Technology
with modern CI/CD systems and automation pipelines Hands-on experience with infrastructure-as-code frameworks, e.g. Terraform An understanding of monitoring, logging, and alerting practices, using tools like Datadog A curious mindset, always looking for better ways to solve problems and improve developer experience The confidence to lead technical direction and the humility to learn and adapt along the More ❯
frontend architecture (e.g., Module Federation or Single-SPA). Experience with cloud-native DevOps tooling: Docker, Kubernetes, AWS/GCP deployments. Proficiency in analytics and observability tools like Sentry, Datadog, or LogRocket. Soft Skills Strategic thinker with strong problem-solving and decision-making skills. Ability to work in fast-paced, agile environments with cross-functional teams. Clear communication and documentation More ❯
tuning. Lead technical triage and root cause analysis for infrastructure-related issues Develop and deploy applications using Docker and AWS Fargate Use CloudWatch, CloudTrail, and third-party tools like Datadog for performance and cost efficiency Configure AWS networking (VPCs, TGWs), enforce governance via AWS Config and tagging policies Maintain architecture diagrams, SOPs, and collaborate across engineering and product teams Should More ❯
Experience of using Git or similar to track changes Experience of both the full .NET Framework and .NET Core Experience of using observability systems such as Elastic APM or Datadog to track and diagnose issues in production A solid understanding of security principles and secure coding including OWASP Top 10 Nice to haves: o Experience in VoIP (SIP and RTP More ❯
re looking for someone with deep expertise in: o Infrastructure as Code: Terraform, CloudFormation o Security best practices: IAM, KMS, encryption in transit/at rest, DevSecOps o Monitoring & observability: Datadog, Prometheus, Grafana, ELK, or similar What You Bring o 6+ years in DevOps or platform engineering, with experience in a technical lead role. o Proven experience designing and operating cloud More ❯
Out in Science, Technology, Engineering, and Mathematics
technical, ambiguous domains. Strong knowledge of REST APIs, distributed system design, and performance optimization. Experience with both SQL and NoSQL data stores, caching layers, and observability tooling (e.g., Prometheus, Datadog). Nice to have: Experience deploying or integrating LLMs or NLP models in production systems. Comfortable balancing short-term execution with long-term architectural thinking. Passion for building highly More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Clarke Recruitment Solutions
needed) Working with Docker and container orchestration (ECS/EKS, Helm) Streamlining and optimising CI/CD pipelines (GitHub Actions/GitLab CI) Setting up and managing observability with Datadog, CloudWatch, Prometheus/Grafana Automating deployments and improving recovery, redundancy, and capacity planning Supporting Linux environments (Ubuntu/CentOS) Getting involved in incident response and helping us prevent problems before … and automation tools Hands-on with containers and orchestration (Docker, ECS/EKS, Helm) Experience with CI/CD pipelines (GitHub Actions or GitLab CI) Familiarity with monitoring tools (Datadog, CloudWatch, Prometheus, Grafana) Confident scripting in Python and Bash Strong communication skills and collaborative mindset Nice to have (not essential): Experience with Azure or GCP Knowledge of networking (VPC Peering … of your game A collaborative, supportive team environment where your input matters Tech stack you’ll work with AWS | Terraform | Ansible | Docker | ECS/EKS | GitHub Actions | GitLab CI | Datadog | CloudWatch | Prometheus | Grafana | Linux | Python | Bash If you’re passionate about automation, thrive on solving complex problems, and want your work to make a genuine difference when it matters most More ❯
Reigate, Surrey, England, United Kingdom Hybrid / WFH Options
Client Server Ltd
in Azure (will also consider AWS or GCP experience) You have a deep understanding of cloud infrastructure and services including best practices around monitoring, scaling and security tools e.g. Datadog You have strong scripting skills with PowerShell (or Python) You have a good knowledge of basic networking, TCP/IP You have a good understanding of IaC, they use Pulumi More ❯
ClaimCenter and other systems, including PAS, document management systems, and external data providers. Platform Monitoring : Determine requirements for specific alerts, set up alerts for various events and thresholds, utilise Datadog logs and dashboards for error analysis, and track DXC downtime while communicating updates to users. Platform Updates : Conduct a 3-way merge of updated code, validate new versions, and implement More ❯
level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles, please click "Apply More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment
level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH(phone number removed) To apply for this role or to be considered for further roles More ❯
Employment Type: Permanent
Salary: £80000 - £90000/annum 38 Days Holiday, Healthcare, Pension
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles, please click "Apply More ❯