York, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
Electus Recruitment Solutions
Engineer, reviewing and reporting on multiple systems performance and trending data. Developing, maintaining, and utilising necessary trending and monitoring tools and scripts using languages and tools such as Perl, Python, and Grafana. Overseeing and leading the review, update, and validation of Standard Operating Procedures and training material for multiple system generations. Leading 24/7 on-call anomaly support and resolution
plus. Capable of writing clean, maintainable and well-tested code. Comfortable working in on-prem and cloud-native environments with an interest in observability, using tools like Prometheus and Grafana to keep services healthy and maintainable. Familiarity with AWS services and how to integrate them into modern applications. A keen focus on quality and security, combining testing and scanning into
City of London, London, United Kingdom Hybrid / WFH Options
M-XR
data models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track
London, South East England, United Kingdom Hybrid / WFH Options
M-XR
data models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track
Slough, South East England, United Kingdom Hybrid / WFH Options
M-XR
data models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track
London (City of London), South East England, United Kingdom Hybrid / WFH Options
M-XR
data models (MongoDB, PostgreSQL) Implement asset storage, retrieval, and management systems (AWS S3) Build job queue management for async ML workflows (SNS, SQS) Set up application monitoring and logging (CloudWatch, Grafana, Prometheus) Implement CI/CD for application deployment (Bitbucket Pipelines) Create API documentation and developer tools What we are looking for 5+ years backend development experience with production applications Track
Reliability Engineer to manage and administer monitoring, observability, and cost optimisation tools alongside infrastructure as code and CI/CD pipelines. This role includes expertise with Dynatrace, Prometheus/Grafana, FinOps practices, Terraform, and GitHub Actions. You will ensure systems are monitored effectively, costs are optimised, and automation workflows are maintained and patched to support scalable and resilient cloud infrastructure. … Key Responsibilities: Manage and maintain monitoring and observability tools such as Dynatrace and Prometheus/Grafana Implement and support FinOps practices to optimise cloud spend and resource utilisation Develop, maintain, and patch infrastructure as code using Terraform Build, maintain, and troubleshoot CI/CD automation pipelines with GitHub Actions Monitor alerting rules, dashboards, and performance metrics to ensure reliability Collaborate … and updates to monitoring agents, dashboards, and automation tools Document operational procedures and system configurations Required Skills & Experience: Proven experience in administering monitoring tools like Dynatrace and Prometheus/Grafana Strong knowledge of cloud cost management principles (FinOps) Proficiency in Terraform and infrastructure as code best practices Hands-on experience with CI/CD tooling such as GitHub Actions Ability
/CD pipelines using GitLab and ArgoCD. Design and operate containerised workloads with EKS, Fargate, and Kubernetes. Manage Kubernetes deployments using Helm charts. Implement observability solutions using OpenTelemetry (OTel), Grafana, and Splunk. Optimise infrastructure with Karpenter for autoscaling and cost efficiency. Ensure robust AWS networking (VPC, Transit Gateway, PrivateLink, Route 53) and enforce security best practices. Drive incident response, monitoring … and performance tuning. Key Technologies: AWS (EKS, Fargate, EC2, S3), Terraform, CloudFormation, GitLab, ArgoCD, Docker, Kubernetes, Helm, Cassandra, OTel, Grafana, Splunk, Karpenter, Python, Bash. Desirable: Experience with Google Cloud Platform (GCP), Apigee Hybrid, and hybrid/multi-cloud environments. Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
GitLab Create Ansible Tower runbooks and playbooks for infrastructure automation Deploy and manage containerized applications using Docker, Kubernetes, or OpenShift Implement privileged access management using CyberArk Utilize Splunk and Grafana for monitoring, logging, and performance analysis Perform SQL and Oracle database administration and troubleshooting Maintain Windows Server and Linux (Red Hat) environments Collaborate with engineering teams, project managers, and suppliers … Oracle database administration Windows Server (2012/2016/2019) and Linux (Red Hat) system administration Desirable Skills: CyberArk, containerization (Docker/Kubernetes/OpenShift), monitoring tools (Splunk/Grafana), web servers (IIS/Tomcat), Active Directory, networking, virtualization (VMware), and Agile methodologies To be considered, please ensure you complete your application on the Computappoint website. Services offered by Computappoint
London (City of London), South East England, United Kingdom
Computappoint
GitLab Create Ansible Tower runbooks and playbooks for infrastructure automation Deploy and manage containerized applications using Docker, Kubernetes, or OpenShift Implement privileged access management using CyberArk Utilize Splunk and Grafana for monitoring, logging, and performance analysis Perform SQL and Oracle database administration and troubleshooting Maintain Windows Server and Linux (Red Hat) environments Collaborate with engineering teams, project managers, and suppliers … Oracle database administration Windows Server (2012/2016/2019) and Linux (Red Hat) system administration Desirable Skills: CyberArk, containerization (Docker/Kubernetes/OpenShift), monitoring tools (Splunk/Grafana), web servers (IIS/Tomcat), Active Directory, networking, virtualization (VMware), and Agile methodologies To be considered, please ensure you complete your application on the Computappoint website. Services offered by Computappoint
and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, DataDog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. Collaborate closely with Salesforce DevOps … Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, DataDog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ML platforms, real-time data pipelines
CI/CD and GitOps practices with GitHub Actions and ArgoCD, including automated testing, vulnerability scanning, and environment promotion workflows. Drive the definition and implementation of observability standards - Prometheus, Grafana, Loki/ELK, Jaeger, Sentry - enabling end-to-end visibility and SLA tracking. Define scalability and reliability patterns (KEDA, HPA, circuit breakers, bulkheads, caching tiers) and ensure resilience of critical … patterns. Proficiency in API and event contract design using OpenAPI and AsyncAPI; knowledge of GraphQL federation is a plus. Strong background in observability, monitoring, and tracing, with Prometheus/Grafana/ELK or equivalent. Familiarity with cloud-agnostic deployments (AWS, GCP, or Azure) and cost/performance trade-offs. Excellent technical leadership, communication, and documentation skills in English (upper intermediate
Manchester, England, United Kingdom Hybrid / WFH Options
Lorien
technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and basic
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Lorien
technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and basic
Warrington, Cheshire, North West England, United Kingdom Hybrid / WFH Options
Lorien
technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and basic
Bolton, Greater Manchester, North West England, United Kingdom Hybrid / WFH Options
Lorien
technologies, with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and basic