monitoring). Strong experience in distributed system design, development, and deployment using Agile/DevOps practices. Experience with CI/CD pipelines (GitHub Actions or similar). Experience implementing monitoring and observability using Prometheus, Grafana, or Databricks-native solutions. Good communication skills, excellent teamwork experience, and the ability to mentor and develop more junior developers, including participating in constructive code reviews. Preferred Skills: Experience ...
Liverpool, England, United Kingdom Hybrid / WFH Options
Love2shop
Cover on-call rotation for production support (1 week out of 6), as well as making improvements to: • Deployment automation and release management processes • Application and infrastructure monitoring and observability • Security scanning and vulnerability management in pipelines • Performance optimization and capacity planning • Development team productivity through tooling and automation What we would like from you • Strong experience with CI/ ...
Welwyn Garden City, England, United Kingdom Hybrid / WFH Options
PayPoint plc
Cover on-call rotation for production support (1 week out of 6), as well as making improvements to: • Deployment automation and release management processes • Application and infrastructure monitoring and observability • Security scanning and vulnerability management in pipelines • Performance optimization and capacity planning • Development team productivity through tooling and automation What we would like from you • Strong experience with CI/ ...
technical direction. Desirable Qualifications: Experience in fintech, payments, or enterprise SaaS platforms. Exposure to event-driven architecture (Kafka, RabbitMQ). Familiarity with infrastructure-as-code tools (Terraform, CloudFormation). Understanding of observability tools (Prometheus, Grafana, ELK stack). Apply now and Vibe with Us! We are looking for new employees who will embrace the Edenred adventure with the ...
City Of Westminster, London, United Kingdom Hybrid / WFH Options
Additional Resources
high-volume processing. Deploying and managing containerised workloads through Kubernetes, Helm, and Docker. Automating infrastructure using Infrastructure-as-Code tools such as Terraform and Ansible. Ensuring system reliability through observability, monitoring, and proactive issue resolution. Collaborating with cross-functional teams to align data solutions with wider business needs. Supporting the continuous improvement of processes, deployment, and data quality standards. What ...
Westminster, City of Westminster, Greater London, United Kingdom Hybrid / WFH Options
Additional Resources
high-volume processing. Deploying and managing containerised workloads through Kubernetes, Helm, and Docker. Automating infrastructure using Infrastructure-as-Code tools such as Terraform and Ansible. Ensuring system reliability through observability, monitoring, and proactive issue resolution. Collaborating with cross-functional teams to align data solutions with wider business needs. Supporting the continuous improvement of processes, deployment, and data quality standards. What ...
London, South East, England, United Kingdom Hybrid / WFH Options
Additional Resources Ltd
high-volume processing. Deploying and managing containerised workloads through Kubernetes, Helm, and Docker. Automating infrastructure using Infrastructure-as-Code tools such as Terraform and Ansible. Ensuring system reliability through observability, monitoring, and proactive issue resolution. Collaborating with cross-functional teams to align data solutions with wider business needs. Supporting the continuous improvement of processes, deployment, and data quality standards. What ...
deployment pipelines. Excellent problem-solving skills and ability to debug complex systems. Strong communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies and secret management. Exposure to Agile ...
Edinburgh, Midlothian, United Kingdom Hybrid / WFH Options
Aberdeen
internal workshops, brown bags, or tech talks to share knowledge and promote adoption of tools and practices. About the Candidate: The ideal candidate will possess the following: Experience with observability tools (e.g., Grafana, Prometheus, Datadog). Background in DevOps, SRE, or platform engineering with a security-first mindset. Strong programming skills in languages such as .NET, JavaScript, Python, or similar. ...
built on Solace PubSub+, ensuring high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging ...
to architect secure, performant, and highly available cloud solutions. Proficiency with monitoring and log analytics tools such as AWS CloudWatch, ELK Stack, Prometheus, Datadog, or New Relic, to maintain observability and ensure operational excellence. Demonstrated leadership skills in managing complex, high-pressure situations and guiding teams through incident resolution. Exceptional communication and presentation skills, with proven experience engaging with senior ...
City of London, London, United Kingdom Hybrid / WFH Options
Advanced Resource Managers
to architect secure, performant, and highly available cloud solutions. Proficiency with monitoring and log analytics tools such as AWS CloudWatch, ELK Stack, Prometheus, Datadog, or New Relic, to maintain observability and ensure operational excellence. Demonstrated leadership skills in managing complex, high-pressure situations and guiding teams through incident resolution. Exceptional communication and presentation skills, with proven experience engaging with senior ...
and core infrastructure - from development and deployment to monitoring and continuous improvement. Build and maintain robust CI/CD pipelines for both software and ML workflows. Ensure reliability, scalability, observability, and security of production systems and ML infrastructure. Automate deployment, orchestration, and environment management using modern DevOps tooling. Collaborate closely with software engineers, data scientists, and product teams to bring ...
streaming platforms. Security & Compliance: Identity and access management (IAM), secure design principles, awareness of regulatory frameworks (e.g., GDPR, HIPAA, SOX, SOC2). Tools & Platforms: Familiarity with enterprise platforms, monitoring and observability tools, API gateways and service meshes. Location: COL Work-at-Home. Language Requirements: English (Required). Time Type: Full time. 2025-10-31. If you are a California resident, by submitting your information, you ...
AI-enhanced automation. Build and maintain CI/CD (Jenkins, GitLab CI, GitHub Actions, ArgoCD). Cloud infrastructure (AWS, Azure, GCP), container orchestration (Kubernetes, Docker). Logging, monitoring, and observability (Prometheus, Grafana, ELK/EFK), including AI-driven log analysis and incident prediction. Experience supporting MLOps: deploying ML workflows, ensuring model traceability and compliance. Use of AI assistants and workflow ...
Mansfield, England, United Kingdom Hybrid / WFH Options
Future Talent Group
using Terraform. Implement and optimise CI/CD pipelines using GitHub Actions, Docker, and GitOps practices. Deploy, orchestrate, and manage Kubernetes (AKS/Container Apps) workloads. Configure monitoring and observability with Azure Monitor, Application Insights, Log Analytics, and OpenTelemetry. Partner with software engineering and infrastructure teams to drive DevOps best practices across the organisation. Manage security and compliance in Azure ...
real-time services. Automate workflows and deployments using Terraform and CI/CD tools (GitHub Actions, CircleCI, or Jenkins). Support and optimize containerized environments (Docker, Kubernetes). Build observability and monitoring solutions with Prometheus, Grafana, and ELK. Manage and tune messaging systems (Kafka, RabbitMQ) for low-latency event handling. Enhance reliability, scalability, and infrastructure security through DevSecOps practices. ...
City of London, London, United Kingdom Hybrid / WFH Options
Intellect Group
real-time services. Automate workflows and deployments using Terraform and CI/CD tools (GitHub Actions, CircleCI, or Jenkins). Support and optimize containerized environments (Docker, Kubernetes). Build observability and monitoring solutions with Prometheus, Grafana, and ELK. Manage and tune messaging systems (Kafka, RabbitMQ) for low-latency event handling. Enhance reliability, scalability, and infrastructure security through DevSecOps practices. ...
City of London, London, United Kingdom Hybrid / WFH Options
Signify Technology
understanding of containerisation (Docker, ECS, or Kubernetes). Strong scripting skills in Python, Bash, or similar. Familiarity with Linux administration, networking, and system security. Experience with monitoring, logging, and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog). Desirable Skills: Exposure to infrastructure security best practices (e.g., CIS Benchmarks, AWS Well-Architected Framework). Knowledge of configuration management (Ansible, Chef ...
Leeds, England, United Kingdom Hybrid / WFH Options
Fruition Group
DynamoDB, S3, IAM, and RDS. Understanding of DevOps practices, including CI/CD pipelines and automation. Strong knowledge of cloud security best practices, IAM policies, and networking. Experience with observability tools like CloudWatch, Prometheus, or Grafana. Preferred: Experience mentoring junior team members and promoting DevOps practices. Familiarity with multi-cloud environments (e.g., GCP, Azure). Knowledge of database performance optimisation. ...
FinOps practices. Experience with infrastructure-as-code tools (e.g., Terraform, Helm, Ansible). Familiarity with CI/CD pipelines and automation (e.g., GitHub Actions, ArgoCD, Jenkins). Experience with observability tools like Prometheus and Grafana. Knowledge of Linux systems administration and networking fundamentals, and experience with policy-as-code. Passion for platform engineering, developer experience, and site reliability. UAL is a ...