Southampton, Hampshire, United Kingdom Hybrid / WFH Options
NICE
experience of Grafana Observability Suite (Loki, Mimir, Tempo). Administration and/or development experience of standard monitoring and automation tools such as Splunk, Datadog, PagerDuty, Rundeck. Familiarity with configuration management tools like Ansible, Puppet, or Chef. Certifications such as AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, or More ❯
GCP. Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as DataDog, Prometheus, Grafana, or similar. Proven track record of maintaining highly-available and performant production environments. Ability to identify and implement effective mitigation strategies and operational More ❯
orchestration (Kubernetes). Understanding of CI/CD pipelines. Familiarity with scripting languages like Python, Bash, or Go. Experience with monitoring tools such as Datadog, Prometheus, Grafana, or ELK stack. Strong problem-solving, communication skills, and ability to work independently or in teams. Additional notes We value diverse backgrounds and More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Smart DCC
Develop automated test suites for data pipelines, ensuring data quality and transformation integrity. Monitoring & Performance Optimization: Monitor data pipelines with tools like Prometheus and Datadog to ensure optimal performance and health. Proactively implement anomaly detection and optimize system performance and resource allocation. Collaborate with cross-functional teams to align DataOps More ❯
and Active Directory. Experience with disaster recovery and redundancy strategies in both cloud and on-premises environments. Proficiency with leading monitoring tools, such as Datadog, Splunk, Prometheus, Grafana, ELK Stack, and New Relic. Programming expertise, especially in systems programming languages (e.g., Java, Kotlin, Scala) and databases (e.g., SQL Server, PostgreSQL More ❯
Dublin, City of Dublin, Republic of Ireland Hybrid / WFH Options
The Recruitment Company
plus) Deep knowledge of Kubernetes, containers, and cloud-native architectures Proficient in scripting and automation (Python, Shell, Go) Comfortable with tools like Terraform, Jenkins, DataDog, Prometheus, Splunk Solid background in networking, Linux systems, and infrastructure as code If you’re passionate about cloud reliability, automation, and solving complex problems at More ❯
an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting Demonstrable knowledge of Observability tools (New Relic, Splunk, DataDog) Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc.) Extensive experience with cloud More ❯
Herndon, Virginia, United States Hybrid / WFH Options
Marathon TS Inc
and Code Pipeline • Experience with Configuration as Code, including AWS SSM, Ansible, PowerShell, or Bash • Monitoring logs and system performance using tools like Grafana, Datadog, and Prometheus • Experience with multiple CI/CD and Agile Development tools, including GitLab, Atlassian, or Jenkins • Experience working within an Agile and version-controlled More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Noir
pipelines, and be confident scripting in Python, C# or similar scripting languages. You’ll also be comfortable working with monitoring and performance tools like Datadog or Prometheus, and ideally, you’ll have worked in a fast-moving SaaS or product-led business before. Bonus points if you’ve helped shape More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Parser
Security Best Practices: IAM, MFA, data encryption, firewall configurations. Programming/Scripting: Python, Terraform, or similar languages. Event-Driven Architectures: Kafka. Monitoring and Logging: Datadog, ELK Stack, Prometheus, etc. Experience in agile methodologies and DevOps practices. Location: Hybrid. Office located in London (Hayes area). Office presence required: Yes. Frequency More ❯
clusters Knowledge of security best practices and the ability to implement security controls at the infrastructure level Experience with monitoring and logging tools like DataDog or Grafana's observability stack (Prometheus, Tempo, Loki, Grafana) Familiarity with the open standard OpenTelemetry Excellent written and verbal communication skills, we're a collaborative More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Merlin Entertainments
Proficient in cloud platforms (AWS, Azure, GCP) and modern DevOps tooling (e.g., Terraform, Jenkins, Kubernetes). Hands-on with observability and monitoring tools (e.g., DataDog, Azure Monitor, AppDynamics). Expert in cyber security practices, identity management, encryption, and secure API development. Familiarity with compliance frameworks such as GDPR and PCI More ❯
secret management tools (e.g., HashiCorp Vault, Azure Key Vault) and SSO/authentication systems (e.g., Okta). Observability: Hands-on experience with platforms like DataDog, Grafana, or Azure Monitor. Networking: Strong understanding of networking principles, DNS, and related technologies. CI/CD: Skilled in creating and maintaining CI/CD More ❯
CI/CD best practices and tools (e.g. GitHub Actions, Jenkins, CodePipeline) Exposure to monitoring and observability tools for ML systems (e.g. Prometheus, Grafana, DataDog, WhyLabs, Evidently, etc.) Experience in building parallelised or distributed model inference pipelines Nice-to-Have Skills Familiarity with feature stores and model registries (e.g. Feast More ❯
with policy as code tools (Kyverno, OPA Gatekeeper) • Familiarity with container security scanning tools • Experience with monitoring and observability platforms such as Dynatrace, Grafana, DataDog, Elastic • Knowledge of SIEM solutions and security monitoring tools. Why Join Us? • Impactful Work - Play a key role in managing a highly secure banking application More ❯
Reigate, Surrey, United Kingdom Hybrid / WFH Options
Willis Towers Watson
cost effectiveness Implement infrastructure as code with Pulumi Support the team in infrastructure and networking related issues Maintain and configure observability platforms such as Datadog Proactively monitor production and other environments to ensure stability, availability, security and integrity Participate in incident response, troubleshooting, and root cause analysis to mitigate and More ❯
code software such as Terraform Experience in continuous integration practices & tools such as GitHub Actions, etc. Experience in observability implementations using tools such as DataDog Analytical and detail-oriented, aligned to the DORA principles Excellent problem solving, communication, and teamwork skills Excellent time management and organisational skills Comfortable dealing with More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
Find the latest job opportunities in AI and tech. RunPod offers GPU cloud computing for AI/ML, providing secure and community cloud options, on-demand and spot pods, and serverless GPU scaling. The flexibility of remote work with an More ❯