Leeds, England, United Kingdom Hybrid / WFH Options
KnowBe4, Inc
communication skills. Some of the technologies we use: Programming Languages - Python, Ruby, Rust; Infrastructure as Code - Terraform, AWS CDK; Source Code Management and CI/CD - GitLab, Snyk; Observability - Datadog, Airbrake; Cloud-native infrastructure in AWS - ECS, Lambda, Step Functions, SNS/SQS, Transit Gateway, Aurora, DynamoDB, CloudFront, S3, AppSync, API Gateway, and many more. Responsibilities: Work with other Site
software applications and optimizing fleet utilization - Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding) and experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar) - Experience scripting operating system tasks in Bash, Python, etc. and with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar) - Experience operating services
London, England, United Kingdom Hybrid / WFH Options
Canada Life
Azure certifications are a plus. Observability: Designing, implementing and day-to-day use of logging and monitoring tools to capture data for alerting and issue identification and resolution using Datadog, App Insights or similar tools. Designing applications and infrastructure for observability, security, and reliability. Networking & Security: Monitor and enhance network performance, ensuring high levels of security and scalability across all
gating into the SDLC. Ensure pipeline scalability and governance while maintaining developer velocity. Observability & Troubleshooting: Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog). Establish SLOs, SLIs, and error budgets with product and engineering teams. Drive root cause identification using distributed tracing, advanced log analysis, and anomaly detection. Security, Audit & Compliance: Partner with
London, England, United Kingdom Hybrid / WFH Options
Take-Two Interactive
Strong proficiency in AWS, Azure, or GCP, with hands-on experience with Terraform. Experience with configuration management tools like Ansible or Puppet, and observability tools like Prometheus, Grafana, and Datadog, etc. Design, develop, automate testing, and deploy custom tools using languages like Python or C#. Great to Have: Experience with database administration and performance tuning. Experience in optimizing cloud costs
London, England, United Kingdom Hybrid / WFH Options
London Stock Exchange Group
activities for new tools and technologies. Key Tools and Technologies: DevOps Tools: GitLab, Jenkins, Ansible, Terraform; Cloud Technologies: AWS; Containerization and Orchestration: Docker, Kubernetes, EKS, AKS; Monitoring and Logging: Datadog; Scripting Languages: Python, Bash, Shell. Personal Attributes: Proactive and dedicated with a passion for continuous improvement. Diligent with a focus on delivering high-quality work. Ability to work under pressure
using code analysis and security tools such as Coverity and Black Duck. Experience with scripting languages such as Groovy, Bash or Python. Hands-on experience with monitoring tools such as Datadog. Successful Scaled Agile delivery experience. Ability to use a wide range of open-source tools. Experience working with Anthos/GKE Enterprise on AWS (preferred). Proficiency in securing cloud-native applications
GitLab CI). Write clean, production-grade code in Python (Scala is a bonus). Build infrastructure using Terraform, AWS CloudFormation, or SAM. Drive observability across the platform using Datadog or CloudWatch. Actively mentor Data Engineers and Associates, and lead technical discussions and design sessions. Key requirements: Must-Have: Strong experience with AWS services: Glue, Lambda, S3, Athena, Step Functions … operate services in production. Good to Have: Experience with Scala for data applications. Familiarity with serverless/event-driven architectures. Experience designing scalable, low-latency data services. Exposure to Datadog or CloudWatch monitoring tools. Nice to Have: Experience with LLM-powered applications or OpenAI APIs. Professional experience in a similar environment or high-scale system. Key Roles and Responsibilities
network management. Proficiency in scripting and automation (e.g., Bash, Python, PowerShell). Familiarity with CI/CD pipelines and deployment automation. Experience with environment monitoring tools (e.g., Prometheus, Nagios, Datadog). Knowledge of security best practices and compliance standards in IT environments. Excellent problem-solving, troubleshooting, and analytical skills. Strong communication skills, with the ability to collaborate across technical and
London, England, United Kingdom Hybrid / WFH Options
So Energy
across the stack: Frontend conversations: Vue.js, modern component-driven design, API design for seamless integration. Infrastructure: GCP stack, Terraform, Kubernetes, Docker, CI/CD pipelines (GitHub Actions, SonarCloud), observability (Datadog, Grafana). Data: BigQuery, SQL/NoSQL, event-driven architecture, data pipelines. Bring holistic thinking to system design, including scalability, latency, operational excellence, and future-proofing. This role will be … and event-driven architectures. Experience with cloud-native development (GCP preferred; AWS experience relevant). Infrastructure-as-code expertise: Terraform, Kubernetes. Database mastery: PostgreSQL, BigQuery, NoSQL. Observability and monitoring: Datadog, Grafana, logging pipelines. Security best practices: OAuth, SSO, data protection, and secure coding principles. Familiarity with frontend frameworks (React, Vue) and mobile technologies (Ionic, Swift, Android) a plus. Hands-on
teams on renewals, expansions, and QBRs. What You Bring: Strong hands-on experience with cloud platforms (AWS, GCP, Azure) and DevOps tooling. Familiarity with observability stacks like Grafana, Prometheus, Datadog, Splunk, Kibana, etc. Experience with technical integrations (OpenTelemetry, Fluentd, Fluent Bit, Filebeat, etc.). Skilled in troubleshooting Kubernetes and containerised environments. Strong communication skills, able to engage with technical teams and senior
sync, order processing, and internal APIs in a multi-system e-commerce environment. Understanding of architecture patterns: Microservices, SOA, Hexagonal, Modular Monolith. Monitoring & Observability: Grafana, Prometheus, CloudWatch, New Relic, Datadog, etc. Solid grasp of AI trends in software development, particularly in using GPT tools and agentic systems. Education: Mathematics or Computer Science degree (or equivalent experience). Desirable Skills: Working knowledge
language such as Python or Java. Experience maintaining a cloud-based infrastructure. Familiarity with site reliability principles, concepts, and practices. Knowledge of observability tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, New Relic, CloudWatch, or AppDynamics. Familiarity with containers or common server operating systems like Linux and Windows. Emerging knowledge of software, applications, and technical processes within disciplines like Cloud
solutions. Transform routine tasks through automation, dramatically improving system efficiency. Optimise AWS cloud infrastructure to deliver unmatched performance, scalability, and security. Leverage advanced monitoring tools including Grafana, CloudWatch, and Datadog to ensure peak system performance. Troubleshoot complex cloud platform challenges across development, testing, and production environments. Implement fortress-like security practices across all AWS infrastructure and services. Collaborate with brilliant
automation. Effective communication skills. Ability to work independently and manage multiple priorities. Experience mentoring junior DevOps team members. Preferred Skills & Experience: Experience with modern tooling such as Ansible, Terraform, Datadog, Jenkins, GitLab, ServiceNow. Source control with Git, Bitbucket, Nexus, Artifactory. Strong problem-solving skills and root cause analysis. Networking diagnostics experience. AWS certifications in Developer, SysOps, or DevOps. Ability to
Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Commify Group
with CI/CD pipelines and tools, including Azure DevOps or Jenkins. Strong scripting skills in languages such as PowerShell, Bash, or Python. Previous experience with monitoring tools like Datadog, Azure Monitor, or Prometheus. Excellent communication skills and the ability to work effectively in a team environment. Desirable Qualifications: Experience with hybrid and multi-cloud environments. Knowledge of database management
3+) as an SRE, L3 Support Engineer or DevOps Engineer. Experience in Flux/ArgoCD with Helm and Kustomize. Experience with observability tools, such as Prometheus, Grafana/Loki, Datadog, Elasticsearch. Experience with performance tuning, scalability, reliability and capacity planning on Azure and K8s services. Experience with SaaS in a B2B heavily regulated environment (telco, banking, pharma) using Kubernetes, Docker
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down
observability engineering or a related field. Experience with performance testing tools, such as k6 (preferred), Gatling, LoadRunner, BlazeMeter and JMeter. Experience with monitoring tools, such as Prometheus, InfluxDB, Grafana, Datadog, Dynatrace, New Relic or AppDynamics. Experience with performance tuning, scalability and capacity planning. Experience with SaaS in a B2B heavily regulated environment (telco, banking, pharma) using Kubernetes, Docker. Basic knowledge
City of London, London, United Kingdom Hybrid / WFH Options
Zettafleet
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down
East London, London, United Kingdom Hybrid / WFH Options
Zettafleet
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down
Bury, Greater Manchester, United Kingdom Hybrid / WFH Options
Zettafleet
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down
Leeds, West Yorkshire, United Kingdom Hybrid / WFH Options
Zettafleet
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down
Leigh, Greater Manchester, United Kingdom Hybrid / WFH Options
Zettafleet
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down