Docker and orchestration tools like Kubernetes. Collaborate with software engineering teams to integrate DevOps best practices into development workflows. Implement logging, monitoring, and alerting solutions (e.g., Prometheus, Grafana, ELK, Datadog). Drive automation initiatives to reduce manual effort and improve system reliability. Participate in on-call rotations and incident response processes. Required Skills & Qualifications: Bachelor's degree in Computer Science, Engineering More ❯
OWASP Top 10, and threat modeling. Proficiency in cloud platforms (AWS, Azure, GCP) and associated reliability tools. Hands-on experience with monitoring and logging tools such as Prometheus, Grafana, Datadog, Splunk, or ELK stack. Familiarity with containerization and orchestration tools (Docker, Kubernetes). Strong understanding of distributed systems, fault-tolerant design, and high availability architectures. Experience in root cause analysis More ❯
as Kubernetes or Amazon ECS to streamline application deployment, scaling, and management. Monitoring and Logging: Implement monitoring and logging solutions using tools such as Prometheus, Grafana, ELK Stack, or Datadog to monitor system performance, detect issues, and troubleshoot problems proactively. Security and Compliance: Implement security best practices and compliance standards within DevOps processes and infrastructure, ensuring the security and integrity More ❯
London, Bloomsbury, United Kingdom Hybrid / WFH Options
IntaPeople
or AWS CodePipeline Support and train technical staff in upskilling necessary for ongoing operations Monitor and ensure system reliability, availability, and performance using tools like CloudWatch, Prometheus, Icinga2, Grafana, and Datadog Automate deployment, scaling, and management of containerized applications using Docker and Kubernetes Desirable skills Travis CI Monitoring – Grafana, Icinga, Prometheus RabbitMQ/AMQP Working knowledge of security best practices More ❯
as GitLab, GitHub Actions, or CircleCI Strong testing capabilities using JUnit, RestAssured, or similar frameworks Proactive with monitoring, observability, and system health Desirable Skills: Exposure to monitoring platforms like Datadog, Grafana, Prometheus, or PagerDuty Familiarity with Python scripting Experience with Kubernetes and deployment tools such as Helm Why Join H&B Tech? Help define the future of digital health & wellness More ❯
Stoke-On-Trent, Staffordshire, West Midlands, United Kingdom
Evolution Funding Limited
AWS CDK, Serverless Framework, CloudFormation). Knowledge of microservices and event-driven architectures. Exposure to container technologies (Docker, ECS, EKS, Kubernetes). Experience with monitoring and observability tools (CloudWatch, Datadog, OpenTelemetry). More ❯
including custom applications, integrations, AI & flows. Develop applications and integrations across platforms such as ITSM, ITOM, PA, CSM, SPM, CSDM, CMDB, Employee Centre, Integration Hub, and observability tools (e.g., Datadog, Splunk, AWS CloudWatch, Prometheus, etc.). Ensure seamless interoperability between service operations tooling and cloud-native environments. Technical Leadership & Collaboration: Serve as a technical lead, providing guidance & best practices across service More ❯
bradford, yorkshire and the humber, united kingdom
Mastek
including custom applications, integrations, AI & flows. Develop applications and integrations across platforms such as ITSM, ITOM, PA, CSM, SPM, CSDM, CMDB, Employee Centre, Integration Hub, and observability tools (e.g., Datadog, Splunk, AWS CloudWatch, Prometheus, etc.). Ensure seamless interoperability between service operations tooling and cloud-native environments. Technical Leadership & Collaboration: Serve as a technical lead, providing guidance & best practices across service More ❯
orchestration (ECS, EKS, or Kubernetes) Experience setting up CI/CD pipelines using GitHub Actions or similar tools Familiarity with monitoring and alerting tools (e.g. Prometheus, Grafana, CloudWatch, Sentry, Datadog) A security-first mindset when designing and managing infrastructure Nice to Haves Experience working in regulated or high-trust environments Knowledge of zero-downtime deployment patterns and rollback strategies Exposure More ❯
configuration management tools (e.g., Ansible, Puppet, Chef). Knowledge of infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation). Experience with monitoring and logging tools (e.g., Prometheus, ELK Stack, Datadog). Passion for continuous learning and professional development. ABOUT BUSINESS UNIT IBM Consulting is IBM's consulting and global professional services business, with market leading capabilities in business and technology More ❯
codebase, currently in Java (11+), and ideally Spring Boot. You will be working with SQL and large SQL databases, Docker, Kubernetes, OpenAPI specifications, and distributed system observability tooling (e.g., Datadog APM). Infrastructure automation is primarily owned by the infrastructure team, but you will be a consumer of their work; familiarity with AWS, Terraform and Docker is beneficial. Testing approaches More ❯
distributed systems, microservices architecture, and RESTful API design. Hands-on experience with Kubernetes and container orchestration. Familiarity with monitoring, alerting, and logging tools (e.g., Prometheus, Grafana, ELK stack, or Datadog). Experience with Elastic will be highly helpful with this position. Hands-on experience with incident response, including designing and improving incident management processes. Expertise in Observability practices, including metrics More ❯
experience in various libraries Experience with AWS Lambda functions and serverless architectures Knowledge of REST APIs, JSON/XML, and web services integration Familiarity with Cribl, Grafana, LogicMonitor, Datadog, New Relic or comparable monitoring & APM solutions is a plus. Exposure to SIEM and Service Management toolsets like ServiceNow would be advantageous. Nice to have UNIX/RHEL/Ubuntu with More ❯
backend and IoT systems, ensuring a seamless, fully automated workflow that reliably delivers high-quality code to production with speed and confidence. Elevating Monitoring: Enhance our monitoring capabilities with Datadog, gaining deeper insights and proactively ensuring system health. Scaling for Impact: Contribute to scaling our systems to efficiently manage and operate thousands of parking lots. Championing Engineering Excellence: Play a More ❯
improvement in automation, monitoring, and deployment processes. What we're looking for Experience with AWS services (ECS, S3, RDS, Lambda, CloudFront, etc.). Skilled in monitoring tools such as Datadog, CloudWatch, and Grafana. Familiarity with Docker, ECS, Kubernetes, or similar containerisation tech. Competence in scripting or coding with Bash, Python, or Node.js. Experience with Infrastructure as Code (Terraform, Pulumi, etc. More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Noir
and maintaining CI/CD pipelines, and be confident scripting in Python, C# or similar scripting languages. You'll also be comfortable working with monitoring and performance tools like Datadog or Prometheus, and ideally, you'll have worked in a fast-moving SaaS or product-led business before. Bonus points if you've helped shape DevOps roadmaps, mentored others, or More ❯
Terraform, CloudFormation, or Ansible. Work collaboratively with development, DevOps, and security teams to ensure data governance, compliance, and operational efficiency. Implement monitoring and alerting solutions using tools like CloudWatch, Datadog, or Prometheus. Conduct root cause analysis (RCA) and develop long-term preventive strategies. Maintain and enforce database standards, documentation, and operational procedures. Required Qualifications: 7+ years of experience in database More ❯
as needed. Experience with relational and non-relational databases. Experience delivering high levels of observability and proficiency in improving early warning systems, for example: has worked with Grafana/Datadog/Prometheus. Collaborating with internal/external teams/engineers and fostering an inclusive environment, where all points of view are welcomed and encouraged. Own and lead multiple domains of More ❯
are JVM-based, with the majority running on Java 21. We're in the process of moving our backend services to Spring Boot. We've invested heavily in our Datadog integration to bring world-class observability and monitoring to our systems. We've recently moved to GitLab and are currently building out our next generation of automated deployment pipelines. We More ❯
o Cloud-native security services (AWS Security Groups, Azure Firewall, etc.). • Monitoring & Troubleshooting: o Cloud-native monitoring tools (e.g., AWS CloudWatch, Azure Monitor). o Third-party tools (Datadog, Splunk, SolarWinds). Professional Certifications • Cloud Certifications: o AWS Certified Advanced Networking - Specialty o AWS Certified Solutions Architect o Microsoft Certified: Azure Network Engineer Associate o Google Professional Cloud Network More ❯
as the operating system for car parking Contribute to the improvement of our CI pipelines for both backend and IoT deployments Improve our monitoring system for our services with Datadog Assist in scaling up our systems for managing thousands of parking lots Shape our engineering culture by employing modern software engineering practices, focusing on writing clean, well-tested, and efficient More ❯
Lisburn, County Antrim, United Kingdom Hybrid / WFH Options
Camlin
e.g., Docker, Kubernetes, Terraform, Ansible, Helm, etc.). Familiarity with continuous integration and deployment tools (e.g., GitLab CI, Argo Workflows, ArgoCD). Experience with monitoring/logging solutions (e.g., Datadog, ELK, Prometheus). Good understanding of concepts related to computer architecture, data structures and programming practices. Solid understanding of networking, databases, and security principles. Our Values We work together We More ❯
engineering (SRE), or a similar role. Proficiency in cloud platforms (AWS, Azure, GCP) and associated reliability tools. Hands-on experience with monitoring and logging tools such as Prometheus, Grafana, Datadog, Splunk, or ELK stack. Proficiency in scripting languages like Python, Bash, or Go for automation. Familiarity with containerization and orchestration tools (Docker, Kubernetes). Strong understanding of distributed systems, fault More ❯
frontend architecture (e.g., Module Federation or Single-SPA). Experience with cloud-native DevOps tooling: Docker, Kubernetes, AWS/GCP deployments. Proficiency in analytics and observability tools like Sentry, Datadog, or LogRocket. Soft Skills Strategic thinker with strong problem-solving and decision-making skills. Ability to work in fast-paced, agile environments with cross-functional teams. Clear communication and documentation More ❯
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Fruition Group
developers and SREs to solve complex problems What we're looking for: Strong experience with AWS (EC2, ECS, Lambda, RDS etc.) Good knowledge of observability tools (Grafana, Prometheus, OpenTelemetry, Datadog, or similar) Background in software engineering (JavaScript/TypeScript & Node.js, although any language is fine) Experience with Infrastructure as Code (Terraform, CloudFormation, or similar) CI/CD pipelines and automation More ❯