or DevOps. Expertise in microservices and API design. Docker and container runtime platforms such as Kubernetes, EKS, ECS, etc. Strong understanding of operational concepts on AWS, particularly monitoring, observability and FinOps. Utilising CI/CD tools such as Bamboo, Jenkins, TeamCity and Bitbucket to streamline delivery of new features and fixes. Continual testing of code using automated testing frameworks
CI/CD pipelines, infrastructure as code (IaC), and automated testing. Experience with industry-standard monitoring tools (ITRS or similar) Proficiency in managing Kubernetes clusters, including deployment, scaling, storage, observability, and lifecycle management Understanding of financial regulations and reporting requirements in Europe such as MiFID II Person Profile The role will suit someone who relishes the prospect of supporting an More ❯
Tuesdays, Thursdays WFH) Pay: negotiable, inside IR35 We're looking for an experienced DevOps Engineer to join our team on a contract basis, with a focus on AWS infrastructure, observability tooling, and CI/CD automation. This is a hands-on role supporting high-availability systems, rapid deployments, and production incident response. Key Responsibilities - Manage and monitor AWS infrastructure for … performance and security - Respond to production incidents, perform root cause analysis, and implement fixes - Maintain observability tools (Prometheus, Grafana, Splunk) and write PromQL queries - Improve and operate CI/CD pipelines using GitHub Actions and Kubernetes - Automate infrastructure tasks with Python, Bash, Go or SQL - Work with Git-based workflows for infrastructure as code - Troubleshoot Kubernetes workloads and containerised services More ❯
reliability of cloud and hybrid infrastructure powering some of the most critical client-facing applications in financial services. You will be the strategic and operational leader for platform reliability, observability, incident response, CI/CD modernisation, and developer productivity. You will drive automation, lead with metrics, and build systems and teams that proactively address issues before they impact clients. Key … and services. Implement a comprehensive incident management lifecycle (on-call, escalation, RCA, blameless postmortems). Reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) through automated observability, alerting, and playbooks. CI/CD and Platform Engineering Oversee the development and evolution of CI/CD pipelines for all GIDS products using GitHub Actions, ArgoCD, TeamCity, Octopus Deploy … and GitOps principles. Integrate static and dynamic code analysis, vulnerability scanning, artifact promotion, and release gating into the SDLC. Ensure pipeline scalability and governance while maintaining developer velocity. Observability & Troubleshooting Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog). Establish SLOs, SLIs, and error budgets with product and engineering teams. Drive root cause More ❯
incidents using data-driven decision making to minimise downtime and financial impact while leading root cause analysis and conducting blameless post-mortems. Enhance application health monitoring by implementing robust observability solutions and automating manual processes to improve system resilience. Drive cost optimisation initiatives and manage capacity resources to ensure efficient and scalable operations across all FX trading platforms. Collaborate with … Deep technical expertise in Linux/Unix systems administration combined with strong SQL skills and proficiency in scripting languages such as Python or Java. Demonstrated experience with monitoring and observability tools including Prometheus, Grafana, Splunk, Geneos, OpenTelemetry or Corvil is highly desirable. Familiarity with cloud platforms as well as containerisation technologies like Kubernetes or Docker alongside CI/CD pipeline
/IBM MQ). DevOps Principles: Understanding of DevOps principles and infrastructure as code tools (e.g., Terraform). Performance Tuning: Background in performance tuning, profiling, and monitoring Java applications. Observability and Monitoring: Solid experience with observability and monitoring tools (e.g., Splunk/Dynatrace). Leadership and Mentoring: Experience mentoring junior developers or leading small engineering teams. About working for us
client developers on modern tooling and DevOps/cloud-native practices, ensuring sustainable ownership after Bain's engagement. Advance cloud-native & DevOps adoption. Champion containerization, infrastructure-as-code, automated observability and secure-by-design principles to improve scalability, reliability and security. Contribute to communities of practice. Share lessons learned and emerging technology trends through internal forums, brown-bag sessions and … Django, .NET Core or Java Spring Boot, including the design of RESTful and GraphQL/gRPC APIs. 3-4 years architecting and operating microservice ecosystems, emphasizing service discovery, observability, CI/CD automation and blue/green or canary deployments. Cloud-native delivery on AWS, Azure or GCP - adept with managed services, serverless patterns and infrastructure-as-code (Terraform
experience leading enterprise backup and disaster recovery initiatives. Working knowledge of cloud-native storage solutions such as Longhorn. Strong Linux administration skills, particularly with RHEL environments. Experience implementing comprehensive observability solutions using Prometheus, Grafana, Loki, and related tools. Ability to establish and enforce security policies through tools like Open Policy Agent. Knowledge of identity management solutions such as Keycloak. Experience More ❯
e.g., Slackbots and integrations) to streamline IT operations and business processes. Monitoring and Maintenance: Manage and maintain network security systems through system patches and periodic maintenance tasks. Establish comprehensive observability and proactive issue-resolution strategies using tools like SNMP, Syslog, NetFlow, Elasticsearch (ELK Stack), and Grafana. Collaboration and Communication: Work with CyberEnergia teams to identify functional needs, develop secure architectures, and
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Curo Resourcing Ltd
domain adjacent technologies/services, such as: Docker, OpenShift, Kubernetes etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Observability - SRE. Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem. Excellent knowledge of YAML or similar languages. The following Technical Skills & Experience would be desirable
London, England, United Kingdom Hybrid / WFH Options
BBC
as-Code with AWS CDK, CloudFormation to provision and manage cloud environments. Build and maintain CI/CD pipelines using GitHub Actions, AWS CodePipeline, CodeBuild, Jenkins. Integrate monitoring and observability tools such as AWS CloudWatch, Prometheus, Grafana for infrastructure and model health tracking. Ensure software quality through Test-Driven Development (TDD), unit testing frameworks (e.g., pytest, unittest), and automated integration More ❯
Bristol, Avon, South West, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
Experience with unit, integration, and end to end testing tools and practices (e.g. Jest, Cypress, Backstop, Playwright). Experience with CI/CD and Trunk Based Development. Experience with observability tools and practices, including monitoring, logging, and tracing to ensure system reliability and performance. Understanding of Microservices & principles of RESTful API development, including structuring, documenting, versioning, testing and stubbing/ More ❯
mentoring engineers and collaborating with stakeholders. Proven ability to resolve technical incidents in unfamiliar production systems. Technical and process documentation champion. Experience of operationally managing production software components, including observability, logging, metrics, error reporting, debugging, and live incident management. Your time will be spent roughly as follows: 60% - Proactive technical work (e.g. migrating DB hosting provider, new message bus system More ❯
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
Leidos
and managing backup, recovery, and disaster recovery strategies to ensure data protection and business continuity Ability to implement robust monitoring and logging solutions e.g., CloudWatch, to ensure system reliability, observability, and proactive incident response Comfortable working in Agile development teams, translating business requirements into technical solutions, and actively participating in sprint planning, retrospectives, and daily stand-ups Capability to design More ❯
Join us as a Cloud Observability Engineer at Barclays, where you will lead our enterprise observability strategy across multi-cloud environments. This senior role combines technical leadership with team management, driving operational excellence while architecting resilient solutions and mentoring high-performing teams. To be successful as a Cloud Observability Engineer, you should have experience with The ability to lead and … scale technical teams in multi-faceted governance environments AWS/Azure cloud platforms and enterprise observability tools (Elastic, Grafana, Splunk, DataDog, or similar) SRE/DevOps methodologies with Python proficiency for automation and infrastructure-as-code practices Some other highly valued skills may include AWS or Azure cloud certifications Experience implementing AI-driven observability and AIOps solutions Background in large More ❯
us on our journey to revolutionize observability. In 2023, Dun & Bradstreet ranked Coralogix as one of the best tech startups to work for. Coralogix is a modern, full-stack observability platform transforming how businesses process and understand their data. Our unique architecture powers in-stream analytics without reliance on expensive indexing or hot storage. We specialize in comprehensive monitoring of … logs, metrics, traces and security events with features such as APM, RUM, SIEM, Kubernetes monitoring and more, all enhancing operational efficiency and reducing observability spend by up to 70%. Technical Account Managers at Coralogix are key to our effort to meet our customers' expectations and help them utilize their observability and security data in the most efficient way … looking for hard-working, sharp, and humble professionals with proven technical customer-facing experience. Our Technical Account Managers are trusted advisors who guide our customers on their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are
operations. Manage and enhance our container orchestration stack using Kubernetes (EKS) and Docker. Develop and maintain robust, scalable CI/CD pipelines with Jenkins, GitHub Actions, and ArgoCD. Strengthen observability across the platform through effective monitoring, logging, and alerting (AWS services, Grafana, etc.). Contribute to platform security through infrastructure hardening, role-based access controls, and infrastructure as code (Terraform … CI/CD pipelines using Jenkins, GitHub Actions, and/or ArgoCD. Familiarity with infrastructure as code practices using Terraform, CloudFormation, or similar tools. A solid grasp of system observability, monitoring, and alerting practices (CloudWatch, Grafana, or equivalent). Exposure to platform security principles including identity/access management, secrets handling, and environment isolation. Strong scripting and automation skills (e.g. … Database: MySQL (Aurora DB), Redis (ElastiCache), MongoDB (AWS DocumentDB). Cloud & DevOps: AWS (20+ services), Kubernetes (EKS), Docker, Infrastructure as Code (CloudFormation, Terraform), CI/CD (Jenkins, GitHub Actions), Observability (AWS, Grafana). Development tools: GitHub, Jira, Notion, ChatGPT, Gemini, LangChain, AI-native IDEs (Cursor, JetBrains), LLM-powered internal tools. Test automation: Cypress (E2E), Postman (API), Jest (frontend unit
Architect for Scale & Resilience: Make critical decisions on system design and performance to support a growing platform with increasing complexity and scale. Elevate Operational Maturity: Lead improvements to monitoring, observability, and developer workflows - ensuring backend systems are resilient and teams can ship confidently. Embed Security by Design: Take responsibility for backend security posture, ensuring systems meet best practices and compliance … and SQS. Infrastructure as Code: Experience with Terraform or similar tools for infrastructure automation. High-Throughput Systems: Strong experience in real production projects handling large-scale data flows. Monitoring & Observability: Proficiency in tools like Datadog, Prometheus, and Grafana. Security & Networking: Solid understanding of networking principles, security best practices, and cloud security. Agile & Fast-Paced Environments: Experience in agile teams, working More ❯
For: 3+ years hands-on experience with Solace PubSub+ in a production environment Strong knowledge of WAN-based distributed systems and networking fundamentals Experience with Prometheus and Grafana for observability and alerting Confident in Linux/Unix systems and scripting (Bash, Python, etc.) Excellent problem-solving instincts and attention to detail Strong communicator who works well across technical teams Bonus More ❯
Burton-On-Trent, Staffordshire, West Midlands, United Kingdom
Amtis Professional Ltd
CloudFormation or ARM templates Scripting & Automation - Proficient in PowerShell, Bash, or Python Infrastructure as Code (IaC) - Hands-on experience with Terraform, Bicep, or ARM Certified: Terraform Associate preferred Monitoring & Observability - Familiarity with tools like Azure Monitor, AWS CloudWatch, Prometheus, Grafana Security & Compliance - Strong understanding of IAM, cloud security, compliance frameworks For immediate consideration apply now More ❯