Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
the edge. Proficiency in Python, Docker, Linux systems, and scripting (Bash, Python). Strong expertise with infrastructure automation tools (Terraform, Ansible). Experience managing observability and monitoring systems, particularly Prometheus. Deep understanding of networking concepts and protocols. Responsibilities: Design, build, and maintain scalable and resilient infrastructure on the edge. Develop … as-code solutions using Terraform, Ansible, and scripting languages (Python, Bash). Deploy and manage containerized applications using Docker and related technologies. Ensure system observability by building and optimizing monitoring systems, particularly using Prometheus. Troubleshoot and optimize Linux-based systems (e.g., Red Hat, CentOS, Ubuntu). xAI's Grok is … technologies such as Prometheus, Grafana, and PagerDuty. Expert knowledge of deployment technologies such as Pulumi or Terraform. Expert knowledge of Kubernetes. Responsibilities: Improving our observability by adding/adjusting metrics. Building easily parsable dashboards. Designing and overseeing our on-call rotations. Improving our deployment process to increase reliability. Luminance is More ❯
Docker, North West England, United Kingdom Hybrid / WFH Options
Oho Group Ltd
in the UK You will ideally have: Experience in Building AWS Native Experience with Kubernetes, Infrastructure Experience with Docker Experience with Containerization Familiar with Observability stacks, e.g. ELK, LGTM. Proficient with IaC tools (Terraform), understanding general use-cases Proficient with at least one scripting language: Python, Ruby, JavaScript, etc. Desirable More ❯
performance, and reliability across production and non-production environments. You will work across real-time incidents and projects, including capacity planning, WAN, and system observability using tools like Prometheus and Grafana. Requirements: Strong experience administering Solace PubSub+ messaging across environments (on-prem and Cloud) Strong knowledge of production support Configure More ❯
s degree or higher in computer science/quantitative field Strong knowledge of CI/CD systems such as Jenkins, TeamCity Solid experience with observability tools such as Prometheus, ELK Stack etc Experience working with AWS or similar cloud platforms Hands on experience with Kubernetes Proficiency in Python or other More ❯
Help manage OS/kernel compatibility for performance-critical apps. Guide developers on infrastructure choices and ensure consistency across environments. Contribute to performance tuning, observability improvements, and tooling evolution. What helps: Solid grounding in infrastructure as code especially with Terraform and Ansible. Experience with AWS and Kubernetes, or the curiosity More ❯
and build scalable backend systems and platform services in Golang Develop and maintain cloud-native infrastructure across AWS/GCP Automate infrastructure, deployment, and observability pipelines Collaborate with security, product, and dev teams to ensure rock-solid reliability Continuously improve developer experience and system performance We’re Looking For: Solid More ❯
Sheffield, South Yorkshire, Yorkshire and the Humber, United Kingdom Hybrid / WFH Options
RED Global
integrating Control-M into cloud environments (AWS, Azure, or GCP). Strong understanding of workload automation, job scheduling, and batch processing concepts. Familiarity with observability tools like AppDynamics, Splunk, or Grafana is a plus. Excellent problem-solving skills and the ability to work under pressure. Strong communication skills, both written More ❯
Romsey, Hampshire, South East, United Kingdom Hybrid / WFH Options
Robert Half
practices What You'll Be Doing Infrastructure as Code - Manage, optimize, and automate cloud environments Security & Reliability - Implement best practices for performance, security, and observability Collaboration - Work closely with stakeholders to streamline deployments and CI/CD pipelines Troubleshooting & Monitoring - Ensure high availability and efficiency of systems Must-Have Skills More ❯
contribute to scalable solutions in the cloud. Collaborate with cross-functional teams to deliver secure, reliable, and maintainable code. Optimise performance and contribute to observability, testing, and resilience of services. Take ownership of cloud infrastructure components using GCP or AWS services. Essential Skills and Experience: Strong commercial experience with Java More ❯
taking independent decisions as well as having the ability to work cooperatively within a team, Experience working with microservice architectures and building monitoring/observability metrics, Understanding of cloud native landscapes (AWS or Azure or GCP), Knowledge of containerized environments would be beneficial (Docker or Kubernetes). Benefits we offer More ❯
techniques to automate processes and solve challenging business problems Maintain quality, security, reliability, and compliance of all solutions through digital best practices Build robust observability into solutions and monitor production health Advocate for client needs and deliver solutions that exceed expectations Establish and share best practices and methodologies across the More ❯
London, South East England, United Kingdom Hybrid / WFH Options
RedCat Digital
based application architecture and stack, preferably including AWS Good understanding of Docker and experience with CI/CD tooling Good understanding of security and observability best practices and tooling What else? Experience building and maintaining high-traffic server-side web applications Experience with infrastructure-as-code tools such as Terraform More ❯
London, South East England, United Kingdom Hybrid / WFH Options
eTeam
for fast responses. Design fault-tolerant and resilient distributed systems using Kubernetes and cloud-native technologies. Utilize Prometheus, Grafana, and Kibana for monitoring and observability of backend systems. Optimize API performance and response times for a seamless user experience. Data Analytics & User Insights Integrate real-time data processing and analytics More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Prism Digital
Actions & OIDC – build and maintain automated CI/CD pipelines with secure authentication. Datadog, Prometheus or similar – implement logging, metrics, and alerting for robust observability – the interim CTO is keen to hear your recommendation(s) on tooling and implementation strategy. Disaster recovery and security tooling – ensure platform resilience and safe More ❯
Leeds, Yorkshire, United Kingdom Hybrid / WFH Options
William Hill PLC
functional requirements into working software alongside your team Collaborate with the team to analyze, debug, and resolve defects Demonstrate a commitment to monitoring and observability Manage technical debt effectively by avoiding its creation and removing it when possible Communicate clearly, translating technical and non-technical requirements as needed Understand timelines More ❯
influencing technical decisions across the different stakeholder levels of the business including non-technical audiences. Ability to foster a culture around data-driven reliability, observability, monitoring, and automation. Due to the global nature of the team, a degree of flexible working will be required to accommodate different time zones. We More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Gotobeat
and collaborate with Product, Data, and Artist Relations to translate business goals into resilient software. 5% Champion DevEx, proposing improvements to CI/CD, observability, and performance. You'll be successful here if you have 7+ years professional experience (at least 3 in a senior/lead capacity) delivering production More ❯
dynamics, competition, and peer group activities Understanding and ability to articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps), and operations (e.g., observability, automated response, SRE, etc.), and articulate a path toward a target operating model (people, process, and tools) REQUIRED SKILLS Strong leadership skills are essential for More ❯
dynamics, competition, and peer group activities. Understanding and ability to articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps), and operations (e.g., observability, automated response, SRE etc.), and articulate a path toward a target operating model (people, process, and tools). Required Skills Leadership: Strong leadership skills are More ❯
Dundee, Angus, United Kingdom Hybrid / WFH Options
Ivanti
a pivotal role in shaping the company's growth trajectory through continuous innovation and customer-centric solutions. What You Will Be Doing Assist in Observability Implementation: Support the development and maintenance of monitoring, logging, and tracing solutions. Monitor & Manage Observability Tools: Help deploy and manage observability platforms such as Azure … Ensure Cloud & Infrastructure Visibility: Contribute to scalable monitoring solutions for AWS and Azure environments. Collaborate with DevOps & SRE Teams: Work with teams to integrate observability best practices into CI/CD pipelines. Documentation & Knowledge Sharing: Contribute to runbooks, dashboards, and best practice guides to support observability initiatives. To Be Successful … in The Role, You Will Have Required Qualifications: 3-5 years of experience in observability, monitoring, or DevOps-related roles. Basic experience with monitoring tools such as Azure AppInsights, New Relic, Prometheus, and Grafana. Understanding of OpenTelemetry, New Relic, AppInsights APM for telemetry data collection. Familiarity with AWS and Azure More ❯
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
NICE
such as Jenkins, GitLab CI/CD, or CircleCI. Strong knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture. Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch). Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems. Experience of Incident … advantage if you also have: Hands-on experience of working with large Kubernetes clusters. Certification will be an added plus. Working experience of Grafana Observability Suite (Loki, Mimir, Tempo). Administration and/or development experience of standard monitoring and automation tools such as Splunk, Datadog, PagerDuty, Rundeck. Familiarity with More ❯
of title, we are committed to achieving ambitious goals and we have fun celebrating our wins. We are looking for a self-motivated Senior Observability Engineer to join our dedicated Observability Infrastructure team. Anaplan is a high-growth company that is leading the way in enterprise planning. We look for … people who believe in simplicity, agility and performance and can choose and use the best tools for the job. In the role of Senior Observability Engineer, you will be designing and improving our approach to collecting and analyzing Observability telemetry (Logs, Metrics and Traces) and visualizing it in Grafana Cloud. … You will implement best observability practices to enable engineers across the business to track service performance and interaction in a scalable, performant, and cost-effective manner. What you'll be doing: In this role, working a minimum of 2 days a week in our York Office, you will be: Work More ❯
Manchester, North West England, United Kingdom Hybrid / WFH Options
ECOM
they work on, from ideation through to development, testing and deployment, so you should expect to champion and mentor on best practice like TDD, Observability and IaC. Skills: C#, .NET Core, APIs AWS, Docker, Kubernetes, Terraform CI/CD, TDD, SOLID The money is good too - up to £90k plus More ❯
Manchester, North West, United Kingdom Hybrid / WFH Options
InterQuest Group (UK) Limited
they work on, from ideation through to development, testing and deployment, so you should expect to champion and mentor on best practice like TDD, Observability and IaC. Skills: C#, .NET Core, APIs AWS, Docker, Kubernetes, Terraform CI/CD, TDD, SOLID The money is good too - up to £90k plus More ❯