tooling which supports engineers in writing, building and publishing software for our backend platform. We are embedded in a wider platform division with other teams looking after continuous delivery, observability and infrastructure. Our process Interviewing is a two-way process and we want you to have the time and opportunity to get to know us, as much as we are More ❯
cloud-native environment. Have worked with SOC providers, managed security services, or security automation platforms. Have built and scaled incident response and threat detection programs. Have experience improving security observability across distributed infrastructures. Competitive compensation package, including equity. Learn and Grow - we provide mentorship and send you to events that help you build your network and skills. Flexible Time Off. More ❯
integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco's leading Networking, Security, Collaboration, and Observability portfolios What You'll Do This is a rare opportunity to work with one eye on the future, while staying closely rooted in delivering exciting new innovations today. You will More ❯
in ambiguous problem spaces, aligning business and technical perspectives. Experience mentoring engineers through pairing, code reviews, and knowledge-sharing. Familiarity with CI/CD pipelines, automated testing strategies, and observability tools (e.g., GitHub Actions, Sentry, Datadog). A mindset geared toward experimentation, measurement, and continuous improvement, especially within growth-driven product teams. Nice to Have Previous experience working in a More ❯
Hemel Hempstead, Hertfordshire, United Kingdom Hybrid / WFH Options
Eckoh
a secure, highly available, PCI-compliant AWS platform that underpins Eckoh's mission-critical services. As a senior member of the team, you will drive improvements in platform reliability, observability, and operational excellence. You will collaborate closely with development teams to enable secure, automated delivery of services while championing DevSecOps principles. This role offers the chance to shape the future … secure PCI-compliant cloud platform on AWS to support enterprise-grade applications and services. Architect and operate production workloads with a focus on high availability, scalability, and resilience. Drive observability and monitoring improvements across infrastructure and services to proactively identify issues. Promote and embed a security-first, DevSecOps culture, ensuring best practices are followed at every stage of the software … Strong knowledge of CI/CD pipelines and automation tooling (Gitlab experience preferable). Experience with "infrastructure as code" (Terraform, CloudFormation), containerisation (Docker), and orchestration (Kubernetes). Proficiency with observability and monitoring solutions (e.g., CloudWatch, Prometheus, Grafana, Splunk). Strong understanding of cloud-native development practices and agile ways of working. Confident conducting peer code reviews and providing constructive technical More ❯
Hemel Hempstead, Hertfordshire, South East, United Kingdom Hybrid / WFH Options
Eckoh PLC
a secure, highly available, PCI-compliant AWS platform that underpins Eckoh's mission-critical services. As a senior member of the team, you will drive improvements in platform reliability, observability, and operational excellence. You will collaborate closely with development teams to enable secure, automated delivery of services while championing DevSecOps principles. This role offers the chance to shape the future … secure PCI-compliant cloud platform on AWS to support enterprise-grade applications and services. Architect and operate production workloads with a focus on high availability, scalability, and resilience. Drive observability and monitoring improvements across infrastructure and services to proactively identify issues. Promote and embed a security-first, DevSecOps culture, ensuring best practices are followed at every stage of the software … Strong knowledge of CI/CD pipelines and automation tooling (Gitlab experience preferable). Experience with 'infrastructure as code' (Terraform, CloudFormation), containerisation (Docker), and orchestration (Kubernetes). Proficiency with observability and monitoring solutions (e.g., CloudWatch, Prometheus, Grafana, Splunk). Strong understanding of cloud-native development practices and agile ways of working. Confident conducting peer code reviews and providing constructive technical More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
collaborate across teams to: Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes Design and operate CI/CD pipelines using CloudBees and GitLab Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace Automate cloud deployments (AWS-first) using Terraform and platform tooling Improve security posture across IAM, secrets, and networking Help the team ship faster and … TypeScript, Python). Validated experience operating distributed systems at scale in production. Cloud AWS (primary), Kubernetes (future), Docker (current), Terraform. Excellent debugging skills across network, systems, and data stack. Observability tooling, e.g. custom metrics pipelines, OpenTelemetry tracing, or integrations across telemetry stacks. Security engineering and practical understanding of IAM hardening, zero-trust network principles, and secrets management in data-heavy More ❯
language models, or high-throughput data processing; Experience working collaboratively in cross-functional teams with diverse technical backgrounds; (Desirable) Experience with GitHub Actions, CI/CD pipelines, monitoring, and observability; Strong problem-solving skills with the ability to debug and optimize systems across different domains; (Desirable) Interest in market research, behavioral science, or business applications of AI; Excellent communication skills More ❯
applications, or files into lakehouses like Snowflake, Databricks, and Redshift. With pipelines that just work and features like advanced data transformation using dbt Core and end-to-end pipeline observability, we're focused on making robust data pipelines accessible to everyone. We are looking to add senior engineers to our core engineering team to build the infrastructure that modern data teams More ❯
endpoints Integrate AWS foundation models and optimise their performance across use cases Create abstraction layers so non-technical users can deploy AI agents easily Implement strong logging, monitoring, and observability Work closely with frontend developers to ensure seamless integration Set up and manage CI/CD pipelines using GitLab Contribute to containerisation and deployments on OpenShift What You'll Bring: 3+ More ❯
interfaces using React, Next.js, and Vercel AI SDK Containerising services using Docker and deploying to AWS (ECS, Lambda) Collaborating with researchers to productionise transformer models (e.g. PyTorch, HF) Using observability tools like Langfuse to monitor prompt and model performance Writing clean, modular, testable code that scales in production environments What They're Looking For: 3+ years' experience as a full More ❯
innovation cycles. You will have the opportunity to take ambiguity and refine it into valuable outcomes, taking risks where justified by the reward. You will understand how CI/CD, observability, and SLOs form part of a mature product offering and push for best practices. Use your insight to prevent production issues before they happen. When issues do occur you will More ❯
data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern More ❯
queues and real-time communication protocols. Databases: Working with relational databases to manage structured data efficiently. Infrastructure & Deployment: Running on Linux-based environments, ensuring high availability and scalability. Monitoring & Observability: Using industry-standard tools for system health and performance tracking. What We're Looking For Essential Experience & Skills: Strong background in software development, with experience building and optimising complex systems. More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
reporting data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern cloud More ❯
and Engineering background Proficient in writing infrastructure as code for public cloud Experience with Python coding/testing or any Cloud-based technology (AWS preferred) Good understanding of Data Observability Good understanding of Hosting Platform Linux/Unix (EKS and Container experience is a plus) Good understanding of Databases, Data Lakes, and Query Engines, SQL/DDLs is preferred We More ❯
leaks, and performance bottlenecks Turn research prototypes into robust, production-ready software modules Lead architecture discussions and enforce clean, scalable design patterns Drive engineering standards across CI/CD, observability, and system modularisation Mentor developers through code reviews, pair programming, and design walkthroughs Bridge the gap between research and deployable robotics software - across embedded and cloud platforms What we're More ❯
manage high-volume data pipelines for energy consumption and system telemetry Lead the development of integration layers and messaging interfaces with third-party services Establish engineering best practices for observability, CI/CD, testing, and scalability Partner closely with product and backend teams to support rapid development cycles Proven track record as a senior software engineer or tech lead, ideally More ❯
storage systems Nice to haves: Experience with ecommerce and marketplace systems in a B2C environment Proficiency with Infrastructure as Code tools like Terraform Experience with Datadog for monitoring and observability Track record of implementing company-wide technical initiatives More ❯
Bring * Proven experience designing and running Kubernetes-based systems, ideally in constrained or disconnected environments * Hands-on expertise with infrastructure-as-code tooling (Terraform, Helm), CI/CD, and observability stacks * Deep understanding of containerisation, service networking, and resource tuning for edge devices or VMs * Practical experience deploying and running machine learning workloads, including LLMs or transcription models * Comfort adapting More ❯
report - Protecting & growing your payments business - Are you passionate about building reliable, scalable, and high-performing systems? Do you thrive on solving complex infrastructure challenges while driving automation and observability best practices? If so, we want to hear from you! At Thredd, we're looking for a Site Reliability Engineer to act as a North Star for this evolving discipline. More ❯
of DevOps you will oversee the Infrastructure and Build Engineering Team, taking ownership of defining and implementing the company's DevOps strategy, including streamlining CI/CD pipelines, automation, observability and release processes to support reliable and scalable software delivery. There's currently a focus on transforming inefficient CI/CD pipelines using GitLab as well as making improvements to More ❯