edge devices. Deploying machine learning models to production. Optimizing the platform runtime for maximum performance. This is largely C++ code with parts of the pipeline running on GPU. Building observability and telemetry. This is a 5 day a week in the office role. Qualifications 3+ years of experience writing production software in C++ and Python of experience building applications processing More ❯
london (city of london), south east england, united kingdom
Venator Recruitment
edge devices. Deploying machine learning models to production. Optimizing the platform runtime for maximum performance. This is largely C++ code with parts of the pipeline running on GPU. Building observability and telemetry. This is a 5 day a week in the office role. Qualifications 3+ years of experience writing production software in C++ and Python of experience building applications processing More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Harnham - Data & Analytics Recruitment
solutions Essential Skills & Experience: Proven experience in Python/Django Solid frontend skills with React & TypeScript Exposure to Ruby on Rails (nice to have) Experience with distributed systems and observability tools Ability to design solutions across multiple services Strong analytical problem-solving skills and clear communication Nice-to-Have Skills: Next.js Internal tooling for operational teams Experience in highly collaborative More ❯
commercial strategies Ideal profile: Experienced fractional or portfolio CTO with a strong SaaS and engineering background Deep knowledge of AWS and modern tooling including CI/CD, containerisation, and observability Demonstrated capability in leveraging data and AI for commercial impact Strategic and hands-on, comfortable operating in lean environments with evolving teams Strong leadership presence, able to influence and challenge More ❯
Ruddington, Nottingham, Nottinghamshire, England, United Kingdom
Big Red Recruitment
commercial strategies Ideal profile: Experienced fractional or portfolio CTO with a strong SaaS and engineering background Deep knowledge of AWS and modern tooling including CI/CD, containerisation, and observability Demonstrated capability in leveraging data and AI for commercial impact Strategic and hands-on, comfortable operating in lean environments with evolving teams Strong leadership presence, able to influence and challenge More ❯
client satisfaction. Collaborating with Client Solutions and other teams to understand requirements and deliver tailored solutions. Designing and implementing scalable, future-proof architectures for new connectors and integrations. Enhancing observability with better diagnostics, logging, and tracing to support technical teams. Overseeing the development and management of the public API (REST event streaming functionality). Producing clear, accessible technical documentation, using More ❯
and Azure SQL, Key Vault, and Storage Accounts. User Interfaces: Build user-facing interfaces such as portals or dashboards to present data and manage automation workflows. CI/CD & Observability: Contribute to CI/CD workflows, including code integration, testing, deployment and monitoring/logging. Code Quality: Participate in peer reviews and maintain Git version control practices using VCS. Lifecycle More ❯
Nottingham, Nottinghamshire, East Midlands, United Kingdom
Microlise
skills matrixes and progression ladders Promote a culture of delivery excellence, engineering discipline, and personal accountability Encourage cross-team alignment on shared technical goals, such as implementation of DSC, observability standards, and automation Support long-term platform value creation by ensuring projects are not only delivered but deliver measurable benefit to the organisation What we are looking for: Demonstrable technical More ❯
Bath, Somerset, United Kingdom Hybrid / WFH Options
Cognibox
unstructured data. - Build a modern Data Lakehouse/Warehouse as a single source of truth. - Implement reporting and dashboarding solutions that empower internal teams and customers. - Ensure platform reliability, observability, and security. - Promote reusable, curated datasets and self-service analytics across teams. - Collaborate with BI, Engineering, and business stakeholders to enhance analytics delivery. - Leverage cloud platforms and modern pipeline tooling More ❯
Cardiff, South Glamorgan, United Kingdom Hybrid / WFH Options
Cognibox
unstructured data. - Build a modern Data Lakehouse/Warehouse as a single source of truth. - Implement reporting and dashboarding solutions that empower internal teams and customers. - Ensure platform reliability, observability, and security. - Promote reusable, curated datasets and self-service analytics across teams. - Collaborate with BI, Engineering, and business stakeholders to enhance analytics delivery. - Leverage cloud platforms and modern pipeline tooling More ❯
in workload isolation and service configuration. Create and maintain support documentation, diagrams, and knowledge bases for platform components. Supporting and maintaining existing solutions and identifying/remediating technical debt. Observability & Troubleshooting Ensure observability is embedded across all layers of the infrastructure stack, enabling proactive alerting, monitoring, and root cause analysis. Implement tooling to enhance visibility, debugging, and response times for … deploying workloads using Helm charts and managing container infrastructure. Knowledge of GitOps tools (e.g. ArgoCD). Knowledge of Service Mesh technologies (e.g. Anthos). Exposure to monitoring, logging, and observability tooling (e.g. Prometheus, Grafana, GCP Operations Suite). Behavioural Competencies Cross-Team Collaboration: Works effectively with engineering, security, support, and governance to improve platform maturity. Problem Solving: Identifies platform bottlenecks More ❯
They will be a strong communicator, and may have previously worked in an SRE role, a software engineering role or a systems engineering role. Key Responsibilities: Participate in building observability, monitoring and alerting for key services - continuously improving our SLI & SLOs and observability data enabling faster issue detection and incident resolution Collaborate with senior engineers and product teams to ensure More ❯
Bristol, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
services. You will be working with multiple feature development teams and the BAU/Support team to define and evolve our cloud & on-prem infrastructure & delivery pipelines, improving system observability, demonstrating performance and capacity improvements and proactively identifying and mitigating reliability risks. Key Responsibilities of the Site Reliability Engineer: Collaborate with Software Engineers to improve reliability and performance in their … subsystems Partner with System Administrators in automating toil and eliminating alerts Evolve observability and monitoring capabilities to identify and solve problems before they impact the business Support development environments to help us achieve our delivery and quality goals Research and evaluate technologies, tools and services to influence buy-vs-build decisions Develop expertise in diverse technical and business domains Expand … in one of our platform languages (Java, Go, Python or similar) Knowledge of cross domain principles & technologies Experience of working in a service management environment Practical applications of using observability patterns in previous systems Creating and monitoring system availability metrics and using those to drive work that reduces downtime There are many great reasons to join our team! Pension Plan More ❯
Support for existing & new UNIX infrastructure Optimisation of existing services & infrastructure Support and deploy Ansible Automation Platform Support Oracle Database 19c on Oracle Linux on KVM Support logging and observability stacks, including time series databases Promote SLO/SLI measurement & tracking Support pipeline enhancements and deployments Engage and provide guidance to external teams on service consumption Skills & Experience Required Significant … of IaC principles and continuous development Knowledge and experience working with Git revision control Pipeline configuration and deployment, Ansible Automation Platform/Git runners etc Experience with monitoring and observability Deployment and configuration of Prometheus/Grafana/CloudWatch Time series database experience with Influx Full stack administration - TICK/Elasticsearch/OpenSearch/Fluentd Experience of AWS More ❯
Job Title Observability Engineer Location Asda House Employment Type Full time Contract Type Permanent Hours Per Week 37.5 Salary Competitive salary plus benefits Category Software Engineering Closing Date 29 August 2025 We are looking for an Observability Engineer who will report into the Engineering Manager and contribute to the delivery of our mission through a combination of design, build & implementation … configuration and support, and over time evolve to include the wider goals of the team. What You'll Love Design, build, and evolve core features of New Relic's observability platform (APM, logs, traces, infrastructure monitoring) for high throughput and scalability Configure New Relic dashboards, alerts, synthetic monitoring, distributed tracing, and log management Collaborate with cross-functional teams (product, SRE … UX) to translate requirements into resilient, cost-effective observability solutions Implement observability-as-code: define dashboards, alerts, synthetic monitors, notification channels & tags using New Relic Integrate instrumentation standards like OpenTelemetry across distributed systems Actively mentor Associate engineers, and lead incident response and analysis Work with stakeholders to understand problems, analyse requirements, develop ideas and design & deliver solutions that enhance engineering More ❯
Cambridge, Cambridgeshire, East Anglia, United Kingdom Hybrid / WFH Options
La Fosse
infrastructure platform with AI-operable capabilities Oversee key infrastructure components such as data centre expansion, programmable compute, and software-defined network/storage Enable automation-first delivery models with observability, self-healing, and policy-driven control Implement and mature GitOps workflows, IaC pipelines, and CI/CD processes across engineering teams Lead programme governance, risk management, and stakeholder engagement Partner More ❯
Tunbridge Wells, Kent, Royal Tunbridge Wells, United Kingdom Hybrid / WFH Options
FPSG
and industry security standards (e.g. OWASP CI/CD, SAMM) are adhered to across systems Managing and improving cloud security posture (Azure Defender, Prisma Cloud etc) Implementing and optimising observability platforms for holistic system monitoring Supporting and securing software delivery lifecycle, from development to deployment and ongoing operations The successful Security Engineer's essential skills will include: Demonstrated experience in More ❯
Glasgow, Lanarkshire, Scotland, United Kingdom Hybrid / WFH Options
Circle Group
solid understanding of quality engineering principles. Ability to work autonomously and collaboratively in a fast-paced, cross-functional environment. A holistic view of quality, considering everything from testability and observability to scalability and resilience. Ideal Background Degree in Computer Science, Engineering, or a related field. Proven experience in quality engineering roles with a focus on continuous improvement and cross-team More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Circle Recruitment
solid understanding of quality engineering principles. Ability to work autonomously and collaboratively in a fast-paced, cross-functional environment. A holistic view of quality, considering everything from testability and observability to scalability and resilience. Ideal Background Degree in Computer Science, Engineering, or a related field. Proven experience in quality engineering roles with a focus on continuous improvement and cross-team More ❯
Cardiff, South Glamorgan, Wales, United Kingdom Hybrid / WFH Options
Circle Recruitment
solid understanding of quality engineering principles. Ability to work autonomously and collaboratively in a fast-paced, cross-functional environment. A holistic view of quality, considering everything from testability and observability to scalability and resilience. Ideal Background Degree in Computer Science, Engineering, or a related field. Proven experience in quality engineering roles with a focus on continuous improvement and cross-team More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Circle Recruitment
solid understanding of quality engineering principles. Ability to work autonomously and collaboratively in a fast-paced, cross-functional environment. A holistic view of quality, considering everything from testability and observability to scalability and resilience. Ideal Background Degree in Computer Science, Engineering, or a related field. Proven experience in quality engineering roles with a focus on continuous improvement and cross-team More ❯
to address infrastructure and operational challenges at scale. As part of the Database SRE team, you will be data-driven and work to eliminate TOIL through simplification, automation, and observability, thereby enhancing the reliability of our platforms. With a focus on database scalability, availability, security, and performance, you will work closely with the Engineering team, product managers, and other teams. More ❯
Reading, Berkshire, South East, United Kingdom Hybrid / WFH Options
Ignite Digital Search Ltd
AWS Lifecycle Setting up and optimizing EC2 instances, Lambda functions, RDS databases Implementing comprehensive cost management and optimization strategies Managing VPCs, IAM policies, and security configurations across environments Monitoring & Observability Building sophisticated monitoring and alerting systems Implementing observability solutions for proactive issue detection Creating dashboards and metrics that drive operational excellence Compliance & Security Ensuring HIPAA, GDPR and healthcare regulatory compliance … in regulated industries (healthcare, financial services, life sciences) AWS cost management and FinOps experience Monitoring tools expertise (CloudWatch, Datadog, New Relic, Prometheus) Security and compliance framework knowledge Experience with observability and APM solutions Why This Opportunity Stands Out: Real Impact - Your work directly improves healthcare outcomes Growth Trajectory - Join a scaling company where your contributions shape the future Innovation Focus More ❯
a hybrid multi-cloud (AWS, Azure, GCP) and on-premises ecosystem. This is your opportunity to drive modernisation, steer architecture, and be a hands-on force across infrastructure, automation, observability, and security. What You'll Do Architect and build modern hosting environments from the ground up, with a focus on observability, security, and infrastructure-as-code. Lead infrastructure design & technical More ❯