1 to 25 of 37 Observability Jobs in the East of England

Infrastructure & Automation Manager

Hiring Organisation
PayPoint plc
Location
Welwyn Garden City, England, United Kingdom
NetApp storage solutions (management, provisioning, capacity planning) Microsoft Azure Amazon Web Services (AWS) Hybrid architecture and cloud migration experience Additional Relevant Skills: Monitoring and observability tooling Backup, DR, and business continuity technologies Our benefits if you decide to join us: Holiday purchase scheme, with 25 days holiday plus bank holidays ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Stevenage, Hertfordshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Bedford, Bedfordshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Colchester, Essex, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Luton, Bedfordshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Norwich, Norfolk, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Basildon, Essex, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Watford, Hertfordshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Ipswich, Suffolk, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Chelmsford, Essex, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Cambridge, Cambridgeshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Peterborough, Cambridgeshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Senior Infrastructure Support Engineer

Hiring Organisation
Nscale
Location
Hemel Hempstead, Hertfordshire, UK
Employment Type
Full-time
Practical experience with GPU drivers and GPU logs investigation tools, e.g. nvidia-smi. Performance diagnostics using NCCL on large scale clusters. Observability and incident response. Build and use alerting stacks and dashboards, interpret metrics and alerts, and drive runbooks to resolution; contribute to SLOs and post‐incident reviews. Strong ...

Platform Engineer

Hiring Organisation
Darktrace
Location
Cambridge, England, United Kingdom
meets their requirements while promoting scalable application practices, Utilizing your expertise in container orchestration tools to facilitate the seamless deployment of applications, prioritizing reliability, observability, and performance, Staying current with the latest technologies, introducing innovative solutions to the team, and addressing operational challenges through strategic automation and optimization efforts. What ...

Senior Engineer

Hiring Organisation
Stackstudio Digital Ltd
Location
Shefford, Bedfordshire, South East, United Kingdom
Employment Type
Contract
Contract Rate
From £450 to £500 per day
design, scalability, fault tolerance, and performance optimization. Demonstrated ability to own projects end-to-end and mentor junior engineers. Significant experience with production infrastructure, observability, and incident management. Strong collaboration and communication skills across disciplines and teams. Clear understanding of engineering best practices and architectural principles. Desirable Skills/Knowledge ...

Platform Engineer

Hiring Organisation
SoCode Recruitment
Location
Cambridge, England, United Kingdom
removing single points of failure and enhancing autoscaling, high availability and managed service usage • Collaborate with SRE, Security and Engineering teams to strengthen observability, monitoring and alerting using Prometheus, Grafana and CloudWatch • Work closely with Security to embed best practice for IAM, secrets management, WAF and cloud posture management • Optimise … Kubernetes operations on AWS including cluster scaling, deployment automation and monitoring • Solid background in Linux administration, networking and cloud security principles • Familiarity with observability tools such as Prometheus, Grafana and Loki along with structured alerting practices • Experience with database migrations, high availability configurations, backups and disaster recovery • Strong scripting ...

Senior Engineering Manager

Hiring Organisation
Method Resourcing
Location
St. Albans, Hertfordshire, England, United Kingdom
Employment Type
Full-Time
Salary
£100,000 - £110,000 per annum
Shape engineering culture: reliability, ownership, automation, and continuous improvement. Manage budgets, supplier relationships, and resource planning. Ensure modern, efficient practices across CI/CD, observability, security, and cloud operations. What You Bring Proven leadership of engineering teams within scaling or transforming environments. Strong technical background in distributed systems, cloud-native ...

Senior Engineering Manager

Hiring Organisation
Method-Resourcing
Location
St. Albans, Hertfordshire, South East, United Kingdom
Employment Type
Permanent
Shape engineering culture: reliability, ownership, automation, and continuous improvement. Manage budgets, supplier relationships, and resource planning. Ensure modern, efficient practices across CI/CD, observability, security, and cloud operations. What You Bring Proven leadership of engineering teams within scaling or transforming environments. Strong technical background in distributed systems, cloud-native ...

Principal Software Development Engineer - ML Engineering - Personalisation

Hiring Organisation
Tesco
Location
Welwyn Garden City, England, United Kingdom
engineering, including MLOps and model lifecycle management. • Evaluate technology choices, lead innovation through PoCs, and accelerate ML time-to-production. • Lead Operational Excellence by improving observability, performance and reliability of ML/AI systems. • Communicate effectively with senior stakeholders and mentor teams to solve critical challenges. You will need You bring ...

Cyber Security Engineer

Hiring Organisation
MBDA
Location
Stevenage, Hertfordshire, England, United Kingdom
Employment Type
Full-Time
Salary
£50,000 - £60,000 per annum
evolving challenges of the cyber threat landscape. Key responsibilities include: Act as the subject matter expert (SME) for Splunk across all cyber security and observability use cases. Lead SOC automation initiatives using scripting and SOAR tools, optimising processes through AI and ML technologies. Support alert tuning, connectivity, and visibility across ...

Cyber Security Engineer

Hiring Organisation
MBDA
Location
Stevenage, Hertfordshire, South East, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£60,000
evolving challenges of the cyber threat landscape. Key responsibilities include: Act as the subject matter expert (SME) for Splunk across all cyber security and observability use cases. Lead SOC automation initiatives using scripting and SOAR tools, optimising processes through AI and ML technologies. Support alert tuning, connectivity, and visibility across ...

Senior Software Engineer – Security Platforms

Hiring Organisation
Arm
Location
Cambridge, England, United Kingdom
automation, and downstream integrations. Collaborate with security and engineering teams to define metrics, alerts, and dashboard views that surface critical trends and anomalies. Instrument observability and performance monitoring (metrics, dashboards) to ensure maximum throughput and reliability. Develop custom solutions for aggregating BOMs into hierarchical system views and conducting searches across … databases. Clear technical writing to document data schemas, APIs, and dashboard usage. “Nice to Have” Skills and Experience Experience with Grafana, Prometheus, or similar observability platforms. Familiarity with SAST and SCA tools (e.g., Coverity, Black Duck) and experience understanding their findings. Experience defining and visualizing key security and performance metrics ...

Senior Product Manager - Edge Compute Platforms

Hiring Organisation
Tesco
Location
Welwyn Garden City, England, United Kingdom
used by Tesco Technology across our offices, stores and distribution centres. This encompasses multiple domains: connectivity, end-user computing, CI/CD and observability. This includes both 3rd party and internally developed infrastructure applications and infrastructure that support the wider Tesco business. This is a rare opportunity ...

Remote - Infra Consultant (GKE Expert) | UK

Hiring Organisation
Quantiphi
Location
Peterborough, Cambridgeshire, UK
Employment Type
Full-time
running a comprehensive review of customer's existing GKE implementation across following key dimensions: Cluster Architecture, Tenancy, Networking/Connectivity, Security, Operations and Observability, Automation and CI/CD, Cost Management and Billing, Testing ...

Remote - Infra Consultant (GKE Expert) | UK

Hiring Organisation
Quantiphi
Location
Stevenage, Hertfordshire, UK
Employment Type
Full-time
running a comprehensive review of customer's existing GKE implementation across following key dimensions: Cluster Architecture, Tenancy, Networking/Connectivity, Security, Operations and Observability, Automation and CI/CD, Cost Management and Billing, Testing ...