651 to 675 of 1,304 Observability Jobs

Founding Engineer

Hiring Organisation
RedTech Recruitment Ltd
Location
Cambridge, Cambridgeshire, East Anglia, United Kingdom
Employment Type
Permanent
Salary
£95,000
develop high-quality frontend interfaces that make complex AI outputs intuitive and actionable for users Build and maintain deployment pipelines, testing frameworks, monitoring, and observability systems Design and implement secure data pipelines with appropriate access controls and auditability Ensure the platform meets enterprise-grade security and compliance requirements ...

Quantitative Developer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
keep: research workflows, client-reporting drafts, commentary support, meeting prep. Write production code that other quants want to build on. Own reliability, testing, and observability for what you ship. Mentor teammates on effective AI-augmented engineering practice. Carry out other duties as assigned. What to Expect When You Join ...

Senior Software Engineer I (Android) London, UK

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Data and Product to interpret results, and iterate based on real user behaviour. Quality & Reliability: Maintain high standards for testing, crash‐free sessions and observability, and contribute to incident investigation and prevention. Qualifications Experience: 4+ years of Android engineering experience building and shipping consumer products in Kotlin. Architectural Depth: Comfortable ...

Head of Application Operations

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
driving effective RCAs. Strong Problem Management and RCA facilitation with a track record of implementing preventative actions that reduce operational risk. Proficient with observability and ITSM tooling to enable proactive monitoring, SLO/SLA definition and data‐driven operational dashboards. Strong people leadership with experience organising teams for fast execution ...

Director, Solutions Engineering Splunk UKI

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
within the UKI region. Experience working across multiple customer segments (Enterprise, Public Sector, Service Provider, Commercial). Strong domain expertise in enterprise software (e.g., Cybersecurity, Observability, Cloud & AI, IT Operations, Application Performance Management, or Big Data). Exceptional communication and articulation skills; ability to translate complex technical ideas into clear business ...

Systems Engineer, MAJESTIC

Hiring Organisation
HII Mission Technologies Division
Location
Springfield, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
obtain a CI Poly Proven ability to architect large-scale, mission-critical software systems in DoD/IC environments Deep background in security, observability, scalability, and resilient system design Experience leading technical execution in programs with ATO, RMF, or classified deployment milestones Strong ability to translate mission objectives into implementable ...

Senior Network Engineer (Fortinet / Cisco)

Hiring Organisation
Vaco LLC
Location
Irving, Texas, United States
Employment Type
Permanent
Salary
USD 150,000 Annual
Enterprise Network Access Control/Policy Enforcement Enterprise Wireless Management - Supporting Ruckus Wireless Solutions for Large-Scale WiFi Infrastructure Network Monitoring/Observability - Utilizing SolarWinds for Network Monitoring/Performance Management/Reporting/Operational Visibility VPN/Secure Connectivity - Designing/Supporting Site-to-Site IPsec VPN/ ...

Senior Software Engineer

Hiring Organisation
Jobleads-UK
Location
United Kingdom
INSHUR engineers work by contributing to squad, collective, or discipline-level initiatives, especially those advancing AI-augmented engineering practices across the organisation. Own Observability: You'll ensure systems stay healthy and visible by identifying monitoring gaps and independently managing escalations, building confidence that your area runs smoothly. Collaborate Across Functions ...

Site Reliability Engineer (AWS)

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
place for you. What You'll Do Own reliability – Maintain and improve our AWS infrastructure using Terraform, bringing your expertise and best practices Champion observability – Partner with developers to implement effective monitoring, logging, and tracing strategies Strengthen security – Work closely with the CISO to implement security best practices and ensure … compliance Optimise costs – Monitor cloud spend and implement FinOps best practices Maintain CI/CD pipelines – Implement and maintain reliability and observability aspects of GitHub workflows and deployment pipelines Incident response – Lead incidents, run blameless post-mortems, and drive continuous improvement Enable developers – Mentor teams on SRE and observability practices ...

Monitoring & Observability Engineer

Hiring Organisation
COMPUTACENTER (UK) LIMITED
Location
London, United Kingdom
Employment Type
Permanent
Salary
GBP Annual
Life on the team Location: UK Wide At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany ...

Network Reliability Engineer – Observability & Automation

Hiring Organisation
Jobleads-UK
Location
United Kingdom
Genesys is seeking a Network Engineer for Operations Reliability in the United Kingdom. The role focuses on maintaining the reliability, stability, and performance of enterprise network services, including LAN, WAN, and cloud connectivity. Candidates should ...

Solutions Architect - AI, Observability & Security Presales

Hiring Organisation
Jobleads-UK
Location
United Kingdom
Elasticsearch B.V. is looking for a Solutions Architect to serve as a technical authority and trusted advisor. This role involves understanding customer goals, guiding sales efforts, and building relationships. The ideal candidate should have a ...

Staff Site Reliability Engineer - Cloud

Hiring Organisation
Jobleads-UK
Location
Newcastle upon Tyne, England, United Kingdom
Newcastle: UK - London: UK - Leeds. Time type: Full time. Posted on: Posted Today. Job requisition id: R55272. Elevate Global Operations as our Next Cloud Site Reliability Engineer (Observability Expert)! Trimble is an industrial technology company transforming the way the world works by delivering solutions that enable our customers to thrive. We create technologies … progress with connected hardware and software solutions. What Makes This Role Great: In this role, you will be the primary architect of our Observability Centre of Excellence, directly influencing the reliability and uptime of global platforms that keep world industries moving. Key Exciting Responsibilities: Lead a global "OTel First" strategy ...

DevOps Engineer

Hiring Organisation
Twinstream Limited
Location
Bristol, United Kingdom
Employment Type
Contract
Contract Rate
£500 - £600/day
container services and AMQP messaging. Working closely with feature delivery teams, you’ll help drive reliable production releases, maintain CI/CD pipelines, improve observability and ensure systems continue to meet demanding SLA/SLO targets. This is an excellent opportunity for a seasoned engineer who enjoys solving complex operational … promote releases into production efficiently and safely Maintaining highly available services using real-time monitoring and system metrics Building and improving monitoring, alerting and observability capabilities Investigating alerts and incidents, implementing preventative and remedial actions Working with customer stakeholders to coordinate releases and evolving service requirements Driving automation to reduce ...

Platform Engineer (Azure & Kafka)

Hiring Organisation
Digital Waffle
Location
United Kingdom
data movement Support and evolve data platforms (Databricks ideal) Build and maintain data pipelines (batch + streaming/ETL/ELT) Improve platform reliability, observability, and performance Collaborate with engineering teams to improve developer experience Requirements Strong Azure cloud experience Background in Platform Engineering, DevOps, or SRE Strong experience with … Strong understanding of data pipelines and distributed systems Focus on automation, scalability, and reliability Nice to Have Lakehouse or large-scale data platform experience Observability tooling (Datadog, Grafana, Prometheus) SaaS/high-growth product experience Strong developer experience mindset ...

Azure DevOps Engineer (Kafka)

Hiring Organisation
Digital Waffle
Location
United Kingdom
data movement Support and evolve data platforms (Databricks ideal) Build and maintain data pipelines (batch + streaming/ETL/ELT) Improve platform reliability, observability, and performance Collaborate with engineering teams to improve developer experience Requirements Strong Azure cloud experience Background in Platform Engineering, DevOps, or SRE Strong experience with … Strong understanding of data pipelines and distributed systems Focus on automation, scalability, and reliability Nice to Have Lakehouse or large-scale data platform experience Observability tooling (Datadog, Grafana, Prometheus) SaaS/high-growth product experience Strong developer experience mindset ...

AI Native Software Engineer

Hiring Organisation
TekWissen UK
Location
London Area, United Kingdom
invocation, and policy‐based routing Build cloud‐native backend services and APIs to support AI‐driven applications and enterprise integrations Implement evaluation, monitoring, and observability frameworks to ensure accuracy, latency, reliability, and system health across AI agent lifecycles Optimize AI and system performance across cost, scalability, and latency dimensions … Frameworks: LangGraph, AutoGen, CrewAI (or similar) Cloud & DevOps Tooling: Docker, Kubernetes, Terraform, Helm, CI/CD pipelines Enterprise Integration: APIs, enterprise platforms, monitoring and observability tools Why You’ll Love This Role Build real, enterprise‐grade AI systems that move beyond experimentation into production Remain deeply technical ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Birmingham, England, United Kingdom
pipelines to facilitate smooth deployments and automate workflows. Collaborate with development teams to establish best practices in system architecture, deployment, and monitoring. Implement observability solutions to gain insights into system performance and user experience. Participate in on-call rotations to respond to system alerts, perform root cause analysis, and implement … code tools (Terraform, Ansible, etc.) for automating deployments. Proficiency in scripting and programming languages such as Python, Go, or Bash. Familiarity with monitoring and observability tools (Prometheus, Grafana, ELK stack). Excellent problem-solving skills and the ability to work effectively in high-pressure situations. Health Care Plan (Medical, Dental ...

OSS/BSS Solution architect

Hiring Organisation
IBU CONSULTING LTD
Location
London, United Kingdom
Employment Type
Contract
async patterns - Deep understanding of TMF SID (Product/Service domains) and TMF Open APIs used across BSS/OSS HIGHLY DESIRABLE Cloud Services & Observability - Experience designing for multi-cloud, hybrid and on-prem environments - Skilled in cloud-native patterns (PaaS, SaaS, serverless, containers, orchestration) - Cloud platforms … commonly used IaaS/PaaS services - Proficient with observability frameworks (Prometheus, Elasticsearch, Grafana, OpenTelemetry) for metrics, logs and traces Platform Technologies & BSS Vendor Platforms - Open source workflow engines (Temporal) - Stream/batch processing (Flink, Spark) - Test automation: Robot Framework and BDD - can review BDD robot files - BSS vendor platforms: Salesforce ...

Artificial Intelligence (AI) DevOps

Hiring Organisation
WTW
Location
Greater London, United Kingdom
Employment Type
Full Time
Role The responsibilities will include: Help to design, build, and maintain AI‐augmented DevOps pipelines, integrating LLM‐powered tooling, automated testing, code generation, observability, and environment provisioning. Develop automation for operational workflows (permissions, tagging, remediation tasks, infrastructure housekeeping, monitoring pipelines) Help to build foundational components that allow delivery teams … necessary any and all of the security processes required for operational suitability within WTW for solutions (including SAST and DAST processes) Ensure operational stability, observability, and controlled evolution of AI and agentic systems for the ICT Consultancy business Maintain & support AI tools and AI based systems once deployed and help ...

Principal AI Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
data models, service integrations, and internal tools. Architect systems using modern cloud patterns such as microservices, event‐driven design, and managed services, ensuring reliability, observability, and scalability. Provide architectural leadership across integrations with enterprise systems and third‐party platforms. ML Ops, Reliability & Engineering Best Practices Productionize AI and machine learning … solutions using modern ML Ops and software engineering practices. Establish standards for testing, deployment, observability, drift detection, retraining, and documentation. Drive quality, automation, and performance in systems where accuracy, resilience, and reliability are critical. Leadership, Mentorship & Execution Serve as a hands‐on technical leader and player‐coach, mentoring engineers while ...

Kubernetes Linux AIOps Engineer – Elite Quant Hedge Fund

Hiring Organisation
Winston Fox
Location
City of London, London, United Kingdom
Infrastructure DevOps Engineer/SRE with expertise in Kubernetes, Linux, Observability, IaC and AIOps sought by a market-leading Quantitative Hedge Fund to aid further business growth. Our client is one of the World's Elite Quant Hedge Fund Managers with large-scale, massively Distributed Systems, and ample opportunity … Terraform, C...) Must be able to write high quality Automation/scripts from scratch. Configuration Management Tools (Ansible/Puppet/Kapitan/Terraform....) Observability: Experience within the modern open-source ecosystem (ELK, OpenTelemetry, LGTM stack, Prometheus, Grafana, Loki...) CI/CD and GitLab/GitOps: working with Development teams. ...

Site Reliability Engineer (Kubernetes / Multi-Cloud) UK Based

Hiring Organisation
Jobleads-UK
Location
Hereford, England, United Kingdom
Cluster Autoscaler, KEDA, Karpenter) Help improve workload reliability and performance Support networking, identity, compute, and storage services Assist with maintaining secure and scalable environments Observability & Monitoring Work with Prometheus, Grafana, OpenTelemetry, Azure Monitor, and CloudWatch Build dashboards, alerts, and logging/tracing pipelines Support monitoring aligned to SLIs/SLOs … networking, and scaling Cloud Experience with Azure and/or AWS Familiarity with networking, IAM, and core services Infrastructure as Code Experience with Terraform Observability Familiarity with monitoring/logging tools (Prometheus, Grafana, Loki) Other Technical Skills Helm Charts/Kustomize creation and maintenance Containers (Docker) Exposure to both Azure ...

AI Engineering Enablement Director

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
/ML, software, or platform engineering, with exposure to automated testing and infrastructure‐as‐code or policy‐as‐code. Working knowledge of AI observability (logs, metrics, traces, behavioural signals) and practical methods to evaluate or improve AI system behaviour. Familiarity with AI risk and governance frameworks (e.g., NIST … FinOps, such as cost‐aware model selection, unit economics, or prompt‐efficiency practices. Experience with MLOps or AI delivery tooling, or with AI‐specific observability systems. Participation in industry communities or standards bodies, with the ability to translate external practice into internal adoption. Experience facilitating workshops or engineering enablement events. ...

Sr. Distinguished Machine Learning Engineer (Remote-Eligible)

Hiring Organisation
Capital One
Location
McLean, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Sr. Distinguished Machine Learning Engineer (Remote-Eligible) Overview: At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an industry leader in using machine ...