1,051 to 1,075 of 1,200 Permanent Observability Jobs

Monitoring & Observability Engineer

Hiring Organisation
COMPUTACENTER (UK) LIMITED
Location
London, UK
Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and modern operations. As a Monitoring & Observability Engineer, you'll work in ...

Head of Support & Service Reliability Engineering

Hiring Organisation
Jobleads-UK
Location
Guildford, England, United Kingdom
transition from a single-tenant architecture to a multi-tenant SaaS platform, requiring a fundamental shift from reactive ticket handling to systemic reliability, observability, and customer experience management at scale. You will own the end-to-end operational integrity of the platform, ensuring availability, performance, and customer trust, while partnering …/P2). Drive improvements in: Mean Time to Detect (MTTD), Mean Time to Resolve (MTTR). Ensure clear, consistent internal and external communication during incidents. Observability & Monitoring: Define and implement a comprehensive observability strategy, including technical telemetry (infrastructure, application, APIs), business telemetry (transactions, payment success rates, usage), end-to-end customer ...

Senior DevOps Engineer

Hiring Organisation
17918
Location
Telford, Shropshire, United Kingdom
Observability Engineer (SC Eligible). Rate: £580/day, Inside IR35. Duration: 6 months. Location: Mostly remote (Telford, occasional onsite, 2 days/month). Clearance: SC Eligible. Role Overview: We are seeking an experienced Observability Engineer to design, implement, and support enterprise-grade monitoring and observability solutions across complex technology environments. ...

AI/ML Engineer

Hiring Organisation
PRACYVA
Location
Edinburgh, UK
Summary: Responsible for automating model deployment, ensuring version control, monitoring inference systems, and safely integrating AI agents into production with strong observability and rollback capabilities. Responsibilities: • Automate ML model deployment workflows • Manage model versioning and release processes • Monitor inference cost, latency, and drift • Integrate AI agents safely into production systems … Implement observability, alerting, and rollback mechanisms. Required Experience: 7+ years in ML Engineering/AI Engineering. Required Skills: • Model deployment automation • Model version control • Monitoring inference cost, latency & drift • Production integration of AI/LLM agents • Observability & rollback systems. Preferred Skills: • Experience with containerized deployments • Familiarity with CI/ ...

MongoDB Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Knutsford, England, United Kingdom
clusters, sharding, replica sets, backups). Troubleshoot and resolve complex production issues across L1-L3. Build automation using Python, Ansible, TDD, Agile. Improve observability with better monitoring, alerting, and performance insights. Reduce toil by engineering tools and automation that transform the platform. Required Skills: Deep MongoDB administration expertise. Strong … Manager and backup tooling. Solid troubleshooting and production support capability. SRE fundamentals and an automation‐first mindset. Hands‐on Python and Ansible experience. Observability experience (monitoring, alerting, dashboards). Why Apply: Perfect for Senior DBAs wanting to transition into SRE/Engineering. ~25% of your time spent coding ...

Lead Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ownership, quality and continuous improvement. Define and evolve engineering standards, frameworks and best practices across the entire engineering organisation. Drive improvements in software quality, testing strategy, observability and release confidence. Partner with engineering, platform, product and security to deliver large-scale, cross-functional improvements. Shape and deliver internal tooling and AI-assisted engineering workflows … with an engineering-first mindset and influencing people with a broader organisational impact. Previous experience building enterprise-level, highly scalable projects, focused on performance, observability and security best practices. Experience in a technology-driven organisation with strong engineering standards. You’ll also bring: Strong experience improving engineering quality, reliability and operational maturity ...

Service Architect

Hiring Organisation
NineTech
Location
City of London, London, United Kingdom
operations and the integration of service management tools and processes. Proven experience applying service architecture principles to technical initiatives, including cloud migrations and observability platform integrations. Solid understanding of modern technology environments, including cloud platforms, AI infrastructure, and traditional IT estates. Subject matter expertise in at least one technical domain … such as networking, security, observability, or applications is highly desirable. Key Skills & Attributes: Excellent communication skills with the ability to explain complex service and operating models to both technical and non-technical audiences. Comfortable engaging with stakeholders ranging from executive leadership to operational teams. Highly adaptable, with the ability ...

Technical Consultant

Hiring Organisation
Apto Solutions
Location
Bristol, Avon, South West, United Kingdom
Employment Type
Permanent, Part Time, Work From Home
Salary
£30,000
looking for a Graduate Consultant to join our Data practice. You'll work alongside senior colleagues on the design, deployment, and optimisation of monitoring and observability platforms, primarily Splunk and Cribl, helping enterprise clients get real value from their telemetry data. This is a Grade 1 role. The salary range reflects … will start by supporting experienced colleagues before taking on increasingly independent work. Technical Work: Assist in the configuration and deployment of Splunk, Cribl, and observability tooling under the guidance of senior engineers. Support the ingestion and processing of data. Learn to apply parsing logic, data normalisation, and enrichment techniques ...

Principal Engineer - Core Banking Platform

Hiring Organisation
Jobleads-UK
Location
Skipton, England, United Kingdom
lead time, high deployment frequency, low change-failure rate and rapid recovery. You’ll equip teams with paved roads, Golden Path pipelines, strong observability and secure environments that make fast, safe change the norm. You’ll also grow skills across squads, align engineering practices and create a trusted environment where teams … release strategies and resilience-first patterns that allow teams to deliver change safely and confidently as we modernise the platform. * Architect for reliability and observability, shaping API/event contracts, posting and settlement patterns, product and account boundaries and platform dependency baselines. * Shift left on quality and security, embedding contract ...

AI Solution Architect

Hiring Organisation
Jobleads-UK
Location
Royal Tunbridge Wells, England, United Kingdom
building reference implementations and scaling successful patterns. Partnering with security, platform and engineering teams to enable LLMOps and AgenticOps capabilities - prompt lifecycle management, model observability, caching, evaluation and governance. Contributing to the growth of an AI-ready architecture practice within AXA, including knowledge sharing and reference implementation development. Supporting integration … happy to consider flexible working arrangements, which you can discuss with Talent Acquisition. Your skills & experience: Strong understanding of AI Gov Ops and AI Observability market landscape (such as solutions, frameworks and libraries) and architectural approaches to implementation of preventative, detective and corrective AI risk controls. Working knowledge ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Oakland, California, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Milwaukee, Wisconsin, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Birmingham, Alabama, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Indianapolis, Indiana, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Detroit, Michigan, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Lebanon, New Hampshire, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Honolulu, Hawaii, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Baltimore, Maryland, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Anchorage, Alaska, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Tallahassee, Florida, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Houston, Texas, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Hartford, Connecticut, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Virginia Beach, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Savannah, Georgia, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...

Staff AI Engineer

Hiring Organisation
Pair Team
Location
Nashville, Tennessee, United States
Employment Type
Permanent
Salary
USD Annual
product-facing implementations, balancing long-term architecture with rapid iteration. Own complex problem spaces end-to-end - from system design and implementation through observability, evaluation, and continuous improvement in production. Partner closely with product managers, operators, and domain experts to translate complex real-world processes into reliable agent behavior. Establish … operational ownership (beyond simple chat or LLM workflows). Hands-on experience with modern agent ecosystems, including frameworks (e.g., LangGraph, Mastra, Claude Agents SDK), observability/evals tooling (e.g., Langfuse, LangSmith, Braintrust), MCP implementations, and leading AI SDKs (e.g., OpenAI, Anthropic). Strong systems and backend architecture fundamentals, with experience ...