851 to 875 of 1,213 Permanent Observability Jobs

Senior Software Engineer - DevOps

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Implementing infrastructure as code and improving automation across environments Troubleshooting and resolving complex build, deployment and production issues across application and infrastructure layers Improving observability, reliability and performance of internal platforms and production systems Partnering with engineering teams to define best practices for deployment, release management and cloud architecture Contributing … principles Experience working with GitHub, including workflow automation and repository management Experience with infrastructure as code and automated environment management Strong understanding of reliability, observability and operational best practices Ability to debug complex systems and work effectively across multiple engineering teams Why Deliveroo Our mission is to transform ...

AI Engineer

Hiring Organisation
VIA MATCH LIMITED
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £110,000 per annum
Doing Designing and building production-grade AI systems that integrate LLMs, RAG pipelines, vector databases, and agentic frameworks Creating evaluation and observability frameworks to measure, monitor, and continuously improve system performance, accuracy, and reliability Implementing and maintaining retrieval systems, including ingestion pipelines, chunking strategies, and advanced techniques such as HyDE … with hands-on fine-tuning experience Familiarity with real-time streaming, multimodal models, or search technologies such as Elasticsearch Experience with model observability tools such as LangSmith or Weights & Biases Background in a regulated or specialised vertical (financial services, healthcare, energy, legal, retail), with an understanding of compliance, security ...

Senior Full Stack Java Developer (Legacy Modernization & Cloud Migration)

Hiring Organisation
Vaco LLC
Location
Charlotte, North Carolina, United States
Employment Type
Permanent
Salary
USD Annual
database solutions (Oracle, SQL Server, PostgreSQL, MongoDB). Collaboration: Work closely with cross-functional teams to ensure seamless integration, data integrity, and system observability (Sumo Logic). Technical Leadership: Mentor junior developers, review code, and contribute to best practices for Java, Angular, and DevOps. Required Skills & Experience 5+ years … e.g., VB to Java, on-prem to cloud). Languages: Fluent in English and Spanish. Nice-to-Have Experience with Sumo Logic or similar observability tools. Familiarity with Windows Server to Linux migrations. Basic Python scripting for automation. What We Offer Remote-first work environment. Competitive salary and benefits. Opportunities ...

Forward Deployed Engineer (FDE), Customer Solutions

Hiring Organisation
DaVinci Commerce
Location
City of London, London, United Kingdom
grade AI agents for commerce and BrandStore use cases. Implement orchestration logic, state management, workflow automation, and service integrations. Optimize AI agent performance, reliability, observability, and fault tolerance. Support hosting, deployment, monitoring, and debugging of mission-critical customer-facing systems. Customer Onboarding & Training Lead technical onboarding sessions for enterprise customers … cloud infrastructure and deployment environments (AWS/GCP/Azure). Familiarity with databases, authentication systems, queues, monitoring, and distributed systems. Ability to use observability and monitoring tools to diagnose production issues in AI deployments. Experience with frontend/UI prototyping frameworks is a plus. Familiarity with MCP (Model Context ...

Software Engineering Manager - Platform

Hiring Organisation
Jobleads-UK
Location
City of Westminster, England, United Kingdom
Engineering Excellence and creating culture of innovation Leading the team in adopting a Site Reliability Engineering (SRE) mindset Ensuring a holistic approach to include observability and key metric dashboards for all deliverables Ensuring cross‐functional requirements are adhered to within the team, e.g. Performance etc. Making use of Platforms … support of existing software systems, ensuring prompt resolution of issues and bugs. Tech Stack React, Next.js, Typescript Java Kotlin Swift GraphQL Federation Cloud: Azure Observability: Dynatrace Who You Are Previous polyglot hands‐on senior software engineer Experience working on highly scalable software solutions across web or backend Extensive background ...

Platform Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
England, United Kingdom
provision of tooling for our support organisation Apply ITSM frameworks: Incident, Major Incident, Change Management, and service improvement. Maintain and enhance Radiant’s observability stack: Prometheus, Grafana, and custom monitoring integrations Operate and support services in 24x7 production environments, including on-call rotation Contribute to Incident postmortem analyses, root cause … DHCP, VLANs, routing, switching Strong experience with API interrogation Strong experience with infrastructure scripting and automation (Bash, Python, Ansible) Deep understanding of observability principles and tools (Prometheus, Grafana preferred) Strong grasp of ITSM and service operation best practices Excellent communication and mentorship skills Comfortable interfacing with internal stakeholders and external ...

Principal Platform Engineer (Edge)

Hiring Organisation
Jobleads-UK
Location
Bristol, England, United Kingdom
peer coordination, and systems that can operate independently during disconnection. You are the sort of engineer who thinks carefully about failure modes, deployment risk, observability, workload lifecycle, service discovery, and operational simplicity. You care about building robust abstractions that allow application teams to securely deploy workloads without needing to understand … limited connectivity. Working on security baselines for edge nodes, including secure boot, hardware‐rooted identity, attestation, and the runtime isolation of workloads. Building observability, logging, and telemetry capabilities that work when bandwidth is scarce and devices are intermittently reachable. Designing zero‐touch onboarding and provisioning flows so devices come online ...

Senior Onboarding Engineering | 6 month Contract

Hiring Organisation
Novatus
Location
London Area, United Kingdom
Novatus is a Series B scale-up RegTech SaaS provider and boutique advisory practice, enabling financial services firms to solve complex challenges and redefine what’s possible through expert-led technology and consulting. Across both ...

Senior SRE Lead

Hiring Organisation
Albany Beck
Location
London Area, United Kingdom
about capability build, technical excellence, and delivering meaningful change within complex enterprise environments. Role Overview Albany Beck is seeking a Senior SRE Lead/Observability SME to lead the establishment of a new enterprise Site Reliability Engineering (SRE) capability, with a primary focus on designing and implementing a modern observability … suite and operational resilience framework. This is a foundational build role, responsible for defining how reliability engineering and observability are structured, measured, and embedded across a complex global technology estate. The successful candidate will play a key role in shifting the organisation from reactive operational support to a metrics-driven ...

Senior SRE – Electronic Trading Observability Lead

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
global financial services firm in Greater London is seeking a Senior Software Engineer/SRE to focus on ensuring observability and resilience for its Electronic Trading systems. You will drive reliability initiatives, develop frameworks for tracking metrics, and collaborate on system health reports. The ideal candidate has a strong background ...

Platform Engineering Lead — AWS, Kubernetes, Observability

Hiring Organisation
Jobleads-UK
Location
City Of London, England, United Kingdom
opportunities for professional growth, including leadership training. Ideal candidates should have key technical skills in AWS, Terraform, and Kubernetes, along with a passion for observability and Linux. An exciting chance for those looking to make an impact within a collaborative team environment. ...

Field CTO EMEA

Hiring Organisation
Jobleads-UK
Location
Maidenhead, England, United Kingdom
Engineering, platform teams, and business stakeholders. Translate customer business goals into compelling transformation strategies powered by Dynatrace. Lead high-impact technical discovery and executive conversations around observability, cloud modernization, AI adoption, security, automation, and business outcomes. Shape account strategy with Sales and Solution Engineering teams for complex, multi-stakeholder deals. Develop board-level … executive-level narratives that connect platform capabilities to risk reduction, operational excellence, digital experience, and growth. Guide customers on modern observability and security operating models, including platform engineering, SRE, DevSecOps, and AI-assisted operations. Support large opportunities by validating architecture direction, differentiation, value realization, and long-term platform vision. Influence go-to-market ...

Senior AI Solutions Architect, Pre-Sales & Integration

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
experimenting with cutting‐edge technologies. Preferred Qualifications Advanced Integration – Experience integrating Salesforce with external agents via APIs and open standards (MCP, A2A). Governance & Observability – Familiarity with prompt governance, observability, monitoring frameworks, responsible AI and compliance best practices. Cross‐Platform Background – Background in cross‐platform integrations (e.g., Hyperscaler SDKs ...

Principal AI Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Agentic ecosystem, responsible for the high‐level design choices that define how agents run at PhysicsX. You will cover topics such as: Agent Observability: Own the implementation to enforce deep tracing, granular cost tracking, and observability across the lifecycle. Agent Deployment: Deliver an intuitive deployment lifecycle which simplifies questions around … behalf of users in a regulated enterprise environment. The Tech Stack Core Platform: Python (Primary), Go or TypeScript (Secondary), Kubernetes, Docker, Terraform. Observability & Evals: OTel, LangSmith, Arize, Braintrust. Who You Are An Architect at Heart: You have strong, reasoned opinions on Durable Execution vs. Standard Async, Vector Search vs. Keyword ...

Head of Engineering

Hiring Organisation
Xapien
Location
London Area, United Kingdom
leads who own architectural decisions within a domain-driven design structure. ● Establish engineering-wide standards for code quality, review processes, and technical governance. ● Build observability, incident management, and on-call practices that scale with team growth and deployment frequency. ● Embed DevOps, MLOps, security, and compliance practices into … Series A/B). ● Technical Credibility: Strong background in cloud-native architectures, distributed systems, and modern delivery practices (CI/CD, automated testing, observability). Experience with cloud cost management and infrastructure optimisation. ● Operational Maturity: Experience building observability, on-call rotations, and incident management practices as engineering organisations scale ...

Azure SRE Engineer

Hiring Organisation
Oscar Associates (UK) Limited
Location
Glasgow, Lanarkshire, Scotland, United Kingdom
Employment Type
Part Time
Salary
£575 - £625 per day
Contract We're looking for two experienced Azure Site Reliability Engineers to join a major Financial Services programme focused on platform health, reliability, and observability across a large-scale Azure environment. You'll be responsible for building and maintaining Azure platform health infrastructure using Terraform, developing Python-based automation … integrations, and implementing SLOs/SLIs across infrastructure and application layers. The role also involves working with observability tooling, event-driven integrations, and Azure-native services in a highly collaborative environment with engineering and product stakeholders. Required experience: * Strong hands-on Azure engineering experience * Terraform in production environments (primary ...

Agentic AI Data Architect

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
ModelOps - Azure AI Foundry (model hosting, versioning, monitoring); Evaluation frameworks (LLM-as-judge, test datasets); Prompt/version control, cost/latency monitoring DevOps & Observability - CI/CD pipelines (Azure DevOps/GitHub Actions); Logging, monitoring, observability (App Insights, etc.); Performance tuning and scalability As part of a leading global ...

Senior Platform Engineer

Hiring Organisation
REALM
Location
United Kingdom
building and owning the production infrastructure for a multi-user distributed system from the ground up. That means designing for debuggability and observability from day one, not bolting it on later. Core remit includes scalable multi-environment Terraform, secrets management, gradual deployment practices (blue/green), and the ability … testing. An AI/multi-agent infrastructure component is on the near-term roadmap. The stack IaC Terraform + Terragrunt Helm/Kubernetes AWS Observability Prometheus/Grafana Auth0 Rust/Golang NoSQL What they're looking for Production experience with Terraform, Helm/Kubernetes, AWS networking, and debugging multi ...

Data Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
work from home 2 days per week. This is a high-impact role focused on improving data quality, reducing incidents, and building scalable observability across a modern enterprise data platform. You’ll help ensure data across the organisation is accurate, reliable, and trusted for critical business decision-making. … style roles, with strong SQL and Python skills and experience working in modern cloud-based data environments. Hands‐on experience with data observability tools such as Grafana, Monte Carlo, or Acceldata, and data governance/quality platforms like Informatica, Collibra or Microsoft Purview is highly desirable. Experience within the Azure ...

DevOps Engineer

Hiring Organisation
Prism Digital
Location
Cambridge, England, United Kingdom
across two regions, with the plan to bring the second along the same path over time. They also need someone to introduce proper observability and monitoring - knowing when things aren't running, alerting the right people, and building the kind of visibility that lets the team respond rather than react. … session host infrastructure (being deprecated over the next 12-18 months) CI/CD tooling and working with the development team Observability and monitoring tooling - currently limited; you'd shape this Disaster recovery architecture MFC C++, .NET services, Angular front-end (context for the broader dev estate) Nice to Haves ...

Senior Software Engineer / SRE - Electronic Trading

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
Senior Software Engineer/SRE - Electronic Trading Location London Business Area Engineering and CTO Ref # 10050148 Description & Requirements About Observability Engineering Senior Software Engineers - SRE in Electronic Trading (ET) ensure our global enterprise products spanning fixed income, equities, and derivatives are resilient and observable. This role focuses on building … culture and platforms of observability and resilience to prevent market disruptions for global traders. We specialize in proactive anomaly detection, providing advanced performance insights and best practice guidance. Our team collaborates with application developers to define meaningful SLOs, implement chaos engineering, and build diagnostic tools that mitigate architectural risks ...

Technical Lead

Hiring Organisation
Findrs
Location
Aylesbury, England, United Kingdom
technical contribution. The Role As Software Tech Lead, the successful candidate will define and drive the overall software architecture across backend services, APIs, observability systems, data infrastructure, and cloud integrations. Working closely with product and engineering leadership, they will translate complex deployment requirements into scalable technical solutions while ensuring … part in setting technical standards across the business. From API consistency and schema evolution through to CI/CD practices, security baselines, and observability frameworks, the successful candidate will help establish the engineering foundations that support long term scale. On the backend, the role will involve guiding and contributing ...

Snowflake Data Cloud Architect

Hiring Organisation
Talent Software Services
Location
New York, United States
Employment Type
Permanent
Salary
USD 180,000 Annual
upstream and downstream system interoperability. Data Governance and Compliance: Implement RBAC, data masking, and encryption aligned with enterprise data policy. Ensure lineage and observability for regulatory reporting and audit. Technical Leadership: Act as a trusted advisor for architectural decisions and future-state roadmaps. Prepare technical specifications and design documentation. Innovation ...

Senior AI Product Engineer

Hiring Organisation
Jobleads-UK
Location
York and North Yorkshire, England, United Kingdom
awareness Use tools such as DSPy (or similar) for optimisation and evaluation Deploy and operate services using Azure (OpenAI, Web Apps/Functions) Implement observability (Application Insights) and CI/CD (Azure DevOps) Contribute to infrastructure via Terraform Build high‐quality, async Python services with strong testing (pytest) Collaborate with … similar) Strong API and data modelling skills Experience with async Python Experience with Azure environments and CI/CD pipelines Familiarity with Terraform and observability tooling Minimum Qualifications Degree or equivalent Right to work in the country of employment Integrity and Ethics All StarCompliance employees are expected to commit ...

Principal Full Stack Engineer & Architecture Lead

Hiring Organisation
BCT Resourcing
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £90,000 per annum
technical design decisions * Define scalable, secure, and maintainable engineering standards * Provide technical leadership across frontend, backend, APIs, infrastructure, and integrations * Drive platform scalability, resilience, observability, and performance * Partner with leadership teams to align technical strategy with business goals * Act as the senior technical authority for complex engineering decisions Hands-On Engineering … Lambda, API Gateway, EventBridge, SQS, Step Functions, S3, CloudWatch, RDS); Backend: Node.js, TypeScript; Frontend: React, Next.js, Tailwind CSS; Data & Architecture: PostgreSQL, Serverless, Event-Driven Microservices; DevOps & Observability: Terraform/AWS CDK, CI/CD, Monitoring & Logging. About You: We are looking for a technically strong and commercially minded engineering leader with: * 10+ years of software ...