376 to 400 of 1,260 Observability Jobs

Senior Site Reliability Engineer

Hiring Organisation
Veloc Inc
Location
Irving, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Partner with development teams to improve deployment safety, release reliability, and operational scalability. Drive standardization of cloud infrastructure, operational engineering practices, and deployment governance. Observability & Performance Optimization (15% of role) Build and maintain monitoring, logging, tracing, and alerting capabilities across distributed systems. Establish service-level objectives (SLOs), SLIs, and error …/CD pipelines, and Infrastructure as Code tools. Strong scripting and automation skills using Python, Bash, PowerShell, Go, or similar languages. Experience with observability and monitoring platforms such as Datadog, Grafana, Prometheus, or Splunk. Strong understanding of networking, Linux/Windows administration, distributed systems, and cloud-native architectures. Experience with ...

Senior DevOps, Infrastructure & Security Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
vulnerability management, threat modelling, penetration testing, and incident response planning Build and evolve CI/CD pipelines, release management processes, and deployment automation Establish observability, monitoring, logging, alerting, and operational runbooks Manage secrets, key custody, access controls, and infrastructure governance Deliver backup, disaster recovery, and business continuity strategies Drive compliance … Kubernetes, Docker, and containerised application deployment Modern CI/CD platforms including GitHub Actions, Cloud Build, Buildkite, CircleCI, or similar Cloud platforms, ideally GCP Observability tooling including Prometheus, Grafana, OpenTelemetry, or equivalent PostgreSQL operations, backup, recovery, and data durability Identity management, API gateways, networking, and access controls Bash and Python ...

Senior Platform Engineer

Hiring Organisation
Akixi
Location
United Kingdom
inventory structuring, and role-based automation. Manage secrets securely using services such as AWS Secrets Manager or HashiCorp Vault. Implement robust monitoring, alerting, and observability tooling (e.g., CloudWatch, Prometheus, Grafana, Datadog). Participate in incident response, root cause analysis, and resilience improvements. Maintain and evolve CI/CD pipelines using … container orchestration and deployment (Docker, ECS, or Kubernetes). Proficient with GitOps or IaC-based workflows. Familiarity with Google SRE practices, particularly around reliability, observability, and operational excellence. Understanding of systems reliability metrics and associated tooling Soft Skills & Behaviours Self-driven with a bias toward action and ownership. Excellent communicator ...

Senior DevOps Engineer

Hiring Organisation
Novatus
Location
City of London, London, United Kingdom
Responsibilities: Design and implement AWS cloud infrastructure, writing clean and maintainable Infrastructure as Code for repeatable, auditable deployments across environments. Drive reliability, scalability, security, observability and cost efficiency of AWS infrastructure. Provide mentorship to other DevOps engineers, sharing expertise and raising engineering quality across the team. Develop and maintain robust … security, resilience, and cost optimization. Hands-on experience deploying and operating applications on Kubernetes in production, including Helm/manifests, ingress, configuration/secrets, observability, and troubleshooting live workloads. Proficient with Infrastructure as Code, especially Terraform, with experience building maintainable, reusable modules and environment patterns. Solid production networking knowledge ...

Software Engineer

Hiring Organisation
Visa
Location
London, UK
Employment Type
Full-time
related technologies. Collaborate with product and engineering teams to understand requirements and deliver platform capabilities that accelerate their development. Drive best practices for observability, reliability, and scalability in distributed systems. Mentor and support engineers within the team, fostering a culture of technical excellence and continuous improvement. Act as an evangelist … next phase of growth, are written to 12-factor principles and fit into our microservices architecture Cloud-related tools, services, and distributed system observability to support these applications, such as Docker, Kubernetes, Elasticsearch, log management systems, and Datadog APM, to name but a few API specifications, conforming to the OpenAPI ...

Site Reliability Engineer

Hiring Organisation
Jobleads-UK
Location
Greater London, England, United Kingdom
management skills (AWS cloud skills are secondary). Primary Responsibilities: • Work closely with Product Engineering team and implement strategies for modernizing IT operations, enhancing observability and toil reduction. • Architect and deploy observability platforms to monitor system health, performance, and reliability effectively. • Propose & drive strategies for AI-driven alerting and proactive … reliability, automation, and continuous improvement across the organization. Key Skills: • Strong expertise in implementing Site Reliability Engineering (SRE) principles. • Advanced knowledge of establishing observability using tools – Dynatrace & Datadog (primary skills). • Proficiency in automation & scripting using Python & Ansible (primary skills). • Strong experience with cloud platforms – AWS & Azure (primary skills ...

Platform Engineering Manager

Hiring Organisation
Tria
Location
Hampshire, United Kingdom
Employment Type
Permanent
Salary
£70000 - £77000/annum + bonus
evolve Azure platform services and standards Drive automation and Infrastructure as Code practices Contribute to architecture, technical design and engineering standards Support service reliability, observability, security and operational improvements Work closely with engineering, architecture and security teams Participate in technical decision-making and platform strategy Requirements Strong Azure cloud engineering ...

Platform Engineering Manager

Hiring Organisation
Tria
Location
Worcestershire, United Kingdom
Employment Type
Permanent
Salary
£70000 - £77000/annum + bonus
evolve Azure platform services and standards Drive automation and Infrastructure as Code practices Contribute to architecture, technical design and engineering standards Support service reliability, observability, security and operational improvements Work closely with engineering, architecture and security teams Participate in technical decision-making and platform strategy Requirements Strong Azure cloud engineering ...

Agentic AI Architect

Hiring Organisation
Emporia Consulting Group Limited
Location
London, United Kingdom
Employment Type
Contract, Work From Home
Contract Rate
Paying up to £1100 per day
learning in real-world deployments. Experience integrating AI agents into enterprise apps like Salesforce, ServiceNow, SAP, or custom apps via APIs. Understanding of AI observability, performance monitoring, and ethical guidelines in GenAI systems. Package for the Agentic AI Architect, Agentic Artificial Intelligence, Architecture Hybrid position Paying ...

Artificial Intelligence Engineer

Hiring Organisation
Airswift
Location
London Area, United Kingdom
production‐ready tools and clear insights. Implement LLM and agentic workflows (prompt engineering, LangGraph, MCP, tool calling, retrieval, guardrails). Productionise solutions with testing, observability, versioning, documentation, and CI/CD. Must‐Have Skills: Hands‐on Databricks and Spark expertise (PySpark, SQL, Delta, Unity Catalog). Strong data engineering background ...

Azure Site Reliability Engineer

Hiring Organisation
Context
Location
Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£65,000
continuous delivery Solid understanding of DevOps and SRE principles Hands-on experience with Infrastructure as Code (Terraform, Bicep, ARM templates) Experience with monitoring and observability tools Desirable: Azure certifications (or working towards) Experience within a managed services or multi-client environment Remote based. Paying between £55,000-£65,000, depending ...

Software Engineer Lead

Hiring Organisation
V2Soft
Location
Pittsburgh, Pennsylvania, United States
Employment Type
Permanent
Salary
USD Annual
monitoring, and performance Nice to Have Skills: Kafka Streams ksqlDB Kafka Connect Cloud-based Kafka platforms and containerized deployments Familiarity with CI/CD, observability, and security best practices Education: Bachelor required. V2Soft is an Equal Opportunity Employer (EOE). We welcome applicants from all backgrounds, including individuals with disabilities ...

Software Development Engineer in Test

Hiring Organisation
A-Line Staffing Solutions LLC
Location
United States
Employment Type
Permanent
Salary
USD 500 Annual
cases for each feature Collaborate with teams to triage bugs and investigate failures in various environments Monitor and maintain test pipelines, and use observability tools and test logs to identify and fix failures Clearly communicate bugs and proactively initiate discussions to aid issue discovery and resolution Leverage cutting-edge ...

Lead AI Engineer - Media Workflows

Hiring Organisation
Harnham
Location
City of London, London, United Kingdom
Bonus: 10% annual bonus • Working model: Hybrid (2 days per week in Soho, London) • Tech stack: Python, Azure, OpenAI APIs, CI/CD, Docker, observability • Visa sponsorship: Cannot currently sponsor Interested? Please apply below. ...

Full Stack Engineer

Hiring Organisation
Ultimate Asset
Location
United Kingdom
Ability to collaborate cross functionally with product, design and commercial teams to deliver end to end features • Strong understanding of performance, scalability, security and observability best practice ...

Data Platform Engineer

Hiring Organisation
Rising Associates
Location
Milton Keynes, England, United Kingdom
technically complex environments Strong communication and stakeholder collaboration skills Desirable Skills Terraform, GitHub Actions, and Infrastructure as Code tooling Exposure to cloud monitoring and observability platforms Understanding of Zero Trust or modern cloud security principles Experience with cost optimisation and cloud reporting tools Relevant technical degree or certifications beneficial ...

Senior Software Engineer

Hiring Organisation
Harnham - Data & Analytics Recruitment
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£85,000 - £95,000 per annum, OTE
capacity planning, performance testing, and fault tolerance Identify and resolve performance bottlenecks across applications, APIs, and data pipelines to improve speed and efficiency Enhance observability and operational stability through better monitoring, logging, and proactive issue detection Establish and promote strong engineering standards, including clean code, automated testing, CI/ ...

D365 Lead Developer

Hiring Organisation
Reed
Location
Bedford, Bedfordshire, England, United Kingdom
Employment Type
Temporary
Salary
£500 - £600 per day, Inc benefits
Infrastructure as Code (IaC) to provision and manage cloud environments. Release code and D365 solutions via Azure DevOps pipelines. Implement robust logging, monitoring, and observability patterns across D365 and Azure components. Document technical designs, configurations, and deployment procedures to support maintainability and knowledge transfer. Required Skills & Qualifications: Strong C#/ ...

Lead ML Engineer (12 month FTC)

Hiring Organisation
Harnham - Data & Analytics Recruitment
Location
Manchester, Lancashire, England, United Kingdom
Employment Type
Full-Time
Salary
£80,000 - £90,000 per annum
solutions for large scale unstructured data, including complex document processing and LLM ready data pipelines. Own MLOps practices, covering CI/CD, model serving, observability and lifecycle management. Provide hands on technical leadership, contributing to architecture decisions and best practice. Act as a delivery focused partner to stakeholders, confidently explaining ...

(198488) Lead ML Engineer

Hiring Organisation
Harnham
Location
England, United Kingdom
solutions for large scale unstructured data, including complex document processing and LLM ready data pipelines. Own MLOps practices, covering CI/CD, model serving, observability and lifecycle management. Provide hands on technical leadership, contributing to architecture decisions and best practice. Act as a delivery focused partner to stakeholders, confidently explaining ...

EKS Engineer

Hiring Organisation
Ampstek
Location
London Area, United Kingdom
Terraform • Implement GitOps-based deployments using Argo CD or similar tools • Collaborate with platform, DevOps, and SRE teams for scalable and resilient architecture • Ensure observability, reliability, and performance of the platform Must-Have Skills (Critical Expertise) • Hands-on experience in AWS EKS cluster build and lifecycle management • Strong understanding ...

Data Engineer

Hiring Organisation
Searchability (UK) Ltd
Location
Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
optimise pipeline performance, implement CI/CD processes for data workflows, and maintain strong governance and security standards across the data platform. Documentation, observability, and cost optimisation will also form part of your responsibilities as you contribute to the ongoing development of the organisation's data ecosystem. OUR BENEFITS: Competitive ...

Senior Software Engineer

Hiring Organisation
Cititec
Location
City of London, London, United Kingdom
cloud-native environments, particularly Kubernetes-based architectures. Familiarity with distributed data and event-driven systems (e.g. Kafka-style messaging patterns). Experience with observability, monitoring, testing, and production incident response in live systems. Highly Desirable Experience in commodities markets (energy, metals, agriculture, freight) or other complex, multi-venue asset classes. ...

Senior Software Engineer - TypeScript / Next.Js / SQL

Hiring Organisation
Adria Solutions
Location
Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£75,000
operational discussions with customers over calls and video meetings Building systems that are accurate, auditable, explainable, scalable, and maintainable Driving improvements in reliability, observability, documentation, and engineering standards Using AI-assisted development tools to accelerate delivery while maintaining engineering quality and judgement Helping shape technical direction, architecture, and long-term ...

Data Engineering Lead (Fabric)

Hiring Organisation
ECS Resource Group
Location
Newark, Lincolnshire, UK
data models Translate business requirements into pragmatic, scalable data architecture decisions Define, implement, and maintain data contracts aligned to business domains Establish data quality, observability, and SLA frameworks Engineering Delivery Design and evolve scalable data pipelines (batch and incremental watermark‐based ingestion) using: Microsoft Fabric Pipelines Azure Data Factory PySpark ...