Permanent Observability Job Vacancies

1 to 25 of 677 Permanent Observability Jobs

DevOps/Site Reliability Engineer, Junior/Mid/Senior (m/f/ )

United Kingdom
Hybrid / WFH Options
Crane Venture Partners
such as Kubernetes, Docker Swarm, or HashiCorp Nomad. Excellent problem-solving, communication, and collaboration skills. Nice to have: Experience managing distributed systems, microservices, and event-driven architectures. Knowledge of observability tools such as Prometheus, Grafana, ELK Stack, or Datadog. Experience with security best practices, monitoring, and incident response. Familiarity with DevSecOps and compliance frameworks (ISO 27001, SOC 2, GDPR). More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Senior DevOps Engineer

Liverpool, Lancashire, United Kingdom
Hybrid / WFH Options
The Granite Group
with GitOps tools (e.g., ArgoCD, Flux). CI/CD - Skilled in building and managing pipelines using Azure DevOps, GitHub Actions, etc. Monitoring - Experience with Prometheus, Grafana, and other observability tools. Application Stack - Familiarity with .NET, Node.js, React, and web server technologies like Nginx. Relevant certifications or the ability to demonstrate equivalent experience, such as: Terraform Associate About Acorn Insurance More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Monitoring & Observability Engineer

South East London, London, United Kingdom
COMPUTACENTER (UK) LIMITED
GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater … visibility, performance, and reliability across their IT estates, contributing to their operational success through proactive insight and incident prevention. What you'll do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms … with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring & Observability SME within customer delivery teams Support incident response activities and postmortems by identifying patterns, root causes, and optimisation opportunities Work collaboratively with cross-functional teams to define and implement best practices in observability and monitoring Attend customer and More ❯
Employment Type: Permanent
Posted:

Monitoring & Observability Engineer

London, United Kingdom
Computacenter AG & Co. oHG
Monitoring & Observability Engineer Life on the team At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and … modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates-contributing to their operational success through proactive insight and incident prevention. What you'll … do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Monitoring & Observability Engineer

Lakenheath, Suffolk, United Kingdom
Computacenter AG & Co. oHG
Monitoring & Observability Engineer Life on the team At Computacenter, you'll be joining a world-class team of over 1,000 skilled professionals within Group Professional Services (GPS). Our teams operate across the UK, Germany, France, and India, delivering complex, enterprise-grade IT solutions and consultancy across infrastructure, cloud, and … modern operations. As a Monitoring & Observability Engineer, you'll work in high-impact delivery teams that support some of the world's most well-known organisations. You'll play a key role in helping our customers achieve greater visibility, performance, and reliability across their IT estates-contributing to their operational success through proactive insight and incident prevention. What you'll … do Design, implement, and manage observability solutions using industry-leading tools such as Dynatrace (primary), Grafana, and Splunk Collect and analyse telemetry data (metrics, logs, traces, events) to diagnose and resolve system and application performance issues Integrate monitoring platforms with ITSM tools (e.g. ServiceNow) and CI/CD pipelines to enable proactive alerting and resolution workflows Act as a Monitoring More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Delivery Engineer

United Kingdom
Hybrid / WFH Options
Sportserve
Python (or other language), Bash/Shell, YAML including any Development frameworks Extensive experience and in-depth knowledge of the Linux operating system for effective troubleshooting activities Experience with Observability tools like Grafana, Prometheus, ELK, OCI Observability We highly value ownership and initiative with capabilities to drive projects independently Dealing with changes on a daily basis in a very dynamic More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Senior Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Stratospherec Ltd
one or more public cloud providers such as Azure, AWS or GCP Proficiency using Infrastructure as Code (IaC) tools such as Terraform (preferred), Ansible, or CloudFormation. Experience with monitoring, observability and logging tools such as DataDog, Prometheus, Grafana, or similar. Proven track record of maintaining highly-available and performant production environments. Ability to identify and implement effective mitigation strategies and More ❯
Employment Type: Permanent
Salary: £85000 - £90000/annum Excellent Benefits package
Posted:

Senior Infrastructure Engineer with Security Clearance

Chantilly, Virginia, United States
CACI
relevant technologies (e.g., CKA, CKAD, AWS, Azure, or GCP certifications) Experience with CI/CD pipelines and tools (e.g., Jenkins, GitLab CI, or GitHub Actions) Knowledge of monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack) - What You Can Expect: A culture of integrity. At CACI, we place character and innovation at the center of everything we do. As a More ❯
Employment Type: Permanent
Salary: USD 206,800 Annual
Posted:

Staff Infrastructure/DevSecOps Engineer (TS/SCI with Full Scope) with Security Clearance

Herndon, Virginia, United States
ARKA Group LP
development methodology Experience with container tools (Docker, Podman) and container orchestration Experience delivering microservices in Kubernetes-based systems Experience with infrastructure as code tools (Terraform, CloudFormation) Basic understanding of observability tools like Prometheus, Grafana or similar Location: Herndon, VA Herndon offers a charming blend of small-town ambiance and modern conveniences. In historic Herndon you'll find small town charm More ❯
Employment Type: Permanent
Salary: USD Annual
Posted:

Senior DevOps Engineer, Clinical Software

United Kingdom
Waters Corporation
to maintain a CI build environment capable of running automation tests for effective feedback. Assist in designing, developing and implementing automation test frameworks. Develop and improve our monitoring and observability tooling. Coach and mentor team mates to improve their own DevOps skills and experience Research emerging tools, trends and methodologies Assist in managing checked in source code from check-in through to More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Senior Site Reliability Engineer with Security Clearance

San Antonio, Texas, United States
Hybrid / WFH Options
BridgePhase, LLC
this role, you'll help ensure the stability, scalability, and security of mission-critical cyber systems. As part of a collaborative, agile team, you'll be responsible for building observability, managing performance, automating operations, and enhancing the resilience of cloud-native platforms that support cyber operations across the Department of Defense. Ideal candidates are comfortable bridging the gap between software … to: Build and maintain scalable, resilient infrastructure using Infrastructure as Code (IaC) and Configuration as Code (CaC) tools such as Terraform, Ansible, and Helm. Design, implement, and maintain robust observability solutions-logging, metrics, and tracing-to support 24/7 mission awareness. Automate platform operations, including system provisioning, patching, and recovery, to reduce manual effort and increase uptime. Monitor system … beyond those listed above. Preferred Experience and Qualifications: Hands-on experience in Site Reliability Engineering, Cloud Infrastructure, DevSecOps, or System Administration within secure, mission-critical environments. Strong expertise in observability tools such as Prometheus, Grafana, ELK Stack, Fluent Bit, or OpenTelemetry. Deep knowledge of containerization (Docker) and orchestration (Kubernetes), including optimization and troubleshooting. Proficiency in AWS services and cloud-native More ❯
Employment Type: Permanent
Salary: USD Annual
Posted:

Junior Delivery Engineer

United Kingdom
Hybrid / WFH Options
Sportserve
operating system for effective troubleshooting activities Awareness of any cloud infrastructure principles (like AWS, GCP or OCI), understanding basic principles of secure software delivery is a plus Familiar with Observability tools like Grafana or Prometheus, understanding the importance of giving the correct visibility to our platforms and environments We highly value ownership and initiative with capabilities to drive projects independently More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Cloud Technical Architect / Data DevOps Engineer

Bristol, Gloucestershire, United Kingdom
Hewlett Packard Enterprise Development LP
etc. Infrastructure as Code and CI/CD paradigms and systems such as: Ansible, Terraform, Jenkins, Bamboo, Concourse etc. Monitoring utilising products such as: Prometheus, Grafana, ELK, filebeat etc. Observability - SRE Big Data solutions (ecosystems) and technologies such as: Apache Spark and the Hadoop Ecosystem Edge technologies e.g. NGINX, HAProxy etc. Excellent knowledge of YAML or similar languages The following More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Site Reliability Engineer (SRE) with Security Clearance

Reston, Virginia, United States
Hybrid / WFH Options
CGI
tools like Jenkins, GitLab, Docker, and artifact repositories. Proficient in at least one programming or scripting language; Python, Java, Node.js, Bash, or PowerShell are all great. Familiar with monitoring and observability tools like CloudWatch, Splunk, Dynatrace, or OpenTelemetry. Understands and applies security best practices, including IAM, RBAC, and vulnerability management. Experience designing and supporting microservices and APIs, with a focus on More ❯
Employment Type: Permanent
Salary: USD 137,100 Annual
Posted:

Manual Tester (DV Security Clearance)

Basingstoke, Hampshire, South East
CGI
Manual Tester (DV Security Clearance) Position Description Are you an experienced Test Analyst with a background in secure or classified programmes, ready to contribute to projects of national importance? Step into a role where you'll challenge the complex to More ❯
Employment Type: Permanent
Posted:

Lead Site Reliability Engineer

Belgium
Tenth Revolution Group
engineering teams to deploy faster and more confidently, without compromising stability or uptime. As the SRE Lead, you'll mentor a growing team of SREs, drive best practices in observability, automation, and incident management, and collaborate cross-functionally to ensure a seamless experience for both our internal teams and customers. What You'll Be Doing: Leadership & Strategy - Lead and grow … Proven experience in an SRE or DevOps leadership role. - Deep understanding of networking, containers (Docker, Kubernetes), and cloud infrastructure (AWS/GCP/Azure). - Strong skills in monitoring, observability, and alerting systems (Prometheus, Grafana, Datadog, etc.). - Proficiency with infrastructure-as-code tools like Terraform or Pulumi. - Experience with CI/CD pipelines and GitOps practices. - Excellent communication and More ❯
Employment Type: Permanent
Salary: EUR Annual
Posted:

Lead Site Reliability Engineer - Cloud

Bristol, Avon, England, United Kingdom
Hybrid / WFH Options
Robert Walters
of cloud infrastructure and applications on Google Cloud Platform. You will work collaboratively with engineering and infrastructure teams to implement site reliability engineering (SRE) principles, focusing on system reliability, observability, automation, and operational excellence. This role follows a hybrid working model, requiring attendance at the Bristol office for at least two days per week or 40% of the working time. … objectives (SLOs), indicators (SLIs), and monitoring practices Hands-on experience with infrastructure as code (e.g., Terraform) and CI/CD tools (e.g., Jenkins, Azure DevOps) Desirable Knowledge Familiarity with observability and performance tools such as Dynatrace, Stackdriver, Cloud Monitoring, or similar Exposure to cost monitoring, logging frameworks, and cloud consumption analytics Personal Attributes Ability to mentor and support engineers in More ❯
Employment Type: Full-Time
Salary: £90,000 - £110,000 per annum
Posted:

Senior Platform Developer

Edinburgh, United Kingdom
Hybrid / WFH Options
Registers of Scotland
WAF, CloudFront, API GW, AWS Organizations, S3, ECS, EKS, Route 53, ELBs, OpenShift, Kubernetes, Docker Languages: TypeScript, Python Security & Scanning: AWS Guardrails, Checkov, Prisma Cloud, OSV Scanner, SonarQube, Renovate Observability & Logging: CloudWatch, OpenSearch Operating System Management: RedHat Satellite, AMI lifecycle management, Ubuntu Landscape Testing Tools: Pytest, Jest, Cypress APIs/Microservices: RESTful APIs, API Gateway, containerised services Version Control: GitLab … to as Senior DevOps Engineer. On a typical day you will Design, build, and maintain scalable, high-quality software and platform systems Implement and manage CI/CD pipelines, observability, security automation, automated testing, and engineering standards Lead feature development from concept to production with focus on quality and performance Troubleshoot issues, ensuring resilience, reliability, and minimal user disruption Contribute More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Software Engineer L3 QKFO - TS/SCI FS Poly required with Security Clearance

Columbia, Maryland, United States
Emtak LLC
and MinIO in production environments. Familiarity with infrastructure-as-code and automation using cloud-init or Terraform. Experience with CI/CD pipelines and Git-based workflows. Background in observability (Prometheus, Grafana, or similar). Experience with Rancher Suite (Harvester, Longhorn, KubeVirt). Prior work with AWS (EKS, S3, Lambda, RDS) or other cloud platforms. Maintain and improve documentation and More ❯
Employment Type: Permanent
Salary: USD Annual
Posted:

Software Engineer Level 3 with Security Clearance

Annapolis Junction, Maryland, United States
Jovian Concepts
and MinIO in production environments. Familiarity with infrastructure-as-code and automation using cloud-init or Terraform. Experience with CI/CD pipelines and Git-based workflows. Background in observability (Prometheus, Grafana, or similar). Experience with Rancher Suite (Harvester, Longhorn, KubeVirt). Prior work with AWS (EKS, S3, Lambda, RDS) or other cloud platforms. Maintain and improve documentation and More ❯
Employment Type: Permanent
Salary: USD Annual
Posted:

Lead Site Reliability Engineer

Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
for someone who has: Strong .NET framework knowledge (C#, ASP.NET Core, etc.) Expertise in Windows Server administration Database administration (SQL Server primarily) Ability to instrument and consume monitoring and observability tools (Application Insights, Prometheus, Grafana) Experience using PowerShell, Azure CLI, and Bash for automation tasks Previous experience with Azure DevOps, Jenkins, GitHub Actions, or similar tools Containerisation and orchestration (Docker More ❯
Employment Type: Full-Time
Salary: Competitive salary
Posted:

Senior Azure Site Reliability Engineer

London, United Kingdom
Hybrid / WFH Options
Nordcloud group
to L3 networking Programming languages, such as C#, Python, Perl, Java, C++ CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, Bash Observability/Monitoring: Prometheus, Grafana, Splunk Containerisation tools such as Docker, K8S, OpenShift, EC, containers Hosting technologies such as IIS, nginx, Apache, App Service, LightSail Analytical and creative approach to problem More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Lead DevOps Engineers

England, United Kingdom
InterQuest Solutions
Go • Significant experience with AWS cloud infrastructure • Deep understanding of IaC tools: Terraform, Packer, CloudFormation • Proven leadership in multidisciplinary delivery teams • Skills in Databases: MongoDB/Atlas, Messaging: Kafka, Observability: Prometheus, Grafana, Splunk • Experience of working in a DevOps environment - favouring and implementing Continuous Integration & Deployment over manual processes. • Experience of designing, implementing, securing and supporting Unix/Linux based More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Senior Software Architect - Switzerland

Buchs, St. Gallen, Switzerland
Proactive Global
scaffolding Collaborate with engineering teams, QA, DevOps, and product managers to deliver integrated solutions Mentor engineers in architectural thinking and AI-assisted development Ensure architectural alignment across systems with observability using Prometheus, Grafana, ELK Stack Required Skills & Qualifications: Master's degree in Computer Science, Software Engineering, or related field 8+ years of software engineering experience, with 3+ years in architectural More ❯
Employment Type: Permanent
Salary: £138118 - £164016/annum
Posted:

Platform (DevOps) Engineer

United Kingdom
Hybrid / WFH Options
Sportserve
Linux - RHEL and Debian based flavours Cloud - Oracle Cloud CI/CD - gitlab-ci, jenkins, ansible, terraform, Helm, Fluxcd Web servers - nginx Caching - Redis Messaging queues - Kafka Monitoring and Observability - ELK, Grafana, Prometheus, Thanos Traffic Management - HAProxy, Keepalived, Cloud Load Balancing, etc Scripting - bash, python Virtualisation and Orchestration - docker, kubernetes Databases - Cloud Managed MySQL Desirable OS Administration - Linux - Debian based More ❯
Employment Type: Permanent
Salary: GBP Annual
Posted:

Observability
10th Percentile: £57,500
25th Percentile: £65,000
Median: £80,000
75th Percentile: £97,500
90th Percentile: £120,000