1 to 25 of 1,180 Observability Jobs

Expert Solution Architect

Hiring Organisation
Finastra
Location
London, UK
Employment Type
Full-time
/CD: Jenkins/GitHub Actions/Azure DevOps Containerization: Docker, Kubernetes, Helm IaC: Terraform OS: RedHat Enterprise Linux Familiarity with monitoring and observability tools: Prometheus, Grafana. Experience with testing frameworks and tools: JUnit, Postman. Knowledge of cloud-native architectures and migration strategies (preferably Azure). Strong documentation skills using ...

Senior DevOps Engineer

Hiring Organisation
Lloyds Banking Group
Location
Bolton, Greater Manchester, UK
Employment Type
Full-time
infrastructure as code (IaC) tools (Terraform) * Solid understanding of CI/CD pipelines, version control systems, and release management practices * Familiarity with monitoring and observability tools (Prometheus, Grafana, Dynatrace) * Knowledge of security best practices, compliance standards, and incident response protocols. * Strong analytical and problem-solving skills, with the ability ...

Senior Infrastructure Engineer

Hiring Organisation
American Express
Location
Phoenix, Arizona, United States
Employment Type
Permanent
Salary
USD Annual
Experience integrating AI and machine learning concepts into infrastructure operations or predictive monitoring. Practical knowledge of DevOps methodologies, CI/CD pipelines, and modern observability frameworks. In-depth experience with hardware lifecycle management (build, operate, maintain, decommission). Professional Skills Proven ability to lead complex infrastructure programs that span ...

DevOps Engineer

Hiring Organisation
Regional Recruitment Services
Location
Leicester, Leicestershire, East Midlands, United Kingdom
Employment Type
Permanent
Actions, CircleCI). - Configuration management or provisioning tools such as Ansible, Chef and Puppet, especially in hybrid-cloud/on-prem environments. - Monitoring and observability tooling (Prometheus, Grafana, logging stacks) to manage reliability and operations. - Solid scripting and systems knowledge such as Linux, networking, shell scripting, Python. Join our Talent ...

Solace Administrator

Hiring Organisation
BGC Group
Location
Slough, Berkshire, UK
Employment Type
Full-time
high availability, optimal performance, and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide ...

Full Stack Developer

Hiring Organisation
BPM Bi INC
Location
Washington, Washington DC, United States
Employment Type
Any
Salary
USD Annual
Manage databases (SQL and NoSQL) for structured and unstructured transit data. Operations: Conduct automated and manual testing, debugging, and performance optimization. Implement monitoring and observability tools (e.g., Prometheus, Grafana, Datadog) to ensure uptime and reliability. Collaborate with DevOps/IT operations teams for smooth deployments and environment management. Apply security ...

Mid-Level Cloud and Microservices Engineer

Hiring Organisation
Credence
Location
McLean, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Apply Zero Trust principles with least-privilege access, RBAC, and multi-factor authentication. Implement monitoring and logging solutions using CloudWatch, Grafana, and OpenSearch for observability and alerting. Support DevSecOps integration including code quality gates, image scanning, and compliance automation (OPA, Conftest, Checkov). Collaborate with development teams to containerize legacy ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Richmond, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Williamsburg, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Fredericksburg, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Petersburg, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Goochland, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Charlottesville, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Norfolk, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Manager, Software Engineering, DevOps (People Leader)

Hiring Organisation
Capital One
Location
Newport News, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a team managing a portfolio of diverse technology projects and developers specializing in automation, distributed microservices ...

Senior Software Engineer

Hiring Organisation
WRK DIGITAL LTD
Location
Leeds, West Yorkshire, Yorkshire, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£70,000
Azure (you don't need to be an expert, but being interested helps!) Promote strong engineering practices around code quality, automated testing, peer reviews, observability, and security, helping to instil a culture of quality and accountability in engineering Collaborate closely with designers, product managers, and QA to ensure solutions ...

AWS Data Engineer

Hiring Organisation
W3Global
Location
Slough, Berkshire, UK
Employment Type
Full-time
environments, such as availability and provisioning time, to ensure they meet the needs of development and testing teams. Monitor environment health and performance: Use observability tools like Prometheus and Grafana to track the health of test environments, identify bottlenecks, and resolve issues proactively, not reactively. Manage incident response: Lead ...

Principal Software Architect Public Sector

Hiring Organisation
Anson Mccade
Location
Manchester, North West, United Kingdom
Employment Type
Permanent
OIDC, Zero Trust principles and government accreditation requirements. Oversee software quality, engineering standards, testing strategies, CI/CD pipelines, IaC (Terraform/Ansible), observability and resilience. Work alongside product, delivery, user research, DevOps and data teams to align user needs, policy requirements and technical feasibility. Mentor engineering ...

Test Environment Manager

Hiring Organisation
Adroit People Ltd
Location
London, United Kingdom
Employment Type
Permanent
Salary
£90,000
environments, such as availability and provisioning time, to ensure they meet the needs of development and testing teams. Monitor environment health and performance: Use observability tools like Prometheus and Grafana to track the health of test environments, identify bottlenecks, and resolve issues proactively, not reactively. Manage incident response: Lead ...

Head of Development and AI

Hiring Organisation
McCabe & Barton
Location
City of London, London, United Kingdom
Employment Type
Permanent
/CD pipelines and DevOps methodologies. Knowledge of infrastructure monitoring (Datadog), log aggregation, and incident management. Understanding of SLO/SLA definition and observability best practices. Strategic & Business Acumen: Ability to align technical initiatives with business objectives and articulate ROI. Experience creating technical roadmaps and conducting cost-benefit analyses. Track ...

Senior Data Scientist

Hiring Organisation
GlobalLogic
Location
London, UK
Employment Type
Full-time
management capabilities. Emerging Technology Awareness: Familiarity or hands-on exposure to AI/ML integration, knowledge graphs, data mesh/data fabric architectures, and observability tooling (e.g., Monte Carlo, Datadog, OpenLineage). Strong Communication & Leadership: Excellent stakeholder engagement, requirements gathering, and technical leadership skills. Proven ability to bridge business needs ...

Cloud engineer

Hiring Organisation
Adler & Allan Ltd
Location
Nelson, Lancashire, North West, United Kingdom
Employment Type
Permanent
secrets management, encryption) • Support compliance initiatives (ISO 27001, NIST, GDPR, MCERTS, etc.) • Manage network configuration, firewalls, and secure endpoints Monitoring & Reliability • Set up observability and monitoring tools (Prometheus, Grafana, Datadog, or CloudWatch) • Ensure high availability, scalability, and cost efficiency of cloud services • Define SLIs, SLOs, and SLAs for platform components ...

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Hiring Organisation
Capital One
Location
Richmond, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a portfolio of diverse technology projects and a team of developers with deep experience in machine ...

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Hiring Organisation
Capital One
Location
Petersburg, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a portfolio of diverse technology projects and a team of developers with deep experience in machine ...

Lead Software Engineer, DevOps (Cloud Operations Resilience Engineering)

Hiring Organisation
Capital One
Location
Norfolk, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Capital One. The Cloud Operations Resilience Engineering (CORE) Technology division is responsible for enabling and evolving Capital One's foundational cloud infrastructure layer, including observability, connectivity, resilience and availability. What You'll Do: Lead a portfolio of diverse technology projects and a team of developers with deep experience in machine ...