1 to 25 of 1,510 Observability Jobs

AWS Senior Platform Engineer (FinOps) - ACTIVE Security Clearance required.

Hiring Organisation
Appvia
Location
Slough, Berkshire, UK
Employment Type
Full-time
estate, maturing their cloud financial governance, and enabling cost-efficient, scalable platform operations. You will work closely with client teams to build automation, enhance observability, improve cloud cost controls, and embed FinOps principles across engineering and platform functions. This role requires both deep AWS engineering skills and strong awareness ...

Lead DevOps Engineer

Hiring Organisation
Tek Spikes
Location
Dallas, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Kubernetes (EKS/AKS/GKE). Collaborate with dev, QA, and architecture teams to enhance release velocity and stability. Establish monitoring, logging, and observability using Prometheus, Grafana, ELK, Splunk, Datadog, etc. Manage production issues, identify root causes, and enforce system reliability best practices. Mentor and guide junior ...

DevOps Engineer

Hiring Organisation
Chartmetric
Location
New York, United States
Employment Type
Permanent
Salary
USD Annual
/CD tools (Jenkins, GitLab CI, GitHub Actions, or similar). Strong scripting skills in Python, Bash, or Go. Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, or similar). Developer Experience Focus: Experience building internal tools and platforms for development teams. Understanding of software development lifecycle and common developer ...

DevOps Engineer

Hiring Organisation
Love2shop
Location
Birkenhead, Merseyside, UK
Employment Type
Full-time
production support (1 week out of 6) As well as making improvements to: • Deployment automation and release management processes • Application and infrastructure monitoring and observability • Security scanning and vulnerability management in pipelines • Performance optimization and capacity planning • Development team productivity through tooling and automation What we would like from ...

DevOps Engineer

Hiring Organisation
Love2shop
Location
Warrington, Cheshire, UK
Employment Type
Full-time
production support (1 week out of 6) As well as making improvements to: • Deployment automation and release management processes • Application and infrastructure monitoring and observability • Security scanning and vulnerability management in pipelines • Performance optimization and capacity planning • Development team productivity through tooling and automation What we would like from ...

Expert Solution Architect

Hiring Organisation
Finastra
Location
London Area, United Kingdom
/CD: Jenkins/GitHub Actions/Azure DevOps. Containerization: Docker, Kubernetes, Helm. IaC: Terraform. OS: Red Hat Enterprise Linux. Familiarity with monitoring and observability tools: Prometheus, Grafana. Experience with testing frameworks and tools: JUnit, Postman. Knowledge of cloud-native architectures and migration strategies (preferably Azure). Strong documentation ...

Staff Site Reliability Engineer

Hiring Organisation
Vgs
Location
United States
Employment Type
Permanent
Salary
USD Annual
Docker, Kubernetes (EKS), Kafka (MSK), Java, Spring Framework, Python, and AWS services. Strong plus if you are a database wiz. Expertise in monitoring and observability tools like Prometheus, Grafana, Honeycomb, Datadog, OpenTelemetry, New Relic, or similar tools to measure system health and performance. Programming and scripting experience in languages ...

Site Reliability Engineer Sr. Consultant

Hiring Organisation
Visa
Location
Austin, Texas, United States
Employment Type
Permanent
Salary
USD Annual
protocols. You should have experience with cloud migration, automation tools, and applying Generative AI to improve operational efficiencies. Proficiency in containerization technologies (Docker, Kubernetes), observability tools, and distributed caching systems is essential. As a SME in Visa's Operations and Infrastructure division, you will join our Global PRE team ...

Site Reliability Engineer, Sr. Consultant level

Hiring Organisation
Visa
Location
Austin, Texas, United States
Employment Type
Permanent
Salary
USD Annual
protocols. You should have experience with cloud migration, automation tools, and applying Generative AI to improve operational efficiencies. Proficiency in containerization technologies (Docker, Kubernetes), observability tools, and distributed caching systems is essential. As a SME in Visa's Operations and Infrastructure division, you will join our Global PRE team ...

AI Engineer

Hiring Organisation
Metropolitan Commercial Bank
Location
New York, United States
Employment Type
Permanent
Salary
USD Annual
workflows. Uses Docker, Kubernetes, and IaC tools (Terraform, CloudFormation) for scalable deployments. Experienced in CI/CD, real-time inference, GPU optimization, and ML observability (Prometheus, Grafana, MLflow). Full-Stack Development: Capable of building end-to-end AI solutions, from front-end (React) to back-end APIs (Flask, FastAPI ...

DevOps Engineer - Sr.

Hiring Organisation
Securiport
Location
Reston, Virginia, United States
Employment Type
Permanent
Salary
USD Annual
Bash, Python, for automation and tooling. Infrastructure as Code (IaC): Experience managing infrastructure using tools like Terraform or CloudFormation. Monitoring & Logging: Familiarity with observability stacks such as Prometheus, Grafana, ELK/EFK Stack, or other application performance monitoring. Version Control: Advanced knowledge of Git, branching strategies, and repository management. Windows ...

Infrastructure Engineer - Bare Metal Kubernetes

Hiring Organisation
Defense Unicorns
Location
United States
Employment Type
Permanent
Salary
USD Annual
YAML, Bash) and automated CI/CD pipelines. Knowledge of Kubernetes networking, service meshes (Istio), ingress/egress controllers (metallb, nginx). Experience with observability stacks (Prometheus, Grafana, Loki, Promtail). Security and compliance practices: runtime security (Neuvector), policy enforcement (Kyverno, Pepr). Ability to produce technical design docs, ADRs ...

Senior Infrastructure Engineer

Hiring Organisation
American Express
Location
Phoenix, Arizona, United States
Employment Type
Permanent
Salary
USD Annual
Experience integrating AI and machine learning concepts into infrastructure operations or predictive monitoring. Practical knowledge of DevOps methodologies, CI/CD pipelines, and modern observability frameworks. In-depth experience with hardware lifecycle management (build, operate, maintain, decommission). Professional Skills Proven ability to lead complex infrastructure programs that span ...

DevOps Engineer

Hiring Organisation
FBI &TMT
Location
London, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£85,000
Terraform/CloudFormation/Pulumi and configuration management * Solid understanding of CI/CD tools (GitHub Actions, GitLab CI, Jenkins) * Knowledge of monitoring/observability tools (Prometheus, Grafana, OpenObserve) * Experience with GPU infrastructure and distributed ML compute frameworks * Familiarity with MLOps tools and model lifecycle management * Strong scripting skills (Python ...

DevOps Engineer

Hiring Organisation
Matchtech
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£85,000 per annum, Negotiable
Terraform/CloudFormation/Pulumi and configuration management * Solid understanding of CI/CD tools (GitHub Actions, GitLab CI, Jenkins) * Knowledge of monitoring/observability tools (Prometheus, Grafana, OpenObserve) * Experience with GPU infrastructure and distributed ML compute frameworks * Familiarity with MLOps tools and model lifecycle management * Strong scripting skills (Python ...

DevSecOps Engineer

Hiring Organisation
G-Research
Location
London Area, United Kingdom
experience with DevOps practices, such as CI/CD pipelines (such as Jenkins and GitLab), Infrastructure as Code (such as Terraform and Ansible) and observability tools (such as Prometheus, ELK and Grafana). Familiarity with containerisation and orchestration technologies, such as Docker or Kubernetes, cloud platforms, such as AWS, Azure ...

Senior Site Reliability Engineer - Fleet Reliability

Hiring Organisation
Lambda
Location
San Francisco, California, United States
Employment Type
Permanent
Salary
USD Annual
currently Tuesday. What You'll Do Define Fleet Health metrics and indicators to objectively measure and improve system availability Collaborate with the observability team on comprehensive monitoring and alerting systems to proactively predict, detect and respond to issues or anomalies Create runbooks and automated remediations for common failure scenarios Build ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Midlands, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Liverpool, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Edinburgh, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Glasgow, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Leeds, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
Manchester, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
London, UK
Employment Type
Full-time
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...

Senior Software Engineer (Full Stack)

Hiring Organisation
Civica
Location
United Kingdom
communication and collaboration skills, with a commitment to mentoring and team development. Preferred skills: Experience with .NET, C#, ASP.NET. Experience with Node.js. Understanding of observability practices, including logging, metrics, and tracing. Experience with monitoring tools such as Prometheus and Grafana. Awareness of cloud security best practices, including IAM policies ...