Observability Job Vacancies

451 to 475 of 516 Observability Jobs

DevSecOps Engineer with Security Clearance

Reston, Virginia, United States
Echelon Services, LLC
Job Title: DevSecOps Engineer. Location: Reston, VA or Charleston, SC. Clearance Required: TS/SCI. Employment Type: Full-Time. Company Overview: Echelon Services LLC is a Native Hawaiian-Owned 8(a) small business that delivers mission-critical IT, cybersecurity …
Employment Type: Permanent
Salary: USD 200,000 Annual

Senior Software Engineer - Network Production Engineer

London, United Kingdom
Bloomberg L.P
tools to manage a large-scale, multi-vendor network with an emphasis on automation, telemetry, and model-driven infrastructure as code. Automate the full network lifecycle, including provisioning, configuration, observability, testing, troubleshooting, and capacity planning. Collaborate with architecture and design teams and the CTO office to implement new technologies that ensure scalability, efficiency, and operational resilience. Develop tools and platforms … that enhance the observability, reliability, and performance of the production network. Enhance existing monitoring and observability frameworks, integrating intelligent alerting and self-remediation capabilities to reduce manual intervention and improve incident response. Define and measure service-level objectives (SLOs) to track infrastructure performance and reliability. Write software utilizing orchestration systems to automate tasks and interact with other systems. Provide mentorship …
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer - NSPE Firewall

London, United Kingdom
Bloomberg L.P
tools to manage a large-scale, multi-vendor network with an emphasis on automation, telemetry, and model-driven infrastructure as code. Automate the full network lifecycle, including provisioning, configuration, observability, testing, troubleshooting, and capacity planning. Collaborate with architecture and design teams and the CTO office to implement new technologies that ensure scalability, efficiency, and operational resilience. Develop tools and platforms … that enhance the observability, reliability, and performance of the production network. Enhance existing monitoring and observability frameworks, integrating intelligent alerting and self-remediation capabilities to reduce manual intervention and improve incident response. Define and measure service-level objectives (SLOs) to track infrastructure performance and reliability. Write software utilizing orchestration systems to automate tasks and interact with other systems. Provide mentorship …
Employment Type: Permanent
Salary: GBP Annual

Splunk Enterprise Monitoring Engineer

Decatur, Georgia, United States
SpiceOrb
CD Summary: We are looking for a highly skilled Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer to lead the design, implementation, and optimization of our monitoring and observability ecosystem. The ideal candidate will be an expert in Splunk, with a strong background in enterprise IT infrastructure, system performance monitoring, and log analytics. You will play a pivotal role … Strong understanding of network protocols, system logs, and application telemetry. Preferred Qualifications: Splunk certifications (e.g., Splunk Certified Power User, Admin, Architect). Experience with Splunk ITSI, Enterprise Security, or Observability Suite. Knowledge of cloud-native environments (AWS, Azure, or GCP) and cloud monitoring integrations. Experience with log aggregation, security event monitoring, or compliance (e.g., PCI, HIPAA, SOX). Familiarity with More ❯
Employment Type: Permanent
Salary: USD Annual

Director of AWS Platforms

London, United Kingdom
Boston Consulting Group
for the creation, implementation, and continuous improvement of BCG's modern, fully automated SACM function. As the beating heart of IT, this system will serve as the backbone for observability, service reliability, release and change management, and infrastructure management. The leader will drive the automation and governance of BCG's configuration management database (CMDB), integrating it with SRE, ITSM, and … Establish the CMDB as a real-time, trusted system of record for configuration items across cloud, on-prem, and hybrid environments. Embed SACM capabilities into core IT processes including observability, incident response, service management, and architecture governance. Champion automation, transparency, and traceability of all infrastructure, software, and asset relationships. Automation & Integration: Build and operate a fully automated CMDB with bi … reduce risk and accelerate safe deployments. Operational Excellence & SRE Alignment: Apply SRE principles to ensure reliability, performance, and resilience of the SACM platform. Embed SACM into 24x7 operations and observability platforms to support real-time decision-making. Support incident prevention, root cause analysis, and continuous improvement through data-driven insights. Define and enforce service level objectives (SLOs) and key performance More ❯
Employment Type: Permanent
Salary: GBP Annual

AWS Senior Platform Engineer

London, United Kingdom
CACI Limited
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays Up to 5% employer pension More ❯
Employment Type: Permanent
Salary: GBP Annual

Platform Engineer

Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
mindset, working directly with development teams to understand their needs and deliver solutions. You will work across multiple technical domains including orchestration, automation, CI/CD pipelines, cloud services, observability, and security, developing deeper expertise in areas that align with platform priorities and your interests. Experience with Microsoft Azure is essential. You will play your part in operating the platform aligned … with Docker and basic Kubernetes concepts. Understanding of cloud networking concepts (VNets, subnets, NSGs). Awareness of cloud security best practices and compliance requirements. Basic knowledge of monitoring, logging, and observability tools. Understanding of cloud cost management and resource optimisation principles. Comfort with troubleshooting and supporting development teams. Understanding of service reliability and incident response practices. Connells Group UK is an …
Employment Type: Full-Time
Salary: Competitive salary

Systems Development Engineer, Kuiper Enterprise Technology-Low Earth Orbit Satellites

Bellevue, Washington, United States
Amazon Kuiper Manufacturing Enterprises LLC
on AWS. Key job responsibilities Manage and maintain Kuiper's SAP Infrastructure, Collaborate directly with customers to understand their unique use cases and implement tailored solutions Implement and improve observability measures across the team's infrastructure Implement and maintain Infrastructure as Code (IaC) practices for all managed systems Apply DevOps best practices to improve system reliability, scalability, and security Troubleshoot … you will function as a DevOps Engineer. You will operate and support the team's services, as well as developing automation for upgrades/patching/testing, and enhance observability of the systems. You will work with stakeholders and senior engineers to design and develop custom solutions and integrations between tools and other services in a secure manner. About the More ❯
Employment Type: Permanent
Salary: USD Annual

Senior Software Engineer

Dublin, Ireland
General Motors
credentialing, key exchange via serial/USB) aligned to enterprise security standards. Develop robust Wi-Fi networking and enterprise service integration (REST, message queues) with resilient error handling. Enable observability with structured logging, metrics, and diagnostics; participate in on-call rotations supporting global plant operations. Collaborate on API contracts, device state models, and secure endpoints; influence architecture for scalability and … using Java Spring Boot (REST APIs, data persistence, messaging/streaming integration). Build and maintain Angular front-end applications (TypeScript, RxJS) with responsive, accessible, and performant UIs. Establish observability across services and UIs (logging, metrics, tracing, SLOs, dashboards). Apply security best practices (OWASP, OAuth2/OIDC, secrets management). Drive coding standards, testing strategies, and design reviews; mentor More ❯
Employment Type: Permanent
Salary: EUR 125,000 - 150,000 Annual

Staff Platform Engineer with Security Clearance

Lexington, Massachusetts, United States
Hybrid / WFH Options
Raft
for disconnected operations and must ensure a smooth software deployment process for applications developed on IL4 and delivering them to IL6/SIPR. You will be responsible for ensuring observability, monitoring, and alerting operate as engineered by client application teams. These processes will be documented and executed with the assistance of run books, checklists, and rely on you to keep … applications Highly preferred: - Background within DoD/Air Force AOC Weapon System and operating standards within cleared facilities (SIPR, IL6) - Familiarity with AWS and cloud technologies - Skill in operating observability tooling and alerting (Prometheus, Grafana, etc.) - Knowledge of Platform One Big Bang Clearance Requirements: Active Secret security clearance Work Type: Hybrid - Hanscom AFB, MA highly preferred (or local to Reston More ❯
Employment Type: Permanent
Salary: USD 190,000 Annual

Data Scientist- Gen AI

London, United Kingdom
Scrumconnect Limited
London, United Kingdom. Posted on 12/09/2025. We're hiring a Data Scientist with strong Generative-AI experience to design, build, and ship AI-powered tools end-to-end. You'll work in a small, multi-disciplinary …
Employment Type: Permanent
Salary: GBP Annual

Software Engineer / SRE

Leeds, West Yorkshire, Yorkshire, United Kingdom
Hybrid / WFH Options
Fruition Group
Software Engineer/SRE JavaScript/TypeScript, Node.js, AWS, Observability Leeds/Hybrid, c. 2x per week Salary up to £65,000 We're looking for a Software Engineer with strong AWS and Observability experience to join a growing engineering team in Leeds. This is a hybrid role, giving you the flexibility to split your time between home and a … improving platform performance and automation, while collaborating with developers, product teams, and operations. What you'll be doing: Building and maintaining scalable cloud infrastructure in AWS Implementing and improving observability tools (monitoring, logging, tracing) Automating deployments and improving CI/CD pipelines Driving reliability, availability and performance across systems Working with developers and SREs to solve complex problems What we … re looking for: Strong experience with AWS (EC2, ECS, Lambda, RDS etc.) Good knowledge of observability tools (Grafana, Prometheus, OpenTelemetry, Datadog, or similar) Background in software engineering (JavaScript/TypeScript & Node.js, although any language is fine) Experience with Infrastructure as Code (Terraform, CloudFormation, or similar) CI/CD pipelines and automation experience What's on offer: Salary up to More ❯
Employment Type: Permanent, Work From Home
Salary: £65,000

Lead Azure Platform Engineer

Potters Bar, Hertfordshire, South East, United Kingdom
Searchstone Ltd
Platform Engineer. This is a hands-on, high-impact role where you will design, build, and operate next-generation cloud platforms, with a strong focus on operational resilience, observability, and Infrastructure as Code (IaC). This is a player-coach role: you will lead by example, delivering complex Azure solutions while mentoring and developing other engineers. What You'll Do … and operation of Azure platforms, ensuring security, scalability, and reliability. Define and implement strategies for operational resilience, including high availability, disaster recovery, and business continuity. Establish end-to-end observability across workloads (logging, metrics, tracing, alerting) to proactively detect and resolve issues. Champion Infrastructure as Code (IaC) using Terraform, Bicep, or ARM templates for repeatable, reliable deployments. Promote native Azure … templates. Proficiency in Azure DevOps, CI/CD pipelines, and automation frameworks. Solid understanding of cloud security, governance, and compliance. Ability to design for reliability, scalability, and observability. Excellent communication and leadership skills, with a proven ability to influence technical direction. Nice to Have: Familiarity with multi-region and hybrid cloud architectures. Knowledge of SRE (Site …
Employment Type: Permanent
Salary: £95,000

Staff Infrastructure Engineer - Long Term Project - Los Angeles (Hybrid)

Los Angeles, California, United States
Hybrid / WFH Options
INSPYR Solutions
on reliability engineering to deliver robust and maintainable systems. You will work on network design, traffic analysis and engineering, maintaining CI/CD pipeline and creating tools to enhance observability and streamline troubleshooting for core infrastructure services. Your role will include: Designing, deploying, and operating the global network: Plan, build, and maintain both new and existing infrastructure to deliver the … system reliability, and enable rapid scaling. Developing customer-centric tooling: Build tools to simplify and streamline the consumption of cloud resources for internal teams, empowering them to innovate faster Observability and troubleshooting: Enhance monitoring and logging systems to quickly detect, debug, and resolve issues across our infrastructure Mentorship and continuous learning: Guide and mentor junior and senior engineers in systems … engineers across various timezones to maximize coverage, responsiveness, and global reach. Responsibilities: Solve complex challenges independently, diagnosing and resolving production issues across globally distributed systems. Advance our monitoring and observability platforms, driving innovation that keep our infrastructure visible, actionable, and resilient. Troubleshoot live incidents (on-call rotation) and design resilient solutions to maintain uptime and meet SLAs, continually evolving our More ❯
Employment Type: Permanent
Salary: USD Annual

Principal AWS Platform Engineer

London, United Kingdom
CACI Limited
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Experience leading/managing junior engineers Significant experience with Control Tower and deploying landing zones. More ❯
Employment Type: Permanent
Salary: GBP Annual

Lead Technical Application Service Specialist

Edinburgh, United Kingdom
Lloyds Bank plc
End Date: Tuesday 23 September 2025. Salary Range: £83,411 - £98,130. Flexible Working Options: Flexibility in when hours are worked, Job Share. Job Description: JOB TITLE: Lead Technical Application Service Specialist. SALARY: £83,411 - £107 …
Employment Type: Permanent
Salary: GBP Annual

AppSec Lead

Central London, London, United Kingdom
Hybrid / WFH Options
Halian Technology Limited
A leading fintech company is seeking a Lead AppSec Engineer to join their established team. You'll be instrumental in embedding security into every stage of the software development lifecycle, guiding engineers, shaping best practices, and driving secure, scalable solutions across our …
Employment Type: Permanent, Work From Home

Senior Software Engineer in Test

London, United Kingdom
Hybrid / WFH Options
LinuxRecruit
This is a fast-expanding company at the forefront of odds comparison, where innovation converges with excitement. Here you can experience the best of both worlds, working within a close-knit team with autonomy while enjoying substantial financial backing from …
Employment Type: Permanent
Salary: GBP Annual

Lead DevOps Engineer (Data)

London, United Kingdom
Hybrid / WFH Options
LGBT Great
a key role in scaling and supporting our data systems, which leverage a modern AWS stack and Snowflake. This is a high-impact role with direct influence over reliability, observability, and the DevOps maturity of our data engineering function. Key Responsibilities Platform Ownership Own and manage the data platform infrastructure built on AWS services (EventBridge, Lambda, EC2, MWAA, S3). … Snowflake, and support its integration into the broader data ecosystem. Infrastructure and System Reliability Ensure platform reliability, availability, and scalability across environments. Design and maintain robust monitoring, alerting, and observability frameworks to reduce MTTR and improve visibility. Lead and manage initiatives related to data lineage, platform health, and alert hygiene. CI/CD and Automation Enhance and expand our CI … and operating production data platforms within AWS. Strong understanding of AWS core services: EventBridge, Lambda, EC2, S3, and MWAA (Managed Workflows for Apache Airflow). Experience with infrastructure reliability, observability tooling, and platform automation. Solid experience with CI/CD pipelines, preferably Bitbucket Pipelines. Familiarity with Snowflake administration and deployment practices. Comfortable working through ambiguity and in cross-functional, collaborative More ❯
Employment Type: Permanent
Salary: GBP Annual

Senior Software Engineer, Front End

Cambridge, Cambridgeshire, United Kingdom
Futureshaper.com
reusability. Implement responsive, accessible, and performant UIs optimized for data-rich and interactive workflows. Implement DevOps/GitOps practices for automated testing, deployment, and monitoring. Ensure security, scalability, and observability of front-end services in cloud environments (AWS). Ensure robust unit, integration, and end-to-end test coverage to maintain long-term code quality. An eye for optimal and … suites (Jest, React Testing Library, Cypress). Experience with core AWS services (e.g., EC2, S3, Lambda) and infrastructure-as-code using AWS CDK. Experience with system design, performance optimization, observability, and operational excellence during parallel LLM streams. Strong intuition for UX design and a demonstrated commitment to building delightful, workflow-first products. Excellent communication skills and collaborative mindset, especially in … fast-moving, cross-functional environments. Preferred Qualifications Background in scientific domains such as biology, chemistry, or complex systems is a plus but not required. Familiarity with system evaluation and observability tools (e.g., Grafana, Langfuse, Kibana, Cloudwatch) and managing SLAs in production environments. Why Join Us? By joining this initiative within Flagship's Pioneering Intelligence group, you will: Help define a More ❯
Employment Type: Permanent
Salary: GBP Annual

Engineering - Senior Backend Engineer

London, United Kingdom
Hybrid / WFH Options
quench.ai
enhancing our proprietary search engine, indexing and querying structured and unstructured data. Collaborate closely with the AI team to deliver intelligent, contextual responses to user queries. Ensure high performance, observability, and resilience across all backend services. Contribute to technical strategy, code reviews, and overall engineering best practices. You may be suited for this role if you meet the following criteria … 5+ years of backend development experience. Expertise in Python and cloud-based architectures (preferably GCP). Strong understanding of modern software development best practices, including CI/CD, containerization, observability, and microservices. Experience with data integrations and APIs, particularly across enterprise tools. Familiarity with search indexing and large-scale data pipelines is a strong plus. Strong understanding of system …
Employment Type: Permanent
Salary: GBP Annual

Senior Data Management Professional - Data Engineering - Equity Corporate Actions

New York, United States
Bloomberg
passion for finance, data, and technology who has extensive experience building data management solutions. You'll be responsible for strategizing, designing, and implementing data pipelines and remediation workflows, ensuring observability, transparency, and continuous improvement of pipeline performance and quality of output. In this role, you'll also act as a technical leader, guiding design decisions, mentoring team members, and owning … loading both structured and unstructured data from diverse and numerous sources, leveraging Bloomberg's technology stack. Lead the development and implementation of proactive programmatic data quality strategies with enhanced observability, transparency, and robust remediation workflows, enabling rapid identification and resolution of data issues with minimal client disruption. Use your analytical experience to analyze internal processes to identify gaps and opportunities …
Employment Type: Permanent
Salary: USD Annual

Lead Developer

London, United Kingdom
Hybrid / WFH Options
Experis
Lead Developer. 6 Months. Hybrid - 1/3 days a month in office, either London or Bristol. £750. Overview: Working within an agile digital delivery team developing and supporting a mission-critical application for the UK client, with instances hosted …
Employment Type: Contract
Rate: £600 - £750/day

Splunk Specialist - Migration to Elasticsearch (Kubernetes Environment)

Birmingham, United Kingdom
Flint UK Technology Services
pipelines, dashboards, alerting rules, data models, etc. Design a detailed migration roadmap, including milestones, risk assessments, and fallback plans. Collaborate with Elastic/Elasticsearch platform teams to implement equivalent observability tooling (e.g., Watcher, Kibana dashboards). Act as the primary Splunk SME supporting the customer's existing team of two during the transition. Post-migration, support and troubleshoot any issues … be able to interact with technical and business stakeholders. Nice to Have: Splunk Certifications (e.g., Splunk Certified Admin/Architect). Experience with Bicep, Terraform, or Ansible. Familiarity with Elastic Observability solutions (e.g., Elastic APM, Elastic Security). Engagement Model: Full-time, hybrid role based in Birmingham.
Employment Type: Contract
Rate: GBP Annual

Staff Software Engineer - Platform Strategy for Expansion

London, United Kingdom
Burns Sheehan
to solve complex challenges. Drive innovation around cloud-native technologies and platform automation. Balance strategic vision with 30% hands-on coding and design work. Promote best practice in reliability, observability, and scalability. The Ideal Staff Software Engineer: Proven experience operating at Staff+ level within a fast-paced engineering organisation. Strong background in cloud platforms (AWS or GCP) and deep knowledge … ability to build operators. Strong coding skills in Golang, Java, or C#, with experience in distributed systems. Demonstrated leadership across multiple squads and technical roadmaps. Expertise in operational excellence: observability, reliability, automation. This is an outstanding opportunity for a Staff Software Engineer to join a rapidly scaling company where you'll play a pivotal role in shaping the technical foundations of …
Employment Type: Permanent
Salary: GBP Annual

Observability
10th Percentile: £57,500
25th Percentile: £67,500
Median: £80,000
75th Percentile: £100,000
90th Percentile: £130,000