Implement secure architecture and platform hardening aligned with defence-grade standards, supporting identity, access control, encryption, and system resilience. Monitoring & Continuous Improvement: Set up and maintain monitoring solutions (e.g. ELK, OpenTelemetry, Prometheus), troubleshoot performance, and deliver root cause analysis and remediation. What We're Looking For: DV Clearance: Active Developed Vetting clearance is essential. Systems Engineering Experience: 2nd/3rd
you: Good working knowledge of AWS services including ECS, EC2, Lambda, VPC, IAM, Route53, CloudFront, S3, RDS. Good understanding of monitoring and logging solutions, e.g. Prometheus, AWS CloudWatch, Grafana, OpenTelemetry, Honeycomb, ELK etc. Basic SRE knowledge, and experience in alerting and incident management platforms (e.g. Opsgenie, PagerDuty). Proven ability to provide and support strong and scalable CI/CD pipelines
and programming languages such as C++, Java or Python. Strong understanding of distributed systems and low-latency architectures. Hands-on experience with observability stacks (e.g., Prometheus, Grafana, Splunk, Geneos, OpenTelemetry) and infrastructure automation (e.g., Ansible, Terraform, CI/CD pipelines). Strong understanding of the trade lifecycle, market data, and fixed income products; FX or algorithmic trading experience is a plus.
experience in technical integrations and POCs. Comfortable coding in any high-level programming language (Java, Go, Python). Strong hands-on knowledge of Kubernetes, AWS, Azure, GCP, Docker, Prometheus, and OpenTelemetry. Industry knowledge and opinions on Monitoring, Observability, Log Management, SIEM. Engineering/DevOps Background - advantage. Experience in Technical Sales of Log Analytics/Monitoring/APM/SIEM - advantage. Cultural
/Accounts - AWS Control Tower, GCP Resource Manager, etc. Network - AWS Transit Gateway, GCP Shared VPC, AWS Route53, GCP Cloud DNS, etc. Observability - AWS OpenSearch, GCP Monitoring/Traces, OpenTelemetry, Grafana, Prometheus, etc. Automation Prowess: Hands-on experience with modern Infrastructure as Code (IaC) automation tools and frameworks (Terraform, Jenkins, Ansible, etc.). Software Development Acumen: A software development background
cloud (preferably Azure) using Terraform and Kubernetes. Manage CI/CD pipelines using GitHub Actions and ensure smooth delivery to production. Own monitoring, alerting, and observability, using tools like OpenTelemetry and Dynatrace. Security & Compliance: Champion secure coding practices and data protection across services. Collaboration & Mentoring: Work closely with product owners, engineering leads, and other stakeholders to shape technical solutions. Mentor
Linux fundamentals. Curiosity and the confidence to ask questions in a fast-moving team. Nice-to-haves: Exposure to Kubernetes, Docker or Terraform. Experience with observability stacks (Grafana, Prometheus, OpenTelemetry). Familiarity with Postgres. Interest in data-privacy, AdTech/MarTech or large-scale data processing. Familiarity with Kafka, gRPC or Apache Spark. As well as working as part of
technical experience in Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8S, Terraform and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics, logs, traces and APM. Leadership & Global Operations: Proven success leading multi-regional or global technical teams with direct management of managers. Demonstrated
patterns, and packaging. Familiarity with building performant and reliable Python systems, including low-level C/C++ extensions (e.g., using pybind11, Cython) and instrumentation for production telemetry (e.g., Prometheus, OpenTelemetry). A proactive ownership mindset and the ability to navigate ambiguity. Excellent collaboration and communication skills for working effectively with teams and stakeholders. Ideally: Professional experience in GPGPU programming (e.g., CUDA
in: Languages: Java 17+ (Java 21 preferred); Frameworks: Micronaut (preferred), Spring Boot; Testing: JUnit, Mockito; Build Tools: Gradle; Data & Messaging: Kafka, MongoDB; APIs: GraphQL Federation, REST; Infrastructure & Observability: Terraform, OpenTelemetry, Dynatrace. Soft Skills & Leadership: Exceptional communication skills - able to distill and present engineering decisions to executives and business teams. Experienced in managing relationships with third-party vendors and platform providers.
platform, writing new monitoring queries to drive our alerting, or coordinating across multiple teams to manage the response to an incident. Our technology stack: AWS (including ECS and RDS), OpenTelemetry, New Relic, Python, Postgres, Liquibase, Angular, Docker. Who you are: Four or more years of professional experience in a customer-facing technical support or engineering role. Excellent verbal and written communication skills
you will work on the best-in-class open-banking decision-making platform, and learn how to operate with low latency, at scale. Our technology stack: Python (including FastAPI, OpenTelemetry, procrastinate, SQLAlchemy, Uvicorn), Postgres, MySQL, Liquibase, Retool, Docker, AWS. Who you are: Three or more years of professional experience in software engineering. Proficiency in writing well-structured Python code with type
in distributed, real-time systems. Experience with containerisation and orchestration technologies, such as Kubernetes, in production environments. Familiarity with observability tooling and practices, such as VictoriaMetrics, Prometheus, Grafana, OpenTelemetry and SLOs. Well-developed debugging skills with the ability to navigate unfamiliar systems, identify root causes and deliver effective solutions under time pressure. Proven track record of contributing to fault
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes. Design and operate CI/CD pipelines using CloudBees and GitLab. Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace. Automate cloud deployments (AWS-first) using Terraform and platform tooling. Improve security posture across IAM, secrets, and networking. Help the team ship faster and safer by mentoring on … distributed systems at scale in production. Cloud: AWS (primary), Kubernetes (future), Docker (current), Terraform. Excellent debugging skills across network, systems, and data stack. Observability tooling, e.g. custom metrics pipelines, OpenTelemetry tracing, or integrations across telemetry stacks. Security engineering and practical understanding of IAM hardening, zero-trust network principles, and secrets management in data-heavy systems. Passion for building reliable, secure