london, south east england, united kingdom Hybrid / WFH Options
Sky
networking and security standards, protocols and best practices Proven experience in logging systems (e.g. ELK stack) Proven experience in monitoring systems (e.g. Prometheus) Proven experience in tracing systems (e.g. OpenTelemetry, Jaeger) Experience in performance optimization and resource management Relevant certifications (AWS, Google) Understanding of Agile methodologies Ability to diagnose and resolve service-affecting issues in a Broadcast/Livestream environment …
systems administration combined with strong SQL skills and proficiency in scripting languages such as Python or Java. Demonstrated experience with monitoring and observability tools including Prometheus, Grafana, Splunk, Geneos, OpenTelemetry or Corvil is highly desirable. Familiarity with cloud platforms as well as containerisation technologies like Kubernetes or Docker alongside CI/CD pipeline management is important for this role. Comprehensive …
to translate complexity into clarity Experience with Terraform, Helm, or GitOps tooling Familiarity with front-end technologies such as React and TypeScript Exposure to GraphQL, observability stacks (e.g., Prometheus, OpenTelemetry), or large-scale data platforms Prior work in regulated industries (BFSI, telecom, public sector) To succeed in this role, you'll bring more than just technical knowledge. You'll demonstrate …
lifecycle tools, model monitoring, and versioning Exposure to tools like KServe, Ray Serve, Triton, or vLLM is a big plus Bonus Points Experience with observability frameworks like Prometheus or OpenTelemetry Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace Exposure to Azure or GCP Passion for financial services Qualifications Degree in Computer Science, Engineering, Data Science, or similar What We Offer A …
and programming languages such as C++, Java or Python. Strong understanding of distributed systems and low-latency architectures Hands-on experience with observability stacks (e.g., Prometheus, Grafana, Splunk, Geneos, OpenTelemetry) and infrastructure automation (e.g., Ansible, Terraform, CI/CD pipelines) Strong understanding of the trade lifecycle, market data, and fixed income products, FX or algorithmic trading experience is a plus …
experience in technical integrations and POCs Comfortable coding in any high-level programming language (Java, Go, Python) Strong hands-on knowledge of Kubernetes, AWS, Azure, GCP, Docker, Prometheus, and OpenTelemetry Industry knowledge and opinions on Monitoring, Observability, Log Management, SIEM Engineering/DevOps Background - advantage Experience in Technical Sales of Log Analytics/Monitoring/APM/SIEM - advantage Cultural …
under test: Containerisation (e.g. Docker), Virtualisation and Provisioning, Workload and job scheduling (e.g. Kubernetes, Ray) on high core-count machines and rack-scale installations, Management and Observability (e.g. Prometheus, OpenTelemetry, DataDog, Splunk, etc.). 10+ years of relevant experience related to quality assurance/testing teams. Experience with the Atlassian suite and CI/CD platforms such as Jenkins; GitHub …
/Accounts - AWS Control Tower, GCP Resource Manager, etc. Network - AWS Transit Gateway, GCP Shared VPC, AWS Route53, GCP Cloud DNS, etc. Observability - AWS OpenSearch, GCP Monitoring/Traces, OpenTelemetry, Grafana, Prometheus, etc. Automation Prowess: Hands-on experience with modern Infrastructure as Code (IaC) automation tools and frameworks (Terraform, Jenkins, Ansible, etc.). Software Development Acumen: A software development background …
cloud (preferably Azure) using Terraform and Kubernetes. Manage CI/CD pipelines using GitHub Actions and ensure smooth delivery to production. Own monitoring, alerting, and observability, using tools like OpenTelemetry and Dynatrace. Security & Compliance: Champion secure coding practices and data protection across services. Collaboration & Mentoring: Work closely with product owners, engineering leads, and other stakeholders to shape technical solutions. Mentor …
technical experience in Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8S, Terraform and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics, logs, traces and APM. Leadership & Global Operations Proven success leading multi-regional or global technical teams with direct management of managers. Demonstrated …
patterns, and packaging. Familiarity with building performant and reliable Python systems, including low-level C/C++ extensions (e.g., using pybind11, Cython) and instrumentation for production telemetry (e.g., Prometheus, OpenTelemetry). A proactive ownership mindset and the ability to navigate ambiguity. Excellent collaboration and communication skills for working effectively with teams and stakeholders. Ideally Professional experience in GPGPU programming (e.g., CUDA …
in: Languages: Java 17+ (Java 21 preferred) Frameworks: Micronaut (preferred), Spring Boot Testing: JUnit, Mockito Build Tools: Gradle Data & Messaging: Kafka, MongoDB APIs: GraphQL Federation, REST Infrastructure & Observability: Terraform, OpenTelemetry, Dynatrace Soft Skills & Leadership Exceptional communication skills - able to distill and present engineering decisions to executives and business teams. Experienced in managing relationships with third-party vendors and platform providers. …
multi-tenant PostgreSQL, sharded MySQL). Strong backend fundamentals around concurrency, caching, indexing and distributed systems trade-offs. Proven track record of setting SLOs, building dashboards (Prometheus/Grafana, OpenTelemetry, etc.) and tuning alerts. Comfort with Kubernetes, IaC and cloud-native patterns; can debug from network to application layer. Start-up bias for action: you prioritise high-leverage fixes, ship …
you will work on the best-in-class open-banking decision-making platform, and learn how to operate with low latency, at scale. Our technology stack: Python (including FastAPI, OpenTelemetry, procrastinate, SQLAlchemy, Uvicorn), Postgres, MySQL, Liquibase, Retool, Docker, AWS Who you are: Three or more years' professional experience in software engineering Proficiency in writing well-structured Python code with type …
strategy for technical audiences Analytics and conversion optimization Growth hacking and experimentation frameworks AI automation tools for marketing workflows Nice to haves include experience in: Observability/monitoring space (OpenTelemetry, APM tools) Working with AI/ML products or communities Hands-on experience with LLMs and AI agents for marketing automation Building developer communities Product-led growth strategies Public speaking …
in software delivery, CI/CD, observability, and infrastructure-as-code. Drive improvements in telemetry and observability, helping us move from log-centric metrics to first-class telemetry using OpenTelemetry and modern observability stacks. Optimise for performance, helping the platform scale for low-latency, high-throughput demands in real-time sports data delivery. Mentor and guide engineers, promoting a strong … e.g., RabbitMQ, Kafka). Strong grasp of telemetry, observability, and performance monitoring in distributed systems. Track record of technical leadership and setting engineering standards. Nice to Have: Experience with OpenTelemetry, Prometheus, Grafana, or similar observability tooling. Exposure to hybrid-cloud or cloud migration strategies. Familiarity with performance optimisation in low-latency data pipelines. Contributions to DevOps-related communities, blogs, open …
engineering in Go. You will not only architect our internal systems for scale but also build and operate key product infrastructure, including our customer-facing telemetry pipeline (built on OpenTelemetry and ClickHouse) and the AI pipeline that empowers our products. We are looking for a hands-on technical leader, driven by the challenge of solving ambiguous, 'eBay-scale' problems-whether … on, but is not limited to: Architecting, building, and operating the core cloud-native infrastructure for WunderGraph Cosmo, primarily using Go and Kubernetes. Owning and evolving our observability stack (OpenTelemetry, Prometheus, ClickHouse) and the infrastructure supporting our AI-driven features to ensure deep, actionable insights into our systems. Building and optimizing CI/CD pipelines to improve build times, automate … system architecture, distributed systems, and the challenges of running high-performance API gateways. Familiarity with GraphQL Federation is a significant plus. Experience building or managing modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, ClickHouse). A self-starter attitude and a leader's mindset: you are comfortable with ambiguity, can identify and solve ill-defined problems, and don't need hand …
About Birdie Birdie is the leading home healthcare technology platform that aims to radically transform the lives of older adults. Its all-in-one solution supports around 4.8 million (and growing) care visits every month, equipping care providers with the …