Chicago, Illinois, United States Hybrid / WFH Options
Options Clearing Corporation
Harness and Jenkins. Strong scripting skills in languages like Python and Bash. Excellent troubleshooting and problem-solving skills. Understanding of networking principles. Experience with monitoring tools like Splunk, Splunk OTEL, Prometheus and Grafana. Technical Skills: Knowledge and experience with KRaft, CFK, ZooKeeper and Control Center highly preferred. Kafka, Ansible, Terraform, Bash, Kubernetes, Rancher, GitHub, Artifactory, Harness, Jenkins, AWS, Azure, CI More ❯
Manchester, England, United Kingdom Hybrid / WFH Options
Suits Me
using IaC (e.g. Terraform, CDK) Owning and improving CI/CD pipelines (e.g. GitHub Actions, Jenkins) to streamline secure, automated deployments Building and managing observability tooling (e.g. CloudWatch, Grafana, OpenTelemetry) for proactive system monitoring and alerting Developing event-driven containerised and serverless systems using Lambda, ECS and EKS Championing reliability and security, embedding best practices in identity management, network design More ❯
Warrington, Cheshire, North West England, United Kingdom Hybrid / WFH Options
Suits Me
using IaC (e.g. Terraform, CDK) Owning and improving CI/CD pipelines (e.g. GitHub Actions, Jenkins) to streamline secure, automated deployments Building and managing observability tooling (e.g. CloudWatch, Grafana, OpenTelemetry) for proactive system monitoring and alerting Developing event-driven containerised and serverless systems using Lambda, ECS and EKS Championing reliability and security, embedding best practices in identity management, network design More ❯
Bolton, Greater Manchester, North West England, United Kingdom Hybrid / WFH Options
Suits Me
using IaC (e.g. Terraform, CDK) Owning and improving CI/CD pipelines (e.g. GitHub Actions, Jenkins) to streamline secure, automated deployments Building and managing observability tooling (e.g. CloudWatch, Grafana, OpenTelemetry) for proactive system monitoring and alerting Developing event-driven containerised and serverless systems using Lambda, ECS and EKS Championing reliability and security, embedding best practices in identity management, network design More ❯
networking and security standards, protocols and best practices Proven experience in logging systems (e.g. ELK stack) Proven experience in monitoring systems (e.g. Prometheus) Proven experience in tracing systems (e.g. OpenTelemetry, Jaeger) Experience in performance optimization and resource management Relevant certifications (AWS, Google) Understanding of Agile methodologies Ability to diagnose and resolve service-affecting issues in a Broadcast/Livestream environment More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Morela
of ITSM/incident management processes and tools (Halo ITSM, ServiceNow, Jira Service Management) Cloud experience (AWS, Azure, GCP) and deploying observability tools in cloud-native environments Understanding of OpenTelemetry and modern observability standards Strong problem-solving skills and ability to work in a fast-paced start-up or consulting environment Why Join: Work with our exclusive client, a high More ❯
ML lifecycle tools, model monitoring, and versioning Exposure to tools like KServe, Ray Serve, Triton, or vLLM a big plus Bonus Points: Experience with observability frameworks like Prometheus or OpenTelemetry Knowledge of ML libraries: TensorFlow, PyTorch, HuggingFace Exposure to Azure or GCP Passion for financial services Requirements: Degree in Computer Science, Engineering, Data Science or similar What We Offer A More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Ventula Consulting Limited
managing data handling, consent flows, and feature-gating based on user location. Partner with the DevOps Engineer to create comprehensive logging, monitoring, and analytics systems (e.g., using Prometheus, Grafana, OpenTelemetry) to provide deep visibility into platform health, security events, and business KPIs. Required Qualifications Education & Experience Bachelor's degree in Computer Science or a related technical field. 5+ years of More ❯
West London, London, United Kingdom Hybrid / WFH Options
Staffworx Limited
development. Familiarity with testing frameworks (Vitest, Playwright) for both API and end-to-end testing. Experience with Docker, Helm, YAML, Kubernetes, and cloud-native deployments. Telemetry tools; Prometheus, Grafana, OpenTelemetry, DataDog, APM tools Understanding of infrastructure-as-code and CI/CD pipelines. Ability to improve codebases and influence architectural direction. Experience mentoring or coaching engineers. Please send updated CV More ❯
West London, South East England, United Kingdom Hybrid / WFH Options
Staffworx Limited
development. Familiarity with testing frameworks (Vitest, Playwright) for both API and end-to-end testing. Experience with Docker, Helm, YAML, Kubernetes, and cloud-native deployments. Telemetry tools; Prometheus, Grafana, OpenTelemetry, DataDog, APM tools Understanding of infrastructure-as-code and CI/CD pipelines. Ability to improve codebases and influence architectural direction. Experience mentoring or coaching engineers. Please send updated CV More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Staffworx Limited
development. Familiarity with testing frameworks (Vitest, Playwright) for both API and end-to-end testing. Experience with Docker, Helm, YAML, Kubernetes, and cloud-native deployments. Telemetry tools; Prometheus, Grafana, OpenTelemetry, DataDog, APM tools Understanding of infrastructure-as-code and CI/CD pipelines. Ability to improve codebases and influence architectural direction. Experience mentoring or coaching engineers. Please send updated CV More ❯
Cambridge, East Anglia, United Kingdom Hybrid / WFH Options
Speechmatics
the same incident doesn't happen twice. Managing and improving GitOps release workflows and CI/CD pipelines. Monitoring system performance and troubleshooting production environments. Implementing observability improvements using OpenTelemetry tooling. Automating processes to reduce manual effort and create self-healing systems. Taking part in an on-call rota for production systems, which has a generous daily pay rate and mentorship … and eager to dive deep into new technologies; you thrive on learning as you go. Prior experience with on-call rotations and incident response is a plus. Familiarity with OpenTelemetry and related observability tooling is advantageous. We encourage you to apply even if you do not feel you match all of the requirements exactly. The list of requirements is intended More ❯
rebuilding nearly every component of our observability platform, from data collection to real-time analytics. You will drive core initiatives that move Twilio from fragmented tooling to a unified, OpenTelemetry-first observability stack built for scale. You'll lead technically and strategically: designing platform components, influencing architectural decisions, mentoring engineers, and engaging with teams across Platform Engineering and R&D. … workflows. Design and build developer-friendly tooling and APIs to support incident response, performance analysis, and platform debugging at scale. Leverage (and optionally contribute to) open-source standards like OpenTelemetry to ensure interoperability and extensibility. Champion a pragmatic approach to observability, balancing performance, cost, and user value across diverse engineering teams. Qualifications Twilio values diverse experiences from all kinds of … logging platforms, metrics pipelines, tracing infrastructure, or profiling tools). Lead technical execution for major components of Twilio's observability overhaul, including the shift to centralized S3-based data lakes, OpenTelemetry instrumentation, and ClickHouse-backed query engines. Deep proficiency in at least one modern programming language (e.g., Go, Python, Java). Familiarity with high-cardinality data challenges and telemetry correlation techniques. More ❯
Reading, Berkshire, United Kingdom Hybrid / WFH Options
Wireless Logic Group
Any company can tell you that they are a multi-award-winning, market-leading business, and yes, we are both of those things in the world of IoT connectivity! But we're more than that. Our mission? To make More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Paradigm Talent
Role: Senior Backend Engineer (Product) Location: Hybrid - London Compensation: Up to £130,000 + equity We’re supporting a next-generation AI company developing a groundbreaking platform that automatically creates and scales immersive 3D environments. The team is bringing world More ❯
London (City of London), South East England, United Kingdom Hybrid / WFH Options
Paradigm Talent
Role: Senior Backend Engineer (Product) Location: Hybrid - London Compensation: Up to £130,000 + equity We’re supporting a next-generation AI company developing a groundbreaking platform that automatically creates and scales immersive 3D environments. The team is bringing world More ❯
Slough, South East England, United Kingdom Hybrid / WFH Options
Paradigm Talent
Role: Senior Backend Engineer (Product) Location: Hybrid - London Compensation: Up to £130,000 + equity We’re supporting a next-generation AI company developing a groundbreaking platform that automatically creates and scales immersive 3D environments. The team is bringing world More ❯