with observability tools, APM, log analytics, and infrastructure monitoring. Proficiency in scripting or programming languages (e.g., Java, Python, JavaScript). Certifications in Dynatrace, AWS, Azure, or GCP. Familiarity with OpenTelemetry, FluentBit, Cribl, or similar data pipeline tools. Ability to translate technical capabilities into business value, aligning observability solutions with customer KPIs and strategic goals. Excellent communication and presentation skills. Ability to …
Middlesex, south east england, united kingdom Hybrid/Remote Options
Sky
networking and security standards, protocols and best practices. Proven experience in logging systems (e.g. ELK stack). Proven experience in monitoring systems (e.g. Prometheus). Proven experience in tracing systems (e.g. OpenTelemetry, Jaeger). Experience in performance optimization and resource management. Relevant certifications (AWS, Google). Understanding of Agile methodologies. Ability to diagnose and resolve service-affecting issues in a Broadcast/Livestream environment.
Sheffield, South Yorkshire, Yorkshire, United Kingdom Hybrid/Remote Options
Vallum Associates Limited
Preferred Qualifications: OpenShift certifications (e.g., Red Hat Certified Specialist in OpenShift Administration). Experience with multi-cluster and hybrid cloud OpenShift deployments. Familiarity with monitoring and logging tools (e.g., OTel, Grafana, Splunk stack). Knowledge of OpenShift Operators and Helm charts. Experience with large-scale migration projects.
development. Familiarity with testing frameworks (Vitest, Playwright) for both API and end-to-end testing. Experience with Docker, Helm, YAML, Kubernetes, and cloud-native deployments. Telemetry tools: Prometheus, Grafana, OpenTelemetry, DataDog, APM tools. Understanding of infrastructure-as-code and CI/CD pipelines. Ability to improve codebases and influence architectural direction. Experience mentoring or coaching engineers. Please send updated CV …
and problem-solving skills. Preferred Qualifications: Red Hat OpenShift certifications (RHCSA/OCP Specialist). Experience with multi-cluster or hybrid-cloud OpenShift deployments. Familiarity with monitoring & logging tools (OTel, Grafana, Splunk stack). Knowledge of OpenShift Operators and Helm charts. Experience delivering large-scale migration programmes. How to Apply: If you are an OpenShift expert passionate about designing enterprise …
will be helping the client move to an AIOps environment. What you'll need to succeed: Extensive experience in observability/SRE/platform engineering roles. Strong experience with OpenTelemetry, Prometheus, Grafana, Splunk, Elastic, etc. Python, Go or Java programming. Experience with Terraform, Helm or other IaC tools. What you'll get in return: An exciting opportunity to join an …
Wigan, Lancashire, England, United Kingdom Hybrid/Remote Options
Searchability
or .NET preferred) * Cloud experience, ideally AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform) * Experience with monitoring and observability tools such as Grafana, Prometheus or OpenTelemetry * Strong understanding of networking fundamentals and distributed systems * Ability to collaborate effectively with engineering, operations and product teams TO BE CONSIDERED: Please either apply through this advert or email me …
Wigan, Greater Manchester, North West, United Kingdom Hybrid/Remote Options
Searchability (UK) Ltd
or .NET preferred) * Cloud experience, ideally AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform) * Experience with monitoring and observability tools such as Grafana, Prometheus or OpenTelemetry * Strong understanding of networking fundamentals and distributed systems * Ability to collaborate effectively with engineering, operations and product teams TO BE CONSIDERED: Please either apply through this advert or email me …
Airflow/Prefect, DVC, Feast. Vectors/NLP: pgvector, FAISS/Milvus/Qdrant, Transformers/embedding libraries. Automation/low-code OSS: Robot Framework, TagUI, n8n, Appsmith. Observability: OpenTelemetry. Company Overview: Keurig Dr Pepper (NASDAQ: KDP) is a leading beverage company in North America, with a portfolio of more than 125 owned, licensed and partner brands and powerful distribution …
Migration track record: Documented end-to-end Java-to-Azure migrations (assessment, refactor, re-platform, cut-over). Micro-service domain modelling & patterns: Bounded-context design, DDD, event buses, observability (OpenTelemetry) in Java. We are an equal opportunity employer. All aspects of employment including the decision to hire, promote, discipline, or discharge, will be based on merit, competence, performance, and business …
release validation, and production monitoring. Strong communication skills; can adapt output to technical and non-technical audiences. Bonus Points: Background in QA, test automation, or release engineering. Experience with OpenTelemetry, distributed tracing, or event-driven logs. Experience in continuous delivery environments with real-time observability needs. Prior involvement in incident reviews or quality postmortems. Relevant certifications (e.g., Data Analytics, SQL …
Lambda, Glue, Redshift, OpenSearch) • Hands-on experience deploying AI/LLM-based systems into production • Experience using dbt Cloud for transformation pipelines • Familiarity with tracing and observability (e.g., Langfuse, OpenTelemetry) • Experience preparing datasets and running supervised fine-tuning (SFT) of LLMs • Exposure to reverse ETL tools (e.g., Census, Hightouch) or building custom syncs to HubSpot, Slack, APIs Responsibilities: AI & Application …
Backend Engineer with strong Kotlin and Java (Spring Boot) expertise to build scalable services. Core Tech: Kotlin (+ Java), Spring Boot, Gradle, IntelliJ, Coroutines, Kubernetes, GitLab CI, Harness, Sonar, OpenTelemetry, Grafana. We Look For: High autonomy and ownership of projects. Experience with large, multi-module codebases. Data-driven decisions (experimentation/feature toggles). Collaboration with Product/UX …
Warwick, Warwickshire, West Midlands, United Kingdom Hybrid/Remote Options
Sanderson Government and Defence
Elasticsearch clusters, Kibana dashboards, and Logstash pipelines. Integrate SIEM with cloud-native observability tools (AWS CloudWatch, Azure Monitor, GCP Operations Suite). Automate log collection and enrichment using Beats, OpenTelemetry, and scripting. Security Use Cases & Threat Detection: Build and maintain SIEM use cases, alerts, and dashboards for threat detection. Map detection rules to frameworks like MITRE ATT&CK, STRIDE, and …
Pflugerville, Texas, United States Hybrid/Remote Options
Charles Schwab
on site in the specified location(s). This role is responsible for supporting and maintaining enterprise monitoring and telemetry platforms: Confluent Enterprise Platform (i.e., Kafka), ITRS Geneos, and the OpenTelemetry telemetry pipeline, as a member of the Enterprise Telemetry team. Activities include supporting Kafka producers and consumers, ITRS agent administration, OTEL pipeline management, troubleshooting and resolving issues, identifying opportunities for … include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline. Testing and deploying software upgrades. Managing and supporting telemetry agents. Support of OpenTelemetry collectors. Issue troubleshooting and resolution. What you have: Deep understanding of the Confluent Enterprise Platform components: Brokers, Topics, Partitions, Producers, Consumers, ZooKeeper, KRaft. Ability to set up and configure on-prem … Kafka components, replication factors, and partitioning. Experience engineering logging platforms. Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy, Grafana Cloud, and Datadog. Deep understanding of security protocols (SSL/TLS, SASL, LDAP, etc.) and role-based authentication. Experience working in telemetry monitoring (alerts, events, logs, metrics, and traces). Experience working in …
Austin, Texas, United States Hybrid/Remote Options
Charles Schwab
on site in the specified location(s). This role is responsible for supporting and maintaining enterprise monitoring and telemetry platforms: Confluent Enterprise Platform (i.e., Kafka), ITRS Geneos, and the OpenTelemetry telemetry pipeline, as a member of the Enterprise Telemetry team. Activities include supporting Kafka producers and consumers, ITRS agent administration, OTEL pipeline management, troubleshooting and resolving issues, identifying opportunities for … include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline. Testing and deploying software upgrades. Managing and supporting telemetry agents. Support of OpenTelemetry collectors. Issue troubleshooting and resolution. What you have: Deep understanding of the Confluent Enterprise Platform components: Brokers, Topics, Partitions, Producers, Consumers, ZooKeeper, KRaft. Ability to set up and configure on-prem … Kafka components, replication factors, and partitioning. Experience engineering logging platforms. Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy, Grafana Cloud, and Datadog. Deep understanding of security protocols (SSL/TLS, SASL, LDAP, etc.) and role-based authentication. Experience working in telemetry monitoring (alerts, events, logs, metrics, and traces). Experience working in …
Bellevue, Iowa, United States Hybrid/Remote Options
Charles Schwab
on site in the specified location(s). This role is responsible for supporting and maintaining enterprise monitoring and telemetry platforms: Confluent Enterprise Platform (i.e., Kafka), ITRS Geneos, and the OpenTelemetry telemetry pipeline, as a member of the Enterprise Telemetry team. Activities include supporting Kafka producers and consumers, ITRS agent administration, OTEL pipeline management, troubleshooting and resolving issues, identifying opportunities for … include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline. Testing and deploying software upgrades. Managing and supporting telemetry agents. Support of OpenTelemetry collectors. Issue troubleshooting and resolution. What you have: Deep understanding of the Confluent Enterprise Platform components: Brokers, Topics, Partitions, Producers, Consumers, ZooKeeper, KRaft. Ability to set up and configure on-prem … Kafka components, replication factors, and partitioning. Experience engineering logging platforms. Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy, Grafana Cloud, and Datadog. Deep understanding of security protocols (SSL/TLS, SASL, LDAP, etc.) and role-based authentication. Experience working in telemetry monitoring (alerts, events, logs, metrics, and traces). Experience working in …