OpenShift Telemetry Engineer

The Role

We are seeking a skilled OpenShift Telemetry Engineer to join our team. In this role, you will be responsible for implementing, managing, and optimizing the observability stack within a Red Hat OpenShift Container Platform environment to ensure system health, performance, and security.

You will act as a bridge between application monitoring and infrastructure observability, leveraging modern telemetry and data streaming tools.

Key Responsibilities
  • Design, implement, and maintain data pipelines to ingest and process OpenShift telemetry data (metrics, logs, and traces) at scale.
  • Stream OpenShift telemetry through Kafka (producers, topics, schemas) and build resilient consumer services for transformation and enrichment.
  • Engineer data models and routing mechanisms for multi-tenant observability while ensuring data lineage, quality, and SLA adherence across streaming layers.
  • Integrate processed telemetry into Splunk for dashboards, visualization, alerting, and analytics to achieve Observability Level 4 (proactive insights).
  • Implement schema management, governance, and versioning using Avro or Protobuf for telemetry events.
  • Build automated validation, replay, and backfill mechanisms to ensure data reliability and recovery.
  • Instrument services with OpenTelemetry, standardizing tracing, metrics, and structured logging across platforms.
  • Utilize LLM-based capabilities to enhance observability (e.g., query assistance, anomaly summarization, runbook generation).
  • Collaborate with Platform, SRE, and Application teams to integrate telemetry, alerts, and SLOs.
  • Ensure security, compliance, and best practices for telemetry data pipelines and observability platforms.
  • Document data flows, schemas, dashboards, and operational runbooks.
Required Skills & Experience
  • Hands-on experience building streaming data pipelines with Kafka (producers/consumers, schema registry, Kafka Connect, KSQL, Kafka Streams).
  • Strong experience with OpenShift / Kubernetes telemetry, including OpenTelemetry and Prometheus.
  • Experience integrating telemetry into Splunk (HEC, Universal Forwarder, source types, CIM) and building dashboards and alerts.
  • Strong data engineering skills using Python (or similar languages) for ETL/ELT, enrichment, and validation.
  • Experience with event schemas (Avro, Protobuf, JSON) and schema compatibility strategies.
  • Familiarity with observability frameworks and maturity models, and experience driving teams toward Level 4 observability (proactive monitoring and automated insights).
  • Understanding of hybrid cloud and multi-cluster telemetry architectures.
Preferred Skills
  • Security and compliance practices for data pipelines, including:
    • Secret management
    • RBAC
    • Encryption in transit and at rest
  • Strong problem-solving and analytical skills.
  • Ability to work effectively in cross-functional teams.
  • Excellent communication and documentation skills.

Job Details

Company: Avance Consulting
Location: London, UK
Employment Type: Full-time
Posted