OpenShift Telemetry
Your responsibilities:
In this role, you will be:
* Primarily responsible for implementing, managing, and optimizing the observability stack within a Red Hat OpenShift Container Platform environment to ensure system health, performance, and security.
* Bridging the gap between application monitoring and infrastructure monitoring, leveraging a range of observability tools.
Your Profile
Essential skills/knowledge/experience:
* Design, implement, and maintain data pipelines to ingest and process OpenShift telemetry (metrics, logs, traces) at scale.
* Stream OpenShift telemetry via Kafka (producers, topics, schemas) and build resilient consumer services for transformation and enrichment.
* Engineer data models and routing for multi-tenant observability; ensure lineage, quality, and SLAs across the stream layer.
* Integrate processed telemetry into Splunk for visualization, dashboards, alerting, and analytics to achieve Observability Level 4 (proactive insights).
* Implement schema management (Avro/Protobuf), governance, and versioning for telemetry events.
* Build automated validation, replay, and backfill mechanisms for data reliability and recovery.
* Instrument services with OpenTelemetry; standardize tracing, metrics, and structured logging across platforms.
* Use LLMs to enhance observability capabilities (e.g., query assistance, anomaly summarization, runbook generation).
* Collaborate with platform, SRE, and application teams to integrate telemetry, alerts, and SLOs.
* Ensure security, compliance, and best practices for data pipelines and observability platforms.
* Document data flows, schemas, dashboards, and operational runbooks.
Desirable skills/knowledge/experience/Personal attributes:
* Hands-on experience building streaming data pipelines with Kafka (producers/consumers, schema registry, Kafka Connect/ksqlDB/Kafka Streams).
* Proficiency with OpenShift/Kubernetes telemetry (OpenTelemetry, Prometheus) and CLI tooling.
* Experience integrating telemetry into Splunk (HEC, UF, source types, CIM), building dashboards and alerting.
* Strong data engineering skills in Python (or similar) for ETL/ELT, enrichment, and validation.
* Knowledge of event schemas (Avro/Protobuf/JSON), contracts, and backward/forward compatibility.
* Familiarity with observability standards and practices; ability to drive toward Level 4 maturity (proactive monitoring, automated insights).
* Understanding of hybrid cloud and multi-cluster telemetry patterns.
* Security and compliance for data pipelines: secret management, RBAC, encryption in transit/at rest.
* Good problem-solving skills and ability to work in a collaborative team environment.
* Strong communication and documentation skills.
LA International is a HMG approved ICT Recruitment and Project Solutions Consultancy, operating globally from the largest single site in the UK as an IT Consultancy or as an Employment Business & Agency, depending upon the precise nature of the work. For security cleared jobs or non-clearance vacancies, LA International welcomes applications from all sections of the community and from people with diverse experience and backgrounds.
Award-winning LA International, winner of the Recruiter Awards for Excellence (Best IT Recruitment Company, Best Public Sector Recruitment Company and overall Gold Award winner), has now secured the most prestigious business award that any business can receive, The Queen's Award for Enterprise: International Trade, for the second consecutive period.