Data Engineer (Kafka)
Sheffield (3 days on-site per week)
Candidates must be eligible for BPSS clearance
We're looking for a highly skilled Senior Data & Observability Engineer to design, build, and evolve large-scale telemetry and observability capabilities across modern cloud platforms. This role is ideal for someone who thrives in high-volume, real-time data environments and enjoys solving complex engineering challenges.
What You'll Do
- Design, implement, and maintain scalable data pipelines to ingest and process telemetry (metrics, logs, traces) from OpenShift and Kubernetes platforms.
- Stream telemetry via Kafka: manage producers, topics, and schemas, and build resilient consumer services for transformation and enrichment (see the consumer sketch after this list).
- Engineer multi-tenant observability data models and routing strategies, ensuring data lineage, quality, and SLAs.
- Integrate processed telemetry into Splunk, enabling dashboards, alerting, analytics, and proactive insights (Level 4 Observability maturity).
- Implement schema governance, versioning, and compatibility controls using Avro/Protobuf.
- Build automated validation, replay, and backfill tools to maintain data reliability and support recovery.
- Instrument services with OpenTelemetry and drive consistency in tracing, metrics, and structured logging (see the OpenTelemetry sketch after this list).
- Use LLMs to enhance observability workflows, such as anomaly summarisation, query assistance, and automated runbook creation.
- Work closely with platform, SRE, and application teams to integrate telemetry, alerts, and SLOs.
- Ensure security, compliance, RBAC, and best practices across pipelines and tooling.
- Produce clear documentation covering data flows, schemas, dashboards, and operational processes.
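For illustration only (not part of the advert), here is a minimal sketch of the kind of resilient Kafka consumer service the streaming bullet describes, assuming the confluent-kafka client; the broker address, topic names, group id, and enrichment step are all hypothetical:

```python
# Minimal telemetry enrichment consumer: reads raw events, tags them with a
# tenant, and republishes to an enriched topic. All names are illustrative.
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "kafka:9092",    # assumed broker address
    "group.id": "telemetry-enrichment",   # hypothetical consumer group
    "enable.auto.commit": False,          # commit only after a successful produce
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "kafka:9092"})

consumer.subscribe(["platform.telemetry.raw"])  # hypothetical source topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            continue  # a production service would route these to a dead-letter topic
        event = json.loads(msg.value())
        # Illustrative enrichment: derive the tenant from the record key.
        event["tenant"] = msg.key().decode() if msg.key() else "unknown"
        producer.produce("platform.telemetry.enriched", json.dumps(event).encode())
        producer.poll(0)               # serve delivery callbacks
        consumer.commit(message=msg)   # at-least-once: commit after the produce is queued
finally:
    consumer.close()
    producer.flush()
```

Committing offsets only after the enriched record is queued gives at-least-once delivery, the usual trade-off for telemetry pipelines that must also support replay and backfill.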
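Likewise, a minimal OpenTelemetry instrumentation sketch, assuming the opentelemetry-sdk package; the service name, span name, and attribute are illustrative assumptions, not taken from the role:

```python
# Set up a tracer provider and wrap a unit of work in a span with
# structured attributes, so traces and logs share consistent context.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Name the service so spans are attributable across teams.
provider = TracerProvider(resource=Resource.create({"service.name": "telemetry-enricher"}))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))  # swap for an OTLP exporter in practice
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

def enrich(event: dict) -> dict:
    # Spans carry structured attributes alongside the work they measure.
    with tracer.start_as_current_span("enrich") as span:
        span.set_attribute("telemetry.tenant", event.get("tenant", "unknown"))
        event["enriched"] = True
        return event

enrich({"tenant": "team-a"})
```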
What You'll Bring
- Strong hands-on experience with Kafka (producers/consumers, schema registry, Kafka Connect, KSQL/KStreams).
- Proficiency with OpenShift / Kubernetes telemetry, including OpenTelemetry and Prometheus.
- Experience integrating telemetry into Splunk (HEC, UF, sourcetypes, CIM) and building dashboards and alerts (see the HEC sketch after this list).
- Solid data engineering skills in Python (or similar) for ETL/ELT, enrichment, and validation.
- Understanding of schema formats (Avro, Protobuf, JSON) and compatibility strategies.
- Knowledge of observability standards and ability to drive toward proactive, automated monitoring capabilities.
- Understanding of telemetry in hybrid cloud and multi-cluster environments.
- Strong security awareness: secrets management, RBAC, and encryption in transit and at rest.
- Great communication skills and the ability to work effectively with cross-functional teams.
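As a rough sketch of the Splunk HEC integration mentioned above, assuming the requests library; the endpoint URL, token, index, and sourcetype are placeholders, not details from the role:

```python
# Post a single JSON event to Splunk's HTTP Event Collector.
import json
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # assumed endpoint
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"                    # placeholder token

def send_to_hec(event: dict) -> None:
    payload = {
        "event": event,
        "sourcetype": "kube:telemetry",  # hypothetical sourcetype
        "index": "observability",        # hypothetical index
    }
    resp = requests.post(
        HEC_URL,
        headers={"Authorization": f"Splunk {HEC_TOKEN}"},
        data=json.dumps(payload),
        timeout=5,
    )
    resp.raise_for_status()

send_to_hec({"message": "pod restarted", "namespace": "payments"})
```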
Security Requirement
Applicants must be eligible for BPSS clearance (Baseline Personnel Security Standard).
What You Need to Do Now
If you're interested in this role, click 'apply now' to forward an up-to-date copy of your CV, or call us now. If this job isn't quite right for you, but you are looking for a new position, please contact us for a confidential discussion about your career.
Hays Specialist Recruitment Limited acts as an employment agency for permanent recruitment and as an employment business for the supply of temporary workers. By applying for this job you accept the T&Cs, Privacy Policy and Disclaimers, which can be found at hays.co.uk