Splunk and OpenShift Observability Engineer

We're looking for a Splunk & OpenShift Observability Engineer to design, deploy, and optimise enterprise-grade monitoring across hybrid Kubernetes and OpenShift environments.

This is a high-impact role where you'll shape observability strategy, enhance service intelligence, and ensure platform reliability at scale, balancing performance, cost efficiency, and security governance.

You'll work at the intersection of platform engineering, observability, and service intelligence, helping to transform raw telemetry into actionable insight. This is an opportunity to influence reliability strategy, improve operational maturity, and deliver measurable value across a modern cloud-native estate.

What You'll Be Doing

  • Design, deploy, and operate Splunk Enterprise and ITSI across hybrid Kubernetes/OpenShift platforms
  • Onboard and normalise data at scale (HEC, Universal Forwarder, Deployment Server), aligning to CIM standards
  • Build and optimise ITSI service models: service trees, KPIs, adaptive thresholds, notable event aggregation policies (NEAPs), glass tables, deep dives, and health scoring
  • Deliver OpenShift-focused executive and operational dashboards, including:
      • Cluster/API/etcd health
      • Node readiness and resource pressure
      • Pod restart trends and noisy-neighbour detection
      • Network and storage error visibility
      • Capacity, quota, and burst analysis
  • Optimise search and platform performance (workload rules, data model acceleration (DMA), summary indexing, scheduling hygiene, concurrency tuning)
  • Implement intelligent alerting and automated routing into ITSM and ChatOps platforms, including enrichment, suppression windows, and maintenance scheduling
  • Govern data ingestion and security controls (RBAC, retention, PII handling, TLS, token governance, index and role mapping)
  • Integrate telemetry pipelines including OpenTelemetry, Prometheus, Fluentd/Fluent Bit/Vector, Kafka, CMDB, and AIOps/ML solutions
  • Drive SLO/KPI alignment, golden signal monitoring, rollout/rollback health validation, and executive reporting

What You'll Bring

  • Deep expertise in Splunk Enterprise (SPL mastery, CIM alignment, saved searches, macros, KV stores, index/retention/RBAC design, performance tuning)
  • Strong experience with Splunk ITSI (service trees, KPIs, adaptive/time-based thresholds, NEAP tuning, Service Analyzer configuration)
  • Proven OpenShift/Kubernetes observability experience across control-plane metrics, events, logs, workload correlation, and capacity management
  • Hands-on experience with telemetry pipelines (OpenTelemetry/OTLP, Prometheus exporters, Fluentd/Fluent Bit/Vector, Kafka with TLS, HEC/UF/DS onboarding)
  • Strong understanding of reliability engineering principles (golden signals, SLO design, namespace/application KPI mapping)
  • Experience optimising performance and licensing costs using workload rules, DMA, and summary indexing
  • Solid security and compliance knowledge (TLS/mTLS, certificate/token hygiene, PII controls, auditability, role/index mapping)
  • Automation and integration expertise across ITSM, ChatOps, webhooks, CMDB enrichment, and AIOps tooling

Job Details

Company: CBSbutler Holdings Limited trading as CBSbutler
Location: Birmingham, West Midlands, United Kingdom
Employment Type: Contract
Salary: £400 - £490/day
Posted