Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on experience with technical integrations (OpenTelemetry/Fluentd/Fluent Bit/Filebeat/Logstash). Hands-on experience with complex troubleshooting of Kubernetes and Docker containers. Good knowledge of Regex, Lucene, PromQL
with cloud platforms (AWS, GCP, Azure) and DevOps tooling. Familiarity with observability stacks like Grafana, Prometheus, Datadog, Splunk, Kibana, etc. Experience with technical integrations (OpenTelemetry, Fluentd, Fluent Bit, Filebeat, etc.). Skilled in troubleshooting Kubernetes and containerised environments. Strong communication skills: able to engage with technical teams and senior stakeholders. Comfortable working
related field. 5+ years of experience as a Site Reliability Engineer or equivalent in a similar role. Proficient in application and infrastructure observability, Splunk OpenTelemetry preferred. Experienced in production environments running in AWS. Comfortable with Infrastructure as Code (Terraform preferred). Comfortable with CI/CD pipelines such as GitHub
and performance. Experience in implementing observability, instrumenting applications to provide insights into system performance. Hands-on experience with tools such as Dynatrace, Prometheus and OpenTelemetry for monitoring, tracing, and real-time alerting is highly sought after. An understanding of microservices and container orchestration with the ability to optimise containerised applications
SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams. Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to
in Go (Golang), Java, Kotlin, JavaScript/Node.js, or Python. Strong hands-on experience with Kubernetes (K8s) and OpenShift. Experience with MongoDB, Kafka, Prometheus, OpenTelemetry, Grafana. Familiarity with tools like Helm, Kustomize, Terraform, and Vault. Proven experience with hybrid cloud environments (on-prem + public cloud). Ability to explain complex
skills and experiences are highly desirable: Experience with event-driven architecture and design patterns. Knowledge of the Kubernetes ecosystem, specifically AWS EKS. Proficiency with OpenTelemetry for observability. Previous experience mentoring and guiding junior team members. The Walt Disney Company is an Equal Opportunity Employer. We strive to be a diverse
databases (ideally Postgres, MongoDB). Experience of event streaming (Apache Kafka) would also be beneficial. Familiarity with observability platforms such as Grafana, Zabbix, Prometheus, OpenTelemetry/SigNoz. Experience of mobile telecoms principles and platforms would be advantageous but is not mandatory (such as EPC, DIAMETER/SS7 signalling, GTP and
skills with the ability to proactively engage with a wide range of stakeholders. In-depth experience with observability tools such as Grafana, Prometheus and OpenTelemetry. Strong knowledge of public cloud environments such as AWS and GCP, and Infrastructure as Code tools such as Terraform.
are working with here are migration to Dynatrace and you will be building Splunk pipelines. Design and implement monitoring pipelines using Splunk, Dynatrace, and OpenTelemetry (OTel). Automate the deployment of monitoring tools using Terraform, Ansible, and Jenkins. Manage configuration and version control with Bitbucket and Artifactory. Ensure seamless integration
transform the TechOps team. Participate in the operational management of OpenShift. Work with technologies such as Ansible, PowerShell, C#, SQL Server, Elastic, Grafana, Prometheus, OpenTelemetry, bare-metal builds, Hyper-V automation. What we are looking for: Experience in TechOps, especially with Infrastructure as Code. Familiarity with development technologies like C#
to drive our alerting, or coordinating across multiple teams to manage the response to an incident. Our technology stack: AWS (including ECS and RDS), OpenTelemetry, New Relic, Python, Postgres, Liquibase, Angular, Docker. Who you are: Four or more years of professional experience in a customer-facing technical support or engineering role. Excellent
owning the delivery of significant functionality, ideally having worked with peers of different levels to complete projects collaboratively. Our technology stack: Python (including FastAPI, OpenTelemetry, procrastinate, SQLAlchemy, Uvicorn), Postgres, MySQL, Liquibase, Retool, Docker, AWS. Who you are: Seven or more years of professional experience in software engineering. Proven experience leading the
you'll help influence and shape both the strategy and implementation of our evolving observability capabilities across the Birdie system; you'll leverage OpenTelemetry and SRE practices to support squads in proactively identifying issues before they impact customers; you'll play a vital role in building and maintaining our
VARIED DAY TO DAY RESPONSIBILITIES: Ensuring system reliability, performance, and scalability through monitoring and automation. Building and maintaining observability solutions using Grafana, Prometheus, Loki, OpenTelemetry. Proactively identifying and resolving performance bottlenecks and infrastructure issues. Automating infrastructure provisioning, configuration management, and deployments. Implementing effective logging, monitoring, and alerting strategies. Managing incident … solutions. Strong proficiency with observability and monitoring tools such as Grafana, Prometheus, and Loki. Strong experience with distributed tracing and telemetry tools such as OpenTelemetry. An understanding of cloud networking architecture and load balancing techniques. Experience with container orchestration platforms like Kubernetes. Proficiency in infrastructure as code (IaC) tools such