of managing and orchestrating containerized applications at scale. Observability and Monitoring: Practical experience with observability tools and practices (e.g., Grafana, Prometheus, ELK Stack, Datadog, OpenTelemetry) to ensure system health and optimize performance via logging, metrics, and tracing. DevSecOps and Security: Strong expertise in DevSecOps practices, including automated security testing, policy …
Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash). Hands-on experience with complex troubleshooting of Kubernetes and Docker containers. Good knowledge of Regex, Lucene, PromQL …
Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash). Hands-on experience with complex troubleshooting of Kubernetes and Docker containers. Good knowledge of RegEx, Lucene, PromQL …
and performance. Experience in implementing observability, instrumenting applications to provide insights into system performance. Hands-on experience with tools such as Dynatrace, Prometheus and OpenTelemetry for monitoring, tracing, and real-time alerting is highly sought after. An understanding of microservices and container orchestration with the ability to optimise containerised applications …
SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams.
Terragrunt). Kubernetes expertise in container orchestration and cluster management. Network engineering skills including load balancers, CDN, Istio, and security patterns. Experience with observability platforms (OpenTelemetry) and distributed systems. Nice-to-have skills: Python programming and Linux system debugging. Database administration (SQL, MongoDB, Redis). Message broker and event streaming experience (Kafka …
and Kubernetes. Manage CI/CD pipelines using GitHub Actions and ensure smooth delivery to production. Own monitoring, alerting, and observability, using tools like OpenTelemetry and Dynatrace. Security & Compliance: Ensure systems are compliant with PCI DSS, PSD2, and SCA. Champion secure coding practices and data protection across services. Collaboration & Mentoring …
Sub sink to integrate with the rest of the system, emitting ComplianceDecision events, and land these in BigQuery for analytics. Light up observability – deploy OpenTelemetry traces and Grafana dashboards, and configure alerts for latency, failure %, and override rate.
skills and experiences are highly desirable: Experience with event-driven architecture and design patterns. Knowledge of the Kubernetes ecosystem, specifically AWS EKS. Proficiency with OpenTelemetry for observability. Previous experience mentoring and guiding junior team members. The Walt Disney Company is an Equal Opportunity Employer. We strive to be a diverse …
SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams. Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to …
owning the delivery of significant functionality, ideally having worked with peers of different levels to complete projects collaboratively. Our technology stack: Python (including FastAPI, OpenTelemetry, procrastinate, SQLAlchemy, Uvicorn), Postgres, MySQL, Liquibase, Retool, Docker, AWS. Who you are: Seven or more years' professional experience in software engineering. Proven experience leading the …
roles (e.g. Solutions Architect, Sales Engineering, Pre-Sales). Background in enterprise SaaS, especially in infrastructure monitoring, analytics, or APM. Hands-on expertise with OpenTelemetry, Kubernetes, and modern cloud-native observability stacks. Familiarity with streaming data and real-time metric processing. Experience working in Agile environments and across the full …
to drive our alerting, or coordinating across multiple teams to manage the response to an incident. Our technology stack: AWS (including ECS and RDS), OpenTelemetry, NewRelic, Python, Postgres, Liquibase, Angular, Docker. Who you are: Four or more years' professional experience in a customer-facing technical support or engineering role. Excellent …
you'll help influence and shape both the strategy and implementation of our evolving observability capabilities across the Birdie system; you'll leverage OpenTelemetry and SRE practices to support squads in proactively identifying issues before they impact customers; you'll play a vital role in building and maintaining our …