related field. 5+ years of experience as a Site Reliability Engineer or equivalent in a similar role. Proficient in application and infrastructure observability; Splunk OpenTelemetry preferred. Experienced in production environments running in AWS. Comfortable with Infrastructure as Code; Terraform is preferred. Comfortable with CI/CD pipelines such as GitHub
Prometheus, Logz.io, SignalFX, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on experience in technical integrations (OpenTelemetry/fluentd/fluentbit/filebeat/logstash). Hands-on experience with complex troubleshooting of Kubernetes and Docker containers. Good knowledge of Regex, Lucene, PromQL
Warwick, Warwickshire, United Kingdom Hybrid / WFH Options
ICEO
programming language (Python, GoLang, C++, or Java). Solid experience with Terraform for IaC. Hands-on skills with observability tools (Prometheus, Grafana, ELK stack, OpenTelemetry) and logging pipelines (Kibana, Elasticsearch). Expertise in Docker and container orchestration using Kubernetes (preferably on GCP) and Helm. Familiarity with CI/CD systems
Warwick, Warwickshire, United Kingdom Hybrid / WFH Options
ICEO
to implement redundancy and disaster recovery scenarios. Track record in scaling high-efficiency production systems. Proficiency with observability tools (e.g., Prometheus, Grafana, Grafana Mimir, OpenTelemetry). Strong written and spoken English (B2 level or higher). Nice to Have: Experience with Argo CD and Argo Rollouts. Familiarity with technologies such
in cloud-native environments at scale. Exposure to high-load, high-performance systems and large-scale microservices architectures. Experience with observability and monitoring frameworks (OpenTelemetry, Grafana, Prometheus). Knowledge of Graph Databases and AI integration in platform operations is a plus. Experience mentoring junior engineers and leading cross-functional initiatives.
Code (IaC): Proficiency with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation. Distributed Tracing: Experience with distributed tracing tools like Jaeger or OpenTelemetry for debugging microservices. Security: Strong knowledge of securing microservices, Kubernetes clusters, and cloud-based applications. Additional Information: We believe that coming together as a community
SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams.
best practices. Experience implementing and managing logging solutions (such as ELK stack). Proficiency with monitoring platforms (such as Prometheus). Familiarity with tracing technologies (including OpenTelemetry or Jaeger). Background in performance optimization and resource allocation. Industry certifications (cloud platforms preferred). Knowledge of Agile development practices. Capability to diagnose and address critical
and Kubernetes. Manage CI/CD pipelines using GitHub Actions and ensure smooth delivery to production. Own monitoring, alerting, and observability, using tools like OpenTelemetry and Dynatrace. Security & Compliance: Ensure systems are compliant with PCI DSS, PSD2, and SCA. Champion secure coding practices and data protection across services. Collaboration & Mentoring
infrastructure level. Experience with monitoring and logging tools like DataDog or Grafana's observability stack (Prometheus, Tempo, Loki, Grafana). Familiarity with the open standard OpenTelemetry. Excellent written and verbal communication skills; we're a collaborative team! PLEASE NOTE: Our engineering teams work fully remotely across Europe but we are focusing
on experience with containerization (Docker, Kubernetes). Strong security mindset with experience in compliance frameworks (SOC, PCI, GDPR). Familiarity with monitoring tools like OpenTelemetry, Instana, or LogicMonitor. Scripting experience (Ruby, Python, Bash) for automation and infrastructure management.
databases (ideally Postgres, MongoDB). Experience of event streaming (Apache Kafka) would also be beneficial. Familiarity with observability platforms such as Grafana, Zabbix, Prometheus, OpenTelemetry/SigNoz. Experience of mobile telecoms principles and platforms would be advantageous but is not mandatory (such as EPC, DIAMETER/SS7 signalling, GTP and
skills and experiences are highly desirable: Experience with event-driven architecture and design patterns. Knowledge of the Kubernetes ecosystem, specifically AWS EKS. Proficiency with OpenTelemetry for observability. Previous experience mentoring and guiding junior team members. The Walt Disney Company is an Equal Opportunity Employer. We strive to be a diverse
SNS, SQS, EventBridge). Knowledge of GraphQL, WebSockets, or real-time data streaming. Exposure to DevOps and observability practices (e.g., Prometheus, Datadog, AWS CloudWatch, OpenTelemetry). Prior experience in leading distributed engineering teams. Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to
Job ID: 42014 Location: Birmingham : 1 Trinity Park : Bi Position Category: Information Technology Position Type: Employee Regular LRQA is a global assurance provider with operations in more than 100 countries and a mission to delight our customers. We have a diverse
London, South East England, United Kingdom Hybrid / WFH Options
Lorien
on a hybrid basis at their offices in London. Essential Skills: Python, Pytest; common Python libraries such as Pandas/NumPy/Jupyter notebooks; OpenTelemetry; Git/GitHub; GitHub Actions; Docker; microservices and/or lambdas; PostgreSQL; Streamlit. Desirable skills include: the FastAPI ecosystem (Pydantic, SQLAlchemy, Alembic); AWS – including
and the technologies that revolve around them. Demonstrate significant experience with implementing code with Go, C++ or Python. Develop projects with technologies such as OpenTelemetry, CSI, CNI, CI/CD tooling, Load Balancing, Service Mesh frameworks. Work in a way that works for you: FlexBase, Akamai's Global Flexible Working
owning the delivery of significant functionality, ideally having worked with peers of different levels to complete projects collaboratively. Our technology stack: Python (including FastAPI, OpenTelemetry, procrastinate, SQLAlchemy, Uvicorn), Postgres, MySQL, Liquibase, Retool, Docker, AWS. Who you are: Seven or more years professional experience in software engineering. Proven experience leading the