maintain scalable and highly available systems using load balancing, auto-scaling, canary releases, and blue-green deployments. Develop and maintain monitoring and logging dashboards with tools like New Relic, Prometheus, Grafana, and Datadog, ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance requirements. Monitor, optimize, and … scalability, high availability patterns, and DevOps metrics such as DORA. Knowledge of SLM metrics (SLAs, SLOs, SLIs) and their application. Experience with monitoring and observability tools like New Relic, Prometheus, Grafana, and Datadog. Experience working with Kafka and improving performance in event-driven, real-time data architectures. Familiarity with cloud providers like AWS, Azure, or GCP. Experience with CI/…
scripting languages (e.g., Python, Bash) to automate repetitive tasks and knowledge of configuration management tools (e.g., Ansible, Puppet, Chef). Expertise in setting up and maintaining monitoring systems (e.g., Prometheus, Grafana). Some other highly valued skills may include: Experience with cloud platforms (e.g., AWS, Azure, Google Cloud). Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes). Ability …
Chesterfield, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Hands-on with Docker (Kubernetes is a plus), infrastructure-as-code, and CI/CD tooling. Strong scripting and automation experience in Python and Bash. Familiarity with observability stacks (Prometheus, OpenTelemetry, eBPF). Cloud infrastructure experience (AWS/GCP/Azure), with attention to IAM and software supply chain security. Curious, persistent, and comfortable experimenting at the lowest levels of the …
should have experience with: Messaging/Streaming products: Kafka, IBM MQ, IBM IIB/ACE. DevOps tools: Ansible, Chef, Kubernetes, GitLab, Jenkins. SRE logging & monitoring tools: ELK stack, Grafana, Prometheus, OpenTelemetry. Programming languages: Java, Python. Establishing SLOs and SLIs and ensuring adherence to them, promoting a culture of SRE practices to continuously measure, improve & respond to incidents for conducting …
Northampton, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Bonus if you have: Previous experience on a cloud migration project (AWS to Azure). Familiarity with Azure DevOps and IaC tools like ARM/Bicep. Monitoring/logging tools experience (Prometheus, Grafana, ELK, etc.).
Nottingham, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Bonus if you have: Previous experience on a cloud migration project (AWS to Azure). Familiarity with Azure DevOps and IaC tools like ARM/Bicep. Monitoring/logging tools experience (Prometheus, Grafana, ELK, etc.).
Lincoln, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Bonus if you have: Previous experience on a cloud migration project (AWS to Azure). Familiarity with Azure DevOps and IaC tools like ARM/Bicep. Monitoring/logging tools experience (Prometheus, Grafana, ELK, etc.).
Nottingham, Nottinghamshire, United Kingdom Hybrid / WFH Options
Nexgencloud
documentation. Nice to Have: Programming & Scripting: Basic Bash scripting, Python, or Golang knowledge. Familiarity with TypeScript (Next.js, Tailwind frameworks). Tool Experience: Knowledge of monitoring tools and ELK stack (Prometheus, Elasticsearch). Experience with Nova hypervisor, Postman, Rundeck, or Netbox. Industry Knowledge: Exposure to virtualization technologies and their impact on hardware performance. What We Offer: A competitive salary and comprehensive …
for building and managing complex data flows and integration processes. Familiarity with containerization and orchestration tools such as Docker and Kubernetes. Experience with monitoring and alerting tools such as Prometheus, Grafana, or ELK for data infrastructure. Knowledge of security practices for handling sensitive data, including encryption, anonymization, and access control. Familiarity with data governance, data quality management, and compliance standards …