Lambda or Azure Functions. DevOps and Automation: CI/CD: Jenkins, GitLab CI/CD, or CircleCI. IaC: Terraform or AWS CloudFormation. Monitoring: Prometheus, Grafana, or Datadog.
clusters in production, with an understanding of how containers interact with network and system resources. Monitoring & Logging: Knowledge of monitoring and logging tools (Prometheus, Grafana, ELK stack, or similar) as well as how to instrument applications. Programming Skills: Proficiency in at least one programming language (e.g., Golang, Python, Java, C
and implementation - Bachelor's degree in Computer Science, Engineering, related field, or equivalent experience - Experience in designing and implementing comprehensive monitoring solutions using Prometheus, Grafana, and other observability tools - Experience in managing and orchestrating containerized applications using Docker and Kubernetes - Experience in building and maintaining CI/CD pipelines using
cloud-based Kubernetes services (e.g., EKS). Knowledge of service mesh technologies (e.g., Istio, Linkerd). Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack). Familiarity with CI/CD pipelines and containerization best practices. Kubernetes certifications (e.g., CKA, CKAD). This role is office-based, with
hybrid cloud solutions using Azure Arc and hybrid connectivity strategies. Monitoring & Resilience: Implement observability using Azure Monitor, Log Analytics, App Insights, and Prometheus/Grafana. Design for high availability (HA), disaster recovery (DR), and business continuity (BCP). Conduct chaos engineering to test resilience and fault tolerance. Work closely
Cambridge, Landbeach, Cambridgeshire, United Kingdom
Polytec Personnel Ltd
as Code. Experience of using static code analysis tools, such as BlackDuck. Able to use and manage other monitoring tools, such as Nagios, SolarWinds, Grafana, Prometheus, etc. Experience of resolving complex issues using your debugging skills. Strong communication skills, including the ability to explain technical concepts to non-technical colleagues
cloud platform - Solid scripting and automation skills, using languages like Python, Bash, or PowerShell. - Experience with monitoring and logging tools (e.g., ELK Stack, Prometheus, Grafana) to ensure system reliability and performance. - Any Linux experience would be a bonus. Key Responsibilities: - Drive the strategy and implementation for DevOps and SRE practices.
for Infrastructure as Code (IaC) management. Hands-on experience with Kubernetes and Helm for container orchestration. Proficiency in observability tools such as Elastic Cloud, Grafana, and Prometheus. Experience in building and managing CI/CD pipelines. Solid knowledge of Linux systems and shell scripting. Proficiency in programming languages such
configuration and troubleshooting. Proficiency in using Puppet for configuration management, automation, and system provisioning. Hands-on experience with monitoring and observability platforms such as Grafana, Prometheus, Elasticsearch, and Jaeger. Experience with cloud architectures such as GCP or AWS. Familiarity with SQL databases and broker systems such as Kafka. You are a
and Kubernetes. It is our mission to build highly resilient, dynamically scaling, self-healing systems by automating and monitoring everything using Terraform, Puppet, Prometheus, Grafana, Kibana, and Jenkins. Requirements: Strong understanding of operating systems, networking, and systems architecture; Strong experience working with Linux, as well as database, web, and file
and GCP. Background knowledge and hands-on practice in Observability, specifically experience working with one or more of the following tools: Kibana, OpenSearch, Grafana, Datadog, Sumo Logic, New Relic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFx, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible
systems to proactively detect and address potential issues, ensuring optimal performance and reliability, in environments like on-prem Prometheus/Thanos, as well as Grafana Cloud and Loki. Database Management: Manage hundreds of on-prem PostgreSQL databases, including performance tuning, backups, disaster recovery strategies, and their active/passive counterparts
and GCP. Background knowledge and hands-on practice in Observability, specifically experience working with one or more of the following tools: Kibana, OpenSearch, Grafana, Datadog, Sumo Logic, New Relic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFx, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands-on
relational and non-relational databases, for example, RDS MySQL, Redshift. Familiarity with any of the following logging, monitoring, and alerting tools: CloudWatch, Splunk, StatusCake, Grafana, PagerDuty. Scripting experience with PowerShell, Ruby, Python. Experience working with Agile software development teams. What's in it for you? Join an ever-growing, market
Technical Expertise Observability and SRE Practices: In-depth understanding of observability and Site Reliability Engineering practices. Familiarity with tools in the LGTM stack (Loki, Grafana, Tempo, Mimir) or equivalent observability platforms. Containerisation: Strong experience building and managing containerised applications, effectively leveraging container orchestration platforms such as Kubernetes. Cloud Expertise: Demonstrable
have experience in setting up monitoring, logging, and alerting for improved system observability. Tech Stack: GitHub, Kubernetes, Docker, Ansible, Terraform, GitLab, Snyk, Vault, Prometheus, Grafana, Splunk. What’s in it for you: At Accenture, in addition to a competitive basic salary, you will also have an extensive benefits package which