monitoring of all critical components of our partners' Datacenter to ensure efficient operations and minimize downtime. Responsibilities & Experience Icinga/Nagios. Prometheus + Alertmanager. Grafana, Elasticsearch, Splunk (or similar tools like Zabbix, Graylog, Loki, etc.) Strong experience in managing automation tools such as Ansible, Puppet, Jenkins and Bamboo Experience with more »
Desired Skills & Experience: Bonus points if you have experience with: Experience creating deployment pipelines with ArgoCD Technical exposure to Containerization, Kubernetes, Helm, Kibana, Elasticsearch, Grafana Security vulnerability assessment and resolution The Benefits: Joining our team comes with a host of benefits, including: Flexible working arrangements (fully remote or flexible hybrid more »
focus on maintaining high availability, reliability, and scalability of production systems. Strong expertise in monitoring, logging, and alerting tools such as Prometheus, ELK stack, Grafana, Azure Monitor etc., with the ability to take ownership of the observability suite. Experience managing APM tools such as Dynatrace or New Relic, utilizing their more »
frameworks/standards Desired understanding of data streaming and messaging frameworks (Kafka, Spark, etc.) Desired understanding of distributed tracing and monitoring (Zipkin, OpenTracing, Prometheus, Grafana, ELK stack, Micrometer metrics, etc.) Desired understanding of containers (Docker, Kubernetes, Helm, etc.) Desired experience in automating deployment, releases and testing in continuous integration, continuous more »
user of the service and its inner workings. Experience managing AWS or GCP. Experience in building or integrating Monitoring Tools (Datadog/Kibana/Grafana/Prometheus). Experience using Terraform/Docker/Kubernetes. Write software using either Java/Scala/Python. The following are nice to more »
of cloud solutions (such as AWS, Azure, etc.), including open-source tools, DevOps, and automation capabilities to enhance cyber defense (such as Zabbix, ELK, Grafana, Netbox, Netmiko, Ansible, Alienvault, OpenVas, etc.) Professional-level knowledge in public clouds, such as AWS security services and architectures. Extensive knowledge of Private Clouds and more »
BGP, redistribution, summarization), IP (IPv4 & IPv6, NAT, ACLs, DNS, DHCP), Multicast protocols (PIM SM/DM/SSM), and Cisco IOS Experience with Grafana Dashboards and Databases. Experience with Mattermost collaboration platform. Experience with Dell Hardware and iDRACs. Preferred Qualifications You Might Also Have: CCNA or CISSP is highly more »
2+ years of experience in a similar position 2+ years of experience working with security CI/CD pipelines - Jenkins, Git, Azure Monitoring - Prometheus, Grafana Automation - Python, Shell, Bash Terraform Kubernetes If you are interested in this role then please apply directly below or get in touch with Meghan Dodd more »
Platform Engineering (Java preferred); extensive experience with AWS, Kubernetes, Terraform, CI/CD tools; strong observability experience, ideally with more modern approaches like Prometheus, Grafana, OpenTelemetry; comfortable with databases; exposure to Kafka would be ideal more »
required for this position. Key Requirements: Expert scripting skills with PowerShell, Bash, Python etc. In-depth knowledge of monitoring & observability tools such as Prometheus, Grafana and OpenTelemetry Strong knowledge of CI/CD tooling Experience with metrics and tracing instrumentation, such as LGTM and PromQL Knowledge of Windows & Linux platforms more »
or distributed tracing/monitoring, this will put you in a good position. Tech stack includes AWS, GCP, Azure, Kafka, Spark, Zipkin, OpenTracing, Prometheus, Grafana, ELK stack, Micrometer metrics, Docker, Kubernetes, Helm, automating deployment, releases, testing in CI, continuous delivery pipelines. more »
London, England, United Kingdom Hybrid / WFH Options
Hydrogen Group
platform issues effectively. · Track record of contributing to the delivery of capacity management strategy and roadmap components. · Familiarity with capacity management tools such as Grafana, Influx, TCO, or similar. · Proficiency in VMware, Linux, and Windows Server environments. Bonus if you have: · Background in numeracy or data science. · Practical experience in more »
Red Hat Certification) Kubernetes Ansible Puppet Network analysis, tcpdump, Wireshark Shell Scripting Python Secondary Skills: SaltStack Ansible Puppet Kubernetes Keycloak Apache Python Bash Prometheus Grafana Splunk Responsibility: System Administration: Install, configure, and maintain Linux operating systems on both physical and virtual machines. Shell Scripting: Develop, maintain, and enhance shell scripts more »
Nottingham, Nottinghamshire, East Midlands, United Kingdom
Microlise
infrastructure and applications, and to resolving error-prone manual processes through automation. Technologies you will be using include: PowerShell, Python, Ansible, ELK Stack, SolarWinds, Grafana, Prometheus, OpenTelemetry Do you have: TechOps experience, especially from an Infrastructure as Code approach Understanding of diverse monitoring requirements and tools Familiarity with development technologies more »
partners. · Previous production or application support experience, preferably with large-scale distributed systems. · Strong background in monitoring and logging of large-scale platforms (Prometheus, Grafana, Splunk, etc.) · Proficiency in handling incident management & problem management at an application support level. · Experience troubleshooting, analysing log files & resolving technical problems with Java-based more »
Description Leidos is seeking a Senior Kafka Site Reliability Engineer to be part of the mission solution and help lead SSA's Digital Modernization Strategy. Join one of our high performing teams responsible for building the next-generation enterprise APIs more »
like Cucumber for BDD, JMeter for performance testing. Must have a basic knowledge and understanding of tools like Jenkins, Deployments, Splunk/Kibana/Grafana, GitHub. Must have knowledge of microservice-based application development. Experience using Spring Framework, JUnit, GitHub, Microservices, Splunk and APIs Experience with Test driven more »
Cloud Technologies (AWS preferred) Experience with IaC Tools (Terraform preferred) Containerisation/Orchestration experience with Docker & Kubernetes Experience with Monitoring Tools e.g. Prometheus, ELK, Grafana Full right to work in the UK without visa sponsorship required at any point What's on Offer? Salary in the region of ~£75k Fully more »
Winchester, Hampshire, United Kingdom Hybrid / WFH Options
Context Recruitment
Reliability Engineer (SRE) position, dedicated to ensuring the high availability, reliability, and scalability of live systems. Proficient in observability tools like Prometheus, ELK stack, Grafana, and Azure Monitor, capable of fully managing the suite for optimal system oversight. Skilled in operating APM tools such as Dynatrace or New Relic, with more »
A scripting language, preferably PowerShell. A cloud platform, preferably Microsoft Azure. Experience with any of the following would be a bonus: Monitoring tools, preferably Grafana and Prometheus. Azure DevOps, especially CI/CD pipelines. A high-level language, preferably C#. Apache Solr. Scrum or Kanban. Skills & Traits We Like Effective more »
Stockport, England, United Kingdom Hybrid / WFH Options
Dematic
Configuration automation with Ansible · Terraform · Containerisation technologies such as Kubernetes and Docker · VMware · Good Network skills (Firewalls & Switches) · A driving licence. You can have: · Grafana and/or Prometheus · Windows Server Knowledge · A knowledge of ITIL processes, and ideally ITIL certified more »