environment (k8s, Docker) and tools (kubectl, Helm, kustomize, docker-compose). Proven experience in networking and security standards, protocols and best practices. Proven experience in logging systems (e.g. ELK Stack). Proven experience in monitoring systems (e.g. Prometheus). Proven experience in tracing systems (e.g. OpenTelemetry, Jaeger). Experience in performance optimization and resource management. Relevant certifications (AWS, Google). Understanding of Agile …
Virginia Beach, Virginia, United States Hybrid / WFH Options
HII
of database security concepts and proficiency in modern programming/scripting languages, including Python, Bash, YAML, JSON, and SQL. Experience with monitoring and logging solutions, such as ELK Stack, Splunk, Loki, Prometheus, Grafana, and other SIEM platforms. Certifications & Clearance: Must have a DoD 8570/8140 IAT Level II baseline certification (e.g., Security+ CE, CCNA-Security, CySA+, CND …
results; collaborate with DevOps to optimize build times, parallelize tests, and reduce pipeline flakiness. Result Analysis & Root Cause • Analyze test outputs, system logs, and metrics (e.g., via ELK Stack or Prometheus/Grafana) to pinpoint failures and performance regressions. • Lead root-cause investigations for infrastructure incidents, producing clear post-mortem reports and remediation recommendations. Defect Management • Log, triage …
Desirable skills include: cloud platforms and technologies (e.g., AWS Cloud Practitioner); software and infrastructure testing principles; test management tools (TestRail, X-Ray); reporting dashboards with Splunk or ELK Stack. Together, as owners, let’s turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork, respect and belonging. Here, you’ll reach your full …
web application technologies. Familiarity with network protocols and security best practices. Experience with database administration (e.g., MySQL, PostgreSQL, Oracle). Knowledge of monitoring and logging tools (e.g., Prometheus, ELK Stack). Advanced SAFe certifications such as SAFe DevOps Practitioner. Understanding of cybersecurity requirements for DoD systems. Experience with high-availability and disaster recovery strategies. Strong problem-solving skills and ability …
such as AWS, GCP, or Azure. Strong knowledge of infrastructure as code (IaC) tools like Terraform or Ansible. Familiarity with monitoring/logging tools (e.g., Prometheus, Grafana, ELK Stack). Good understanding of microservices architecture and RESTful APIs.
/Scrum environments and lead technical initiatives. Preferred Qualifications: Experience with microservices architecture and event-driven integrations. Familiarity with DevOps practices and API monitoring tools (Datadog, Splunk, ELK Stack, Prometheus). Certifications such as MuleSoft Certified Developer, Apigee Certification, or AWS/Azure Integration Expert. Experience integrating with ERP, CRM, or legacy enterprise systems.
Configuration Management Systems – Puppet/Ansible; Linux Administration – Red Hat family OS, including RHEL, Alma and some legacy CentOS; Core internet applications protocols – DHCP/DNS; Monitoring Systems – Icinga2/Elastic Stack/InfluxDB/Grafana; Application and network security best practices – SSH/iptables/TLS; AWS (EC2/VPS/RDS/EKS/S3); Terraform …
technologies. Continuous Innovation and Optimization: Identify opportunities for innovation in processes, tools, and technologies to maintain a competitive edge. Implement monitoring and observability solutions (e.g., Prometheus, Grafana, ELK Stack) to ensure system health and performance. Optimize cost, performance, and scalability of cloud-native solutions. What You Bring: Skills and Expertise. Core Requirements. Required Skills and Qualifications. Education: Bachelor …
London, South East, England, United Kingdom Hybrid / WFH Options
Michael Page Technology
The role of a Platform Support Engineer involves providing excellent technical support and maintenance for platform solutions within the technology and telecoms industry. You will ensure the smooth operation of systems, troubleshoot issues, and deliver high-quality service to internal …
Hounslow, London, United Kingdom Hybrid / WFH Options
Sky Group
You'll Bring: You will be skilled in C/C++, Python, and Linux. Ideally you'll also have experience with log management and analysis tools such as Elastic Stack (ELK), Splunk, and Grafana for data visualisation and monitoring. Proven expertise in at least one scripting language, such as Bash, Python, or Go. Ability to make …
tools (Snyk, Trivy, Checkov, SonarQube) into automated workflows. Manage authentication, access control, and secrets using Vault, AWS Secrets Manager, OAuth 2.0, and Zero Trust principles. Monitor environments with ELK Stack, Splunk, and Prometheus to ensure visibility, auditing, and compliance. Collaborate with engineering, operations, and security teams to promote DevSecOps best practices. Key Skills & Experience: Strong background in cloud platforms …
London (City of London), South East England, United Kingdom
Damia Group
tools (Snyk, Trivy, Checkov, SonarQube) into automated workflows. Manage authentication, access control, and secrets using Vault, AWS Secrets Manager, OAuth 2.0, and Zero Trust principles. Monitor environments with ELK Stack, Splunk, and Prometheus to ensure visibility, auditing, and compliance. Collaborate with engineering, operations, and security teams to promote DevSecOps best practices. Key Skills & Experience: Strong background in cloud platforms …
tracking tools such as Jira. Knowledge of HashiCorp Packer for AMI creation and HashiCorp Vault for secrets is desirable. Knowledge of queues (IBM MQ and RabbitMQ) and monitoring tools (Elastic Stack, AppDynamics) is preferable. Passion for and ability to work with software development teams releasing production-ready software daily. Willingness to take ownership, be held accountable and …
build, and operation of large-scale compute and storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack, from hardware and operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute … environment and several petabyte-scale storage systems. Install, configure, and monitor RHEL-based Linux environments. Troubleshoot hardware and software issues across the stack. Develop and maintain automation scripts and tools (Python, Ruby, or Bash). Collaborate with internal teams and external vendors to resolve complex issues and improve systems reliability. Required Qualifications: Up to 6 years of experience in … agile methodologies). Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible). Exposure to distributed storage systems and related protocols. Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana). Strong written and verbal communication skills. Demonstrated ability to learn quickly and adapt to evolving technologies. Ability to work effectively in a fast-paced, collaborative environment.
London (City of London), South East England, United Kingdom
Hunter Bond
build, and operation of large-scale compute and storage environments that power a global trading platform. The successful candidate will be involved in every layer of the technology stack, from hardware and operating systems to automation and observability, while gaining exposure to how a world-class investment firm manages its technology infrastructure. Key Responsibilities: Manage a distributed compute … environment and several petabyte-scale storage systems. Install, configure, and monitor RHEL-based Linux environments. Troubleshoot hardware and software issues across the stack. Develop and maintain automation scripts and tools (Python, Ruby, or Bash). Collaborate with internal teams and external vendors to resolve complex issues and improve systems reliability. Required Qualifications: Up to 6 years of experience in … agile methodologies). Familiarity with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible). Exposure to distributed storage systems and related protocols. Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana). Strong written and verbal communication skills. Demonstrated ability to learn quickly and adapt to evolving technologies. Ability to work effectively in a fast-paced, collaborative environment.