experience with AWS cloud infrastructure Deep understanding of IaC tools: Terraform, Packer, CloudFormation Proven leadership in multidisciplinary delivery teams Skills in Databases: MongoDB/Atlas; Messaging: Kafka; Observability: Prometheus, Grafana, Splunk Experience working in a DevOps environment, favoring and implementing Continuous Integration & Deployment over manual processes Experience designing, implementing, securing, and supporting Unix/Linux-based platforms (ideally RHEL/ More ❯
or HashiCorp Nomad. Excellent problem-solving, communication, and collaboration skills. Nice to have: Experience managing distributed systems, microservices, and event-driven architectures. Knowledge of observability tools such as Prometheus, Grafana, ELK Stack, or Datadog. Experience with security best practices, monitoring, and incident response. Familiarity with DevSecOps and compliance frameworks (ISO 27001, SOC 2, GDPR). Exposure to big data processing More ❯
language), Bash/Shell, YAML including any Development frameworks Extensive experience and in-depth knowledge of the Linux operating system for effective troubleshooting activities Experience with Observability tools like Grafana, Prometheus, ELK, OCI Observability We highly value ownership and initiative with capabilities to drive projects independently Dealing with changes on a daily basis in a very dynamic work environment Good More ❯
Agile teams using tools like Git, Jira, and Confluence Eligible for SC and NPPV3 clearance Desirable: Container orchestration with Kubernetes HashiCorp tools: Vault, Consul, Packer Monitoring and observability with Grafana, Prometheus, or similar Familiarity with cloud networking, VPCs, NAT Gateways, security groups, etc. Personal Attributes: Proactive and self-driven with a passion for technology Strong problem-solving mindset Collaborative team More ❯
Washington, Washington DC, United States Hybrid / WFH Options
ClearanceJobs
Automation: Enhance DevSecOps pipelines, automate deployments, and improve system resilience through tools like GitLab CI/CD, Jenkins, and Kubernetes. • Incident Response & Monitoring: Implement and manage monitoring solutions (Prometheus, Grafana, ELK Stack), respond to incidents, and conduct post-mortems. • Networking & Security: Configure and maintain VPCs, VPNs, security groups, and firewalls in AWS GovCloud, ensuring compliance with FedRAMP requirements. • GOV Production More ❯
Flux) Knowledge of IaC and configuration management tools (Terraform, OpenTofu, Crossplane, Pulumi, Ansible, CloudFormation) Strong problem-solving experience, focusing on automation Production experience with Monitoring and Observability tools (Prometheus, Grafana, Datadog, Thanos, New Relic, OpenTelemetry) Understanding of Cloud Networking concepts (Mesh Networking, NAT, Load Balancers, SSL Certificates and TLS termination, API Gateways, proxies, etc.) Strong written and verbal communication More ❯
Create infrastructure as code (IaC) using Terraform. Collaborate with development teams to ensure applications are built to scale and deploy efficiently. Manage monitoring and alerting solutions using CloudWatch, Prometheus, Grafana, or other monitoring tools. Automate security processes and ensure AWS best practices are followed. Troubleshoot production issues and coordinate with teams for effective resolution. Optimize cloud environments for cost, performance More ❯
Guildford, Surrey, United Kingdom Hybrid / WFH Options
BAE Systems (New)
similar platforms. Operating Systems: Proficiency in Linux environments, including scripting in Bash and/or Python for automation and tooling. Open Source Technologies: Familiarity with tools like Kafka, Elasticsearch, Grafana, or Prometheus for logging, monitoring, and streaming use cases. Version Control: Proficient in using Git for source code management, including branching strategies and code reviews. Desirable Skills & Experience Candidates meeting More ❯
Azure DevOps, GitHub) Excellent CI/CD pipeline authoring. Automation is key after all. We use Azure DevOps. Experience in Observability and Reporting tools and their components (think Elasticsearch, Grafana, Prometheus, Thanos, Raygun) Know your way around Linux and Windows OS. If you come from a sysadmin background this is cool! Personal A positive and proactive attitude! We pride ourselves More ❯
and protocols - TCP/IP, DNS, HTTP Experience of deploying Continuous Integration solutions An awareness of security considerations in web application deployment Monitoring/Logging, e.g. ELK, Prometheus/Grafana etc. Strong AWS knowledge - EC2, EKS, RDS, Aurora, networking, cost management If you'd like to discuss this DevOps Engineer in more detail, please send your updated CV to and More ❯
IP, HTTP/S, DNS, VPNs). Expertise with container and orchestration technologies, including Docker and Kubernetes. Hands-on experience with Helm for packaging, deploying, and managing Kubernetes applications. Experience with monitoring and logging solutions like Prometheus, Grafana, ELK Stack, or similar. Knowledge of security best practices in DevOps and cloud environments. Terraform, Ansible or Chef experience is preferred. Nice to haves: knowledge of Concourse, Nexus, SonarQube, various More ❯
Columbia, Maryland, United States Hybrid / WFH Options
Codescratch LLC
development tool suites. Preferred Skills and Experience: Experience with Docker and Kubernetes Experience with Hadoop Experience with Spark Experience with Accumulo Experience monitoring application performance with metrics (Prometheus, InfluxDB, Grafana) and logs with ELK Stack (Elasticsearch, Logstash, Kibana) Experience with asynchronous messaging systems (RabbitMQ, Apache Kafka, etc.) Location: Columbia Annex, MD (60%+ telework) Salary Range: $115,000 - $200,000.00 More ❯
standard software development tool suites. Preferred Skills and Experience: Experience with Docker and Kubernetes Experience with Virtual Machines Experience with Networking Experience monitoring application performance with metrics (Prometheus, InfluxDB, Grafana) and logs with ELK Stack (Elasticsearch, Logstash, Kibana) Have, or obtain, Security+ certification or equivalent DoD 8570 IAT II certification Location: Fort Eisenhower, GA (approx. 50% hybrid telework) Salary Range More ❯
testable systems by design Nice-to-Haves: Exposure to regulated environments (e.g., BFSI, healthcare, public sector) Experience with performance, security, or chaos testing Familiarity with observability tooling (e.g., Prometheus, Grafana, OpenTelemetry) Knowledge of contract testing, mocking, or service virtualization Mindset & Cultural Fit A builder's mindset, focused on enabling early, frequent, and safe delivery through automated confidence A belief that More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
Skills and Experience: Proven experience in cloud infrastructure automation. Deep knowledge of cloud platforms (AWS, Azure, GCP), containerization (Docker, Kubernetes, Rancher), automation tools (Terraform, Ansible), and monitoring solutions (Prometheus, Grafana). Strong scripting and programming skills in Bash, Python, and Go. Experience in DevOps, SRE, or Platform Engineering roles with a focus on hybrid infrastructure. Familiarity with Agile methodologies and More ❯
languages, such as C#, Python, Perl, Java, C++ CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, Bash Observability/Monitoring: Prometheus, Grafana, Splunk Containerisation tools such as Docker, K8S, OpenShift, EC, containers Hosting technologies such as IIS, nginx, Apache, App Service, LightSail Analytical and creative approach to problem solving We encourage you More ❯
Linux internals, and security best practices. • Deep understanding of CI/CD tools and practices (GitHub Actions, Jenkins, ArgoCD, etc.). • Strong observability mindset, with experience using tools like Prometheus, Grafana, Loki, etc. • Experience with hybrid service meshes, multi-cluster Kubernetes, or edge computing, preferred. • Knowledge of Kafka, Redis, Elasticsearch, or RDBMS (MySQL/Postgres), preferred. As a global leader in More ❯
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
The Granite Group
Hands-on experience with GitOps tools (e.g., ArgoCD, Flux). CI/CD - Skilled in building and managing pipelines using Azure DevOps, GitHub Actions, etc. Monitoring - Experience with Prometheus, Grafana, and other observability tools. Application Stack - Familiarity with .NET, Node.js, React, and web server technologies like Nginx. Relevant certifications or the ability to demonstrate equivalent experience, such as: Terraform Associate More ❯
expertise (any cloud provider is fine: GCP, AWS or Azure) Knowledge of GitLab CI/CD, Terraform, Ansible Experience in Kubernetes, Docker, SRE and IaC principles Monitoring with Prometheus, Grafana, etc. Any scripting experience will be a bonus What's in it for me? Competitive salary circa £85k + bonus Hybrid working (1 day onsite per month) Discretionary bonus scheme More ❯
knowledge of CI/CD processes and tools Good understanding of Linux systems (iptables, routing tables) and network configurations (VPCs, SGs) Exposure to monitoring and observability tools (e.g. CloudWatch, Grafana, Prometheus) An AWS certification (Architect Associate or similar) is expected 🧩 Nice to have, but not essential: AWS Architect Professional or additional AWS specialisations Related Microsoft, Linux, or Networking certifications Degree More ❯
languages, such as C#, Python, Perl, Java, C++ CI/CD tools such as Azure DevOps, GitHub Actions, GitLab, Jenkins, TeamCity Scripting languages such as PowerShell, Bash Observability/Monitoring: Prometheus, Grafana, Splunk Containerisation tools such as Docker, K8S, OpenShift, EC, containers Analytical and creative approach to problem solving We encourage you to apply, even if you don't meet all of More ❯
Birmingham, West Midlands, England, United Kingdom Hybrid / WFH Options
Bullion By Post
Linux-based infrastructure supporting a large-scale e-commerce platform Build and maintain deployment pipelines and infrastructure as code using Ansible Monitor performance and system health using Prometheus and Grafana Strengthen security, backups, and compliance Lead incident response, root cause analysis, and post-mortems Collaborate with development teams on CI/CD workflows and scalable architecture Document internal systems and More ❯
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
to: Modernise our infrastructure by leading the migration from Docker Swarm to Kubernetes Design and operate CI/CD pipelines using CloudBees and GitLab Build out observability with Prometheus, Grafana, OpenTelemetry, and Dynatrace Automate cloud deployments (AWS-first) using Terraform and platform tooling Improve security posture across IAM, secrets, and networking Help the team ship faster and safer by mentoring More ❯