of container orchestration (e.g., Kubernetes).
• Experience with mobile application development (Android/iOS).
• Knowledge of C# or other backend languages.
• Familiarity with monitoring and observability tools (Grafana, Prometheus, etc.).
• Experience with AI-assisted development tools (e.g., Copilot, ChatGPT integrations).
Attributes & Behaviours
• Clear, professional communication with customers and colleagues.
• Strong problem-solving and troubleshooting ability.
• Commitment to …
City of London, London, United Kingdom Hybrid / WFH Options
AAA Global
full network stack.
Automation Advocate: Strong background in IaC with tools like Terraform, Ansible, or similar; adept at scripting in Python, Bash, and other languages.
Monitoring & Databases: Experience with Prometheus, Grafana, time-series databases (e.g., TimescaleDB, InfluxDB), and Kafka.
Security & Compliance: In-depth understanding of cloud security practices and threat management; vigilant in safeguarding high-stakes environments.
Crypto Enthusiast: Passionate …
language (Go, Python, JavaScript). Solid understanding and practical experience with Infrastructure as Code (IaC), CI/CD pipelines, and GitOps methodologies. Experience with monitoring and observability tools (e.g. Prometheus, Grafana, Datadog). Strong communication skills with a proven ability to collaborate with cross-functional teams (e.g. Data Scientists, Data Analysts, Product Managers, Product Engineers). Experience investigating and resolving …
CI/CD, and code reviews. Comfortable working with various technologies across the software and data engineering stack, including Airflow, Vertex AI, Kubernetes, Docker, GitHub Actions, Jenkins, Google Cloud Build, Prometheus, and Grafana. Solid experience in cloud data storage, with particular expertise in Google BigQuery (GBQ), GCS/S3. Demonstrable ability to produce high-quality engineering solutions free of technical debt …
City of London, London, United Kingdom Hybrid / WFH Options
Hargreaves Lansdown
a big plus. Capable of writing clean, maintainable and well-tested code. Comfortable working in on-prem and cloud-native environments with an interest in observability, using tools like Prometheus and Grafana to keep services healthy and maintainable. Familiarity with AWS services and how to integrate them into modern applications. A keen focus on quality and security, combining testing and …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Manchester Digital
by several microservices, also written in Python, utilising frameworks and libraries such as Celery, Eventlet, SQLAlchemy, etc. Additionally, GOV.UK Notify utilises AWS RDS (Postgres), AWS SQS, AWS ElastiCache, OpenTelemetry, Prometheus, Grafana and other related services. Concourse CI and Terraform are used to run build pipelines and manage our infrastructure. For the frontend, we follow the GOV.UK Design System, making use of …
line operations, and scripting. Experience in supporting real-time or mission-critical systems (security, IoT, or similar sectors). Familiarity with log aggregation, monitoring, and alerting tools (e.g., ELK, Prometheus, Grafana). Good understanding of networking, VPNs, load balancing, DNS, and firewalls. Comfortable with Git and CI/CD workflows. Excellent troubleshooting skills and structured problem-solving approach. Strong written … a plus. Experience with Kubernetes or OpenShift for container orchestration. Familiarity with CI/CD pipelines and automation tools (e.g., GitHub Actions, Jenkins). Exposure to monitoring tools like Prometheus, Grafana, or the ELK stack. Experience supporting enterprise customers in a B2B SaaS or software product company. Experience with access control and intrusion detection systems. Familiarity with virtualization technologies (e.g., VMware …
London, England, United Kingdom Hybrid / WFH Options
Cint
Kubernetes, Docker, Packer, Ansible and Jenkins. We support applications and services written in Golang, Python, Java, Scala and .NET. We monitor and alert on everything we deploy via Grafana, Prometheus, Graphite and ELK stacks. The team holds itself accountable to a high standard of build quality. We have recently completed the first major phase of a completely green-field infrastructure … GitHub Actions etc.)
• You have a grasp of “cloud native” and 12-Factor applications
• You have good knowledge of monitoring and alerting using one or more of: Graphite, StatsD, Prometheus, Grafana, PagerDuty
• You have expertise in at least one scripting or programming language (Python, Bash, Ruby, Node, Golang, Java)
Bonus Points If You Have
• You have good knowledge of the …
per day
End date: 31st March 2026
Active SC clearance
Onsite travel to Leeds/Newcastle/Manchester/Blackpool/Sheffield
AWS, Terraform, GitLab CI/CD, Prometheus, Grafana, Splunk
Gov experience
Cloud Engineering team. You will assist in building a scalable, cutting-edge, automated GCP platform.
Core skills: GCP, Terraform/Terramate, GKE/Kubernetes, CI/CD, Observability (Mimir, Grafana, Prometheus), Python.
Desirable: Airflow, PostgreSQL, Helm.
Please apply ASAP for more information.
organization.
What you’ll be doing:
• Building and maintaining a Kubernetes-hosted AI platform (AKS)
• Deploying and managing LLMOps tools such as LiteLLM, Langflow, and Langfuse
• Implementing observability with Prometheus, Grafana, and Loki
• Managing infrastructure through Terraform, ArgoCD, and GitHub Actions
• Supporting internal AI applications including RAG, document processing, and internal AI assistants
What you’ll need:
• 2–4 years …
with infrastructure automation and configuration management tools (Chef, Puppet, or Ansible)
• Exposure to distributed storage systems and related protocols
• Experience with observability and monitoring tools (Elasticsearch, Logstash, Kibana, Datadog, Prometheus, Grafana)
• Strong written and verbal communication skills
• Demonstrated ability to learn quickly and adapt to evolving technologies
• Ability to work effectively in a fast-paced, collaborative environment
jhayne@hunterbond.com
maintaining CI/CD pipelines
• Hands-on experience with infrastructure-as-code (e.g. Terraform)
• Deep understanding of security best practices in cloud and application delivery
• Exposure to observability tooling (Prometheus, Grafana, structured logging, etc.)
• Confident debugging and resolving issues in complex distributed systems
• Background in B2B SaaS web applications, with familiarity in Node a plus
• Able to operate autonomously within …
management
CI/CD: GitHub Actions or Azure DevOps pipelines with end-to-end automation
Event-Driven Architecture: Kafka or similar messaging systems
Monitoring & Observability: Azure Monitor, OpenTelemetry, Prometheus, etc.
Secure-by-Design Practices: Policy as Code, automated validation, compliance controls
Nice to Haves
Experience in regulated environments (banking, fintech, healthcare)
Background in Site Reliability Engineering or DevOps transformation