concepts. Mindset: Pragmatic, customer-focused, and driven by efficiency and automation. Education: Minimum 2:1 degree in a STEM subject or equivalent experience. Desirable: Exposure to observability tooling (Grafana, Prometheus, Mimir). Interest in data platforms or AI-enabled development workflows. Learn More For more information, contact George Harris at Harrington Starr for a confidential conversation, or click “Apply” to More ❯
GCP, or Azure). Experience with relational databases and data processing and query engines (Spark, Trino, or similar). Familiarity with monitoring, observability, and alerting systems for production ML (Prometheus, Grafana, Datadog, or equivalent). Understanding of ML concepts. You don't need to train models, but you should speak the language of Research Engineers and understand their constraints. A More ❯
/Lambda or Cloud Run/GKE), containerized with Docker. Own CI/CD (GitHub Actions), IaC (Terraform), logging/metrics/tracing (OpenTelemetry, CloudWatch/Stackdriver, Grafana/Prometheus), and SLOs. Optimize p95 latency, throughput, and cost; manage secrets, networking, VPCs, and build resilient retries/backoffs. 15% Collaborate Work closely with design/PM on specs and More ❯
Cambridge, England, United Kingdom Hybrid / WFH Options
RegGenome
willingness to learn. Hands-on experience with Kubernetes and Terraform/Terragrunt/OpenTofu. Strong cloud infrastructure knowledge in either AWS or GCP. Nice to Have: Monitoring stack tools: Prometheus, Thanos, Loki, Alertmanager, Grafana. CI/CD experience with FluxCD (or ArgoCD). Database performance optimization and management experience. Qualities We Value: Solution-oriented mindset with a knack for solving More ❯
Lambda, ECS/EKS, API Gateway, SQS/SNS). Experience integrating application services with CI/CD pipelines and contributing to application monitoring/observability using tools like Prometheus, Grafana, or Datadog. Experience with containerization (Docker) and a solid understanding of how application code runs within a Kubernetes or serverless environment. Leadership & Collaboration Strong mentoring abilities and passion for More ❯
problem-solving skills. Experience with FinOps or cloud cost management tools. Strong understanding of ITIL processes, security compliance, and incident management. Proficiency in monitoring, alerting, and incident management tools (Prometheus, Grafana, PagerDuty). Solid understanding of networking, distributed systems, and performance tuning. Preferred Qualifications Experience in a high-scale, high-availability SaaS environment. Familiarity with security and compliance in cloud More ❯
Database technologies such as Oracle SQL, Mongo, Postgres Know your way around Linux and Windows command lines, e.g. Bash and PowerShell Monitoring large systems using technologies such as Grafana, Prometheus, ELK, Splunk Experience of working in Agile teams, and the tooling that supports it, e.g. Atlassian Diagnosing and troubleshooting application issues resulting in service outages Troubleshooting skills across different levels More ❯
etc.) ⚙️ Strong IaC skills with Terraform and CI/CD pipelines 🐳 Kubernetes operations expertise on AWS (EKS) 🔒 Solid grounding in Linux, networking, and cloud security 📊 Familiarity with observability stacks (Prometheus, Grafana, Loki) If you’re ready to shape the infrastructure behind cutting-edge AI used by global enterprises, we’d love to hear from you. More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Oliver Bernard
Services Expert knowledge of Containerisation with Docker and Kubernetes Strong Infrastructure as Code experience with Terraform History of working across CI/CD pipelines Monitoring and Observability experience with Prometheus, Grafana, and/or DataDog Prior experience overseeing Change and Incident Management processes Previous work in an Architectural capacity is also a massive bonus This position is open to Lead More ❯
or Databricks for data processing. Familiarity with AWS services (ECS, EKS, S3, Lambda, etc.). Basic scripting in Python for automation or data manipulation. Secondary Skills Experience with Datadog, Prometheus, or other monitoring tools. Exposure to CI/CD pipelines and DevOps practices. Knowledge of data engineering best practices and real-time analytics. More ❯
Key Details: Salary: £100k–£180k (flexible for strong profiles) + equity Working Model: On-site, London Tech Stack: AWS/GCP/Azure, Kubernetes, Docker, Terraform, Python, MLflow/Prometheus/Grafana If you want to shape the backbone of one of Europe’s most ambitious AI startups, we’d love to hear from you. More ❯
Terraform or equivalent IaC exposure Python or Go for automation and tooling Strong Linux fundamentals and troubleshooting skills Nice to Have Exposure to LLM infrastructure or AI-driven services Prometheus & Grafana This is a hybrid role with 2 days per week in the Belfast office. If you'd like to apply, send in your CV or contact Adam Whitehurst at More ❯
of Terraform and Infrastructure as Code principles Hands-on experience with CI/CD systems and release automation Comfortable working with Python, Bash, or Go Monitoring and alerting tools (Prometheus, CloudWatch, Grafana, OpenTelemetry, etc.) Passionate about developer tooling, DevOps culture, and improving engineering workflows Any experience in Fintech or with public APIs would be a bonus Sound Interesting? Feel free More ❯
control (802.1x, RADIUS), or zero-trust security concepts. Exposure to infrastructure-as-code (Terraform, Ansible) and version control systems (Git). Experience with monitoring and observability tools (LogicMonitor, Grafana, Prometheus). Knowledge of hybrid cloud networking, including AWS Direct Connect or GCP Interconnect. Relevant certifications such as CCNP, AWS Advanced Networking Specialty, or Google Cloud Network Engineer. More ❯
Hands-on experience in technical integrations and POCs Comfortable coding in any high-level programming language (Java, Go, Python) Strong hands-on knowledge of Kubernetes, AWS, Azure, GCP, Docker, Prometheus, and OpenTelemetry Industry knowledge and opinions on Monitoring, Observability, Log Management, SIEM Engineering/DevOps Background - advantage Experience in Technical Sales of Log Analytics/Monitoring/APM/SIEM More ❯