london (city of london), south east england, united kingdom
Damia Group
automation, and container orchestration. You will be instrumental in shaping enterprise-ready cloud solutions by applying deep technical expertise in AWS alongside knowledge of multi-cloud environments, identity management, observability, and cost optimisation. Key Responsibilities Design and implement secure, scalable AWS cloud architectures Drive Infrastructure as Code (IaC) adoption using Terraform and CloudFormation Build, optimise, and automate CI/CD … GitHub Actions, and related tools Deploy and manage containerised solutions with Docker, Kubernetes, and Helm Implement strong security and access controls using IAM, Vault, and Secrets Manager Enhance platform observability using Prometheus, Grafana, and ELK Stack Collaborate with cross-functional teams to deliver robust, high-availability solutions Key Skills & Experience Extensive hands-on experience with AWS (Azure knowledge beneficial) Expertise … in Terraform, CloudFormation, and automation tooling Strong containerisation skills with Kubernetes, Docker, and related platforms Proven background in cloud security, IAM, and governance Solid understanding of monitoring and observability stacks Ability to influence architecture decisions and align solutions to best practices Desired Certifications AWS Certified Solutions Architect – Associate/Professional AWS Certified Security – Specialty HashiCorp Certified: Terraform Associate Kubernetes Certified
london (city of london), south east england, united kingdom
Creo Recruitment
hosted on AWS. Architect and optimise systems: Define service boundaries, data ownership, and failure-recovery patterns for scalable, high-availability systems. Raise engineering quality: Champion best practices for testing, observability, and security. Review critical PRs and guide technical decisions across the team. Operate and improve production systems: Monitor performance, reliability, and cost efficiency. Lead incident response and drive continuous improvement. … Django) Cloud: AWS (Lambda, ECS/Fargate, S3, DynamoDB, CloudWatch, API Gateway) Data & Messaging: PostgreSQL, Redis, Kafka or SQS CI/CD & Infrastructure: Docker, Terraform, GitHub Actions, CloudFormation Monitoring & Observability: Prometheus, Grafana, OpenTelemetry Testing: Pytest, integration and load testing frameworks Key Skills & Expertise Proven experience designing and delivering production systems using Python on AWS. Strong understanding of distributed systems … API design, and event-driven architectures. Deep knowledge of system observability, logging, and performance optimisation. Familiarity with modern security and data-privacy best practices. Excellent communicator who can document and articulate technical trade-offs clearly. Behaviours & Attributes Ownership: Takes full responsibility for systems from design to operation. Pragmatism: Balances long-term architecture with delivery velocity. Influence: Raises standards and mentors
and help shape how platform engineering is done as the team continues to scale. Tech stack AWS (Core services - EC2, RDS, S3, IAM, etc.) Configuration Management Ansible Monitoring and Observability Grafana, Prometheus Kubernetes (building and managing production clusters) Terraform (IaC provisioning) GitHub Actions (CI/CD pipelines) What They’re Looking For Experience in AWS cloud infrastructure (ideally in a … regulated or high-traffic environment) Previous experience working with Monitoring and Observability Tools Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar) A good communicator who enjoys working collaboratively across product and engineering The client is willing to take someone that doesn't
City of London, London, United Kingdom Hybrid / WFH Options
Harrington Starr
Manage and optimise key platforms such as Airflow, BigQuery, and PostgreSQL clusters. Developer Experience: Enhance internal developer productivity through Coder remote dev environments, GitLab CI/CD pipelines, and observability tooling. Collaboration: Partner closely with Data Engineering, Trading Technology, and Platform teams to deliver robust, scalable cloud solutions. Required Skills and Experience Experience: 2-4 years in a Cloud, Platform … and continuous integration concepts. Mindset: Pragmatic, customer-focused, and driven by efficiency and automation. Education: Minimum 2:1 degree in a STEM subject or equivalent experience. Desirable: Exposure to observability tooling (Grafana, Prometheus, Mimir). Interest in data platforms or AI-enabled development workflows. Learn More For more information, contact George Harris at Harrington Starr for a confidential conversation, or
City of London, London, United Kingdom Hybrid / WFH Options
Areti Group | B Corp™
Own CI/CD pipelines and Docker-based runtime on AWS; Infrastructure-as-Code via CDK/Terraform (CDKTF). Apply secure-by-design and TDD; instrument apps for observability and performance. Collaborate with product, platform, and security teams to meet operational and compliance requirements. The toolkit you’ll use Frontend: TypeScript, React.js, Vite, Material-UI, HTML5, CSS Backend … Docker, CI/CD. Building and consuming RESTful APIs; JSON schemas; integration testing. Comfortable in AWS and modern Infrastructure-as-Code approaches. Strong engineering fundamentals: code reviews, testing, observability, performance tuning. Security Clearance: Active SC or DV (must be current). Nice-to-haves Military background (RAF/Army/Navy) or delivery in defence, aerospace, or government
london (city of london), south east england, united kingdom
Xapien
code across the stack. Participating in architectural discussions and helping shape engineering best practices. Troubleshooting and resolving production issues across services and systems. Contributing to CI/CD pipelines, observability, and automation alongside platform engineers. Your Skills & Experience: Must-haves to be successful in this role: Strong experience writing backend services in Go. Proficiency in React and modern JavaScript/… and code styles. Nobody can do everything, but here are a few related things we’re interested in: Experience working lower in the stack, e.g., databases, infrastructure, Kubernetes, or observability tooling. Exposure to CI/CD tooling Interest in natural language processing, AI, or distributed systems. Here’s our promise to you: We are going to work with you – to
london (city of london), south east england, united kingdom
Saragossa
Help to improve the resilience, automation, and observability of production systems that power a mission-critical quant trading platform for a systematic hedge fund. This isn’t your typical ops role - they're looking for Engineers who can write code to eliminate toil, improve reliability and automate release, monitoring and recovery processes. You'll build and maintain automated tools in
City of London, London, United Kingdom Hybrid / WFH Options
Sharpe Search
and JavaScript/TypeScript (React.JS) being essential to drive their frontend and backend systems. You will be designing and delivering scalable, high-performance solutions from product requirements, ensuring robust observability through metrics and monitoring. You’ll work on event-driven architectures using CQRS, apply SOLID principles, and leverage Docker to build high-availability, high-throughput platforms. Experience with AWS services
complex data ecosystem Design flexible data ingestion and transformation pipelines for financial market data and trading systems Build and maintain AI/ML infrastructure, including model serving, evaluation, and observability frameworks Collaborate directly with clients to ensure the platform meets real-world enterprise requirements Contribute to both strategic technical direction and hands-on implementation as part of a small, high
london (city of london), south east england, united kingdom
Vyntra Global
the global economy. At Vyntra, we’re a fintech innovator combining cutting-edge AI with deep financial intelligence to build solutions that matter. From fraud detection and AML to transaction observability, our products empower the world’s leading financial institutions to see risk differently and act faster. We’re looking for a marketer who’s hungry to learn, eager to grow, and ready to
end Scale ingestion and indexing for 30+ blockchains, including high-throughput chains Operate a secure fleet of full nodes and indexers with clear SLAs and cost controls Set SLOs, observability, incident management, and make on-call boring Build and lead six-plus squads. Org design, hiring, mentoring, standards, and SDLC Partner with product, compliance, and customers to turn outcomes into
london (city of london), south east england, united kingdom
causaLens
business problems at scale. What you’ll bring: Expertise in the deployment of enterprise-grade AI solutions to cloud and on-premise customer environments with a focus on availability, observability and security. Proven track record with at least one of the major cloud providers and an understanding of DevOps best practices. Hands-on experience building production-grade solutions using LLMs
ensure accuracy and quality is obtained collaboratively with our third-party suppliers. Assess and manage risks associated with services and recurring problems. Work across the ecosystem to continuously improve observability capabilities such as reporting, dashboarding and alerting, which will drive robust, proactive problem management. Ensure these are communicated on a weekly basis and available for all of the team to
london (city of london), south east england, united kingdom
Humankind Global Recruitment
ll design and implement database services that can be consumed on demand: secure, compliant, and self-service. Working closely with Platform, SRE, and DevOps teams, you’ll bring automation, observability, and scalability to their database layer, enabling hundreds of developers to ship faster with confidence. What You’ll Do 💾 Design, build, and operate PostgreSQL and ElasticSearch clusters for production. ⚙️ Automate … provisioning, upgrades, and HA/DR with Terraform, Ansible, Helm, and Kubernetes Operators. 🌐 Embed databases into the Internal Developer Platform through APIs, GitOps workflows, and self-service tools. 📊 Implement observability with Prometheus, Grafana, and centralized logging. 🧠 Define and maintain SLOs for uptime and performance, embedding compliance and security controls. 🤝 Collaborate with development and platform teams to refine database automation standards … of Kubernetes and stateful workloads. ✅ Proficiency with Infrastructure as Code (Terraform, Ansible, Helm). ✅ Some development experience (Python, Go, or similar) for automation and API integration. ✅ Knowledge of observability tooling – Prometheus, Grafana, ELK, or Datadog. 🎁 Bonus: experience with ElasticSearch, MySQL, or SQL Server, plus exposure to AWS, GCP, or Azure. Why This Role ✨ Greenfield impact – build database-as
in London. Working alongside software and cybersecurity engineers, you’ll help design, build, and automate a hybrid multi-cloud estate across AWS and Azure, enhancing CI/CD pipelines, observability, and developer experience. You’ll take ownership of business-critical infrastructure, shaping cloud strategy end-to-end and collaborating with global teams across the US and Europe to drive efficiency … CI/CD pipelines through tools such as Azure DevOps, GitHub Actions, or Octopus. You’ll also be adept at automating workflows in Python or PowerShell and implementing modern observability solutions including DataDog, OpenSearch, and LogicMonitor. This is a rare opportunity to join a high-performing, global hedge fund where technology and engineering directly drive investment performance and operational scale.