london (city of london), south east england, united kingdom
La Fosse
adoption of infrastructure-as-code and GitOps principles for consistent, automated delivery. Lead design forums and provide architectural governance across multiple projects. Develop cloud roadmaps covering network segmentation, identity, observability, and resilience. Embed security, compliance, and resilience into all architectural designs. Manage cost optimisation, including RI/SP planning and right-sizing. Mentor engineers and architects on AWS best practices …
london (city of london), south east england, united kingdom
Eden Scott
on experience in Microsoft Azure ML Studio. (LEAD) Experience using business intelligence tools, preferably Power BI. Experience applying Generative AI and prompting techniques. Strong understanding of data governance, model observability, and compliance frameworks. Proven ability to deliver secure, scalable, and responsible data science solutions. Excellent communication and presentation skills. Extensive experience working collaboratively with diverse colleagues and stakeholders. ✅ Nice to …
Collaborate with the Quantitative and Risk teams to productionize new models, ensuring best practices in testing and code review. Contribute to DevOps automation, including CI/CD, containerization, and observability, to ensure the 24x6 resilience of the platform. Experience: 3-10 years of experience building production systems in a front-office trading, hedge fund, or high-growth tech environment. Python …
london (city of london), south east england, united kingdom
Wave Talent
Your Impact: Be the technical lead for backend initiatives powering supply chain and real-time operations. Define engineering strategy and drive architecture decisions. Own and improve backend quality, performance, observability, and uptime. Collaborate directly with users in stores and warehouses to improve tooling that matters. Participate in planning and roadmap discussions as a core team leader. We have interview slots …
City of London, London, United Kingdom Hybrid / WFH Options
IVC Evidensia
the design and management of AWS cloud infrastructure using CDK. Guide the development and maintenance of CI/CD pipelines using AWS-native tools. Foster best practices in automation, observability, and platform-as-a-product thinking. Collaborate closely with engineering, security, and product leaders to align platform capabilities with business needs. Champion operational excellence using Datadog and AWS monitoring/…
City of London, London, United Kingdom Hybrid / WFH Options
Quantum Technology Solutions Inc
granular permission structures, RBAC, and least-privilege configurations across all resources. · Build and manage infrastructure using Terraform and Azure CLI, enabling consistency, traceability, and automated change control. · Implement strong observability and compliance frameworks (metrics, logging, tracing, and audits) to guarantee visibility, reliability, and adherence to high-regulation standards. · Support and automate development, staging/UAT, and production environments with robust …
other internal teams to fully understand client requirements and deliver tailored technical solutions. Design and implement scalable, future-proof architectures for new third-party connectors and integrations. Enhance system observability by improving diagnostics, logging, and tracing to aid technical support teams in resolving issues swiftly. Oversee the ongoing development and management of the public API, covering REST and event streaming …
london (city of london), south east england, united kingdom
Humanoid
audit-ready, supporting both speed and compliance. Vendor & Partner Management: Select, negotiate with, and manage key infrastructure and SaaS vendors; optimise cost and performance. Monitoring, Performance & Reliability: Implement monitoring, observability, and reporting across infrastructure and platforms to ensure uptime, resilience, and scalability. Incident Response: Act as escalation point for operational incidents, drive root cause remediation, and ensure lessons learned feed …
AWS (Core Services – EC2, RDS, S3, IAM, Lambda, CloudWatch); Infrastructure as Code: Terraform; Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm; Configuration Management: Ansible; CI/CD Pipelines: GitHub Actions; Monitoring & Observability: Grafana, Prometheus; Scripting/Automation: Python or Java. What We’re Looking For: Proven experience managing and scaling AWS cloud environments, ideally supporting live software products or high-traffic platforms. … Strong background in Terraform and Infrastructure as Code best practices. Practical experience with Kubernetes (EKS) in production. Familiarity with monitoring and observability tools such as Grafana and Prometheus. Hands-on experience building CI/CD pipelines (GitHub Actions, Jenkins, CircleCI, etc.). Solid scripting and automation experience using Python or Java. A collaborative engineer who enjoys working closely with …
DevOps, infrastructure, and platform engineering. Tech Stack: Cloud: AWS (EC2, RDS, S3, IAM, CloudWatch, Lambda); Infrastructure as Code: Terraform; Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm; Configuration Management: Ansible; Monitoring & Observability: Grafana, Prometheus; CI/CD: GitHub Actions; Automation & Scripting: Python, Bash, Go or Java. What We’re Looking For: Proven experience running AWS cloud infrastructure in a production or regulated … financial) environment. Hands-on experience managing Kubernetes clusters (preferably EKS). Strong understanding of Infrastructure as Code using Terraform. Familiarity with monitoring and observability stacks such as Prometheus and Grafana. Experience building and maintaining CI/CD pipelines (GitHub Actions or similar). Strong scripting or automation skills using Python, Bash, Go or Java. A collaborative mindset, comfortable working …
Job Title: DevOps Observability Engineer. Duration: long-term contract. Location: Hybrid (Sheffield or London). Visa: Only British/ILR/Dependent Visa (No Sponsorship Available, no PSW). We are looking for a hands-on DevOps Observability Engineer to design, implement, and lead enterprise observability solutions. You will drive DevOps adoption, build scalable telemetry systems, integrate with monitoring tools, and optimize … performance across cloud and on-prem infrastructure. Key Responsibilities: Design and implement observability solutions using OpenTelemetry across storage platforms. Develop and maintain CI/CD pipelines, distributed tracing, metrics, and logging. Integrate telemetry with tools like Prometheus, Grafana, Kafka, Splunk, Loki. Analyze telemetry data, optimize performance, and troubleshoot issues. Document setups, maintain standards, and support DevOps adoption across teams.
london (city of london), south east england, united kingdom
HCLTech
Modeling and Performance tuning. Should have experience in designing and developing dashboards. Strong knowledge of Hadoop, Kafka, SQL/NoSQL. Should have experience in creating a roadmap to improve platform observability. Experience in leading mid-scale teams with strong communication skills. Experience in Machine Learning and GCP would be an added advantage. Must have experience in the Banking or Insurance domain. Must have …
london (city of london), south east england, united kingdom
Eden Scott
Power Apps, and Dataverse. Experience tuning LLMs on Microsoft Azure ML Studio/Azure AI Foundry and applying Generative AI and prompt engineering techniques. Strong understanding of AI governance, observability, and compliance frameworks. Proven ability to deliver secure, scalable, and responsible AI solutions. Excellent communication and presentation skills. Extensive experience working collaboratively with diverse colleagues and stakeholders.
london (city of london), south east england, united kingdom
Selby Jennings
production-grade AI/ML applications, including LLMs and anomaly detection models. Familiarity with cloud infrastructure (AWS preferred), container orchestration (Kubernetes), and workflow tools (Airflow, Argo). Experience with observability tools (e.g., Grafana, CloudWatch) and RESTful API development.
City of London, London, United Kingdom Hybrid / WFH Options
Understanding Recruitment
What You’ll Do: Design and implement AI systems that automate customer workflows. Contribute to the Virtual Agent development platform. Integrate and optimise ML models in production. Drive reliability, observability, and performance across AI services. Collaborate with customers and internal teams to deliver scalable solutions. About You: Strong skills in Python and applied Machine Learning. Passion for solving real-world …
City of London, London, United Kingdom Hybrid / WFH Options
SR2 | Socially Responsible Recruitment | Certified B Corporation™
ideally Python, Rust is a bonus. Experience with distributed systems, REST APIs, and microservices. Knowledge of Kafka (or similar), PostgreSQL, and time-series data. Familiar with Docker, monitoring, and observability tools. ✅ Experience in a startup or scale-up, collaborating closely with engineers in a fast-moving environment. Bonus points if you’ve worked in energy markets, trading systems, industrial control …
City of London, London, United Kingdom Hybrid / WFH Options
Amber Labs
environments. Excellent communication skills and a strong interest in the application of AI in public services. Desirable: Experience with multi-agent orchestration (LangGraph, AutoGen, CrewAI). Familiarity with AI observability tools (TruLens, Helicone). Awareness of AI safety and reliability frameworks (Guardrails AI). Experience working in government or public sector digital projects.