platform team covering ecommerce, customer and other key business capabilities including line management of 1-2 internal engineers and a QA Manager. Promote consistent use of tooling, automation, and observability to increase reliability and reduce manual overhead. Foster strong relationships with engineers, product managers, architects, and other technical leaders to encourage shared ownership and technical excellence. Guide prioritisation of tech More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Hlx Technology
neuroscience, and clinical datasets Build a unified feature store to serve ML training and downstream biological analysis Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data Scale distributed systems using Kubernetes, Terraform, and orchestration tools such More ❯
london, south east england, united kingdom Hybrid / WFH Options
Hlx Technology
neuroscience, and clinical datasets Build a unified feature store to serve ML training and downstream biological analysis Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data Scale distributed systems using Kubernetes, Terraform, and orchestration tools such More ❯
london (city of london), south east england, united kingdom Hybrid / WFH Options
Hlx Technology
neuroscience, and clinical datasets Build a unified feature store to serve ML training and downstream biological analysis Develop scalable storage, ingestion, and validation systems with a focus on robustness, observability, and versioning Collaborate with ML researchers and biologists to translate raw data into actionable insights and high-quality training data Scale distributed systems using Kubernetes, Terraform, and orchestration tools such More ❯
APIs Experience of writing performance-critical code Experience of using Git or similar to track changes Experience of both the full .NET Framework and .NET Core Experience of using observability systems such as Elastic APM or DataDog to track and diagnose issues in production A solid understanding of security principles and secure coding including OWASP Top 10 Nice to haves More ❯
Develop integrations and services that communicate between different backend components. Furthering Developer Experience (DevEx) by mentoring others in writing code that is intuitive, clear, and easy to test Developing observability for new and existing ML applications and GenAI/LLM integrations, making use of the Grafana Stack (Prometheus, Loki, Tempo) Working closely with Data Scientists and ML Engineers throughout the More ❯
and other open standards from the ASWF. Experience with web frameworks (e.g., Flask, FastAPI, Django) and database systems (PostgreSQL, Neo4j, Redis). Experience with performance profiling, system monitoring, and observability tools. Understanding of network protocols, security best practices, and compliance requirements. Open-source contributions or technical writing experience. Entrepreneurial mindset or experience working with startups or fast-paced teams. About More ❯
proprietary data that will power AI products. Champion best practices for orchestrating multi-cloud environments (AWS, Azure, GCP) to enhance platform performance, scalability, and cost efficiency. Implement robust security, observability, monitoring frameworks, and data governance to ensure data reliability, minimize downtime, and maintain compliance. Manage budget, implement charge-back models for platforms and services you provide to your customers. Lead More ❯
teams to align on data architecture and ensure our ML systems meet overarching business objectives. Evolve our MLOps infrastructure, driving the strategy for model versioning, automated deployments, monitoring, and observability using modern tools like Prefect. Mentor and guide other members of the team, fostering a culture of technical excellence and continuous improvement through code reviews, design discussions, and knowledge sharing. More ❯
communication skills, able to engage both technical and non-technical stakeholders Leadership experience within data teams Desirable DAMA certified (CDMP) Knowledge of Lakehouse and other database architectures Familiarity with observability principles and BI tools (e.g. Power BI) Experience working in Agile environments More ❯
and resolve application-level production incidents The Person: 5+ years in SRE, DevOps, or infrastructure engineering Strong experience with AWS, EKS/Kubernetes, and Terraform Familiar with Kafka and observability tools like Datadog or Grafana Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles More ❯
/CD pipelines with Azure DevOps, ensuring robust version control, testing, and seamless deployment. * Monitor production ML systems for performance, data drift, and anomalies using Azure Monitor or other observability tools. * Schedule and automate model retraining pipelines to maintain performance over time. 3. Data Engineering & Preprocessing * Develop and maintain scalable ETL/ELT data pipelines using Azure Data Factory, Data More ❯
a high-performing engineering team, splitting time between coding and people management. Drive delivery of new crypto product features end-to-end, from design to production. Ensure code quality, observability, scalability, and security are embedded in every release. Foster a collaborative, growth-focused team culture with clear goals and high accountability. Coordinate closely with Product, Design, and cross-functional teams More ❯
as Data Engineering and Product, to build a more effective and cohesive ML ecosystem. Deep expertise in data science and engineering best practices (version control, CI/CD, testing, observability) and a history of applying them to build robust, scalable machine learning systems. Exceptional analytical and problem-solving skills, with a demonstrated ability to define and solve highly ambiguous, complex More ❯
platforms managed by network engineering technology services. • Integration: Collaborate with design & platform teams and support the Implementation of flawless change into the live network, prioritising the use of automation. • Observability: Developing bespoke monitoring solutions to improve visibility of the network • Automation Tools: Utilise tools such as Ansible and Python to provision and manage infrastructure resources in a scalable and efficient More ❯
an initial 6 month contract. You'll be primarily responsible for working in a team that designs, builds, and maintains the organisation's cloud infrastructure, with a focus on automation, observability and scalability. Essential skills/experience required: AWS Infrastructure as code using Terraform Cloudflare Developing CI/CD pipelines Incredibly beneficial: Snowflake MLOps Security best practices The role is confirmed More ❯
secure handling of sensitive operational data and compliance with relevant standards Developed and maintained robust APIs for system integration Drove operational excellence and continuous improvement Implemented and managed monitoring, observability, and troubleshooting tools for deployed systems Designed and handled containerised applications (e.g., Docker, Kubernetes) Qualifications Bachelor's degree in Computer Science, Engineering, or a related technical field Relevant experience as More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Opus Recruitment Solutions Ltd
transformation and governance Working closely with Engineering, Analytics, Product, and Pricing teams to ensure priorities are aligned Driving improvements in tooling, infrastructure, and engineering practices (CI/CD, testing, observability) Required Experience Proven experience leading Data Engineering teams Strong technical background (5+ years) in building scalable data platforms Excellent communication and stakeholder management skills Hands-on experience with modern data More ❯
of student lifecycle processes in Higher Education and relevant data domains. Knowledge of event-driven and message-based architectures (Event Hub, Kafka, or Service Bus). Experience with monitoring and observability tools like Azure Monitor, Application Insights, and Log Analytics. Awareness of data security, GDPR, and compliance in educational or public sector environments. Exposure to OpenAPI/Swagger, API lifecycle management More ❯
City of London, London, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
data is delivered on time and without failure. The ideal candidate will have strong experience working with streaming and batch data systems, a solid understanding of monitoring and observability, and hands-on experience working with AWS, Apache Flink, Kafka, and Python. This is a fantastic opportunity to step into an SRE role focused on data reliability in a modern More ❯
Cloud Run, IAM) and Azure (App Services, AD authentication) Familiarity with CI/CD pipelines, GitHub Actions, Docker, and security tools like Snyk or Dependabot Knowledge of monitoring and observability tools such as Google Cloud Operations Suite and Azure Monitor Comfortable working in agile, cross-functional teams alongside AI engineers, cloud architects, and product managers Job Offer 6 month contract More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Method Resourcing
teams to operationalize models and ship ML-powered features into production. Continuously assess and iterate on production models, balancing long-term ML strategy with tactical improvements. Champion code quality, observability, and resilience within their ML systems through reviews and hands-on contributions. Help shape their internal ML standards and practices, ensuring they stay ahead of industry advancements. Offer technical mentorship More ❯