the use of Large Language Models Take ownership of the design, deployment, and maintenance of machine learning models Recommend, implement, and use tooling to improve the development, operations, and observability of machine learning models, large language models, and AI-related services Essential skills: Previous experience working on online Chat or Chatbots, particularly voice, is a must-have. Strong recent hands More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Interquest
Engineer, you’d also have the opportunity to mentor other team members and collaborate with product managers. Skills: TypeScript (Node, React) AWS (Lambda, Fargate, S3, Dynamo, EventBridge etc.) Observability tools (Datadog, Dynatrace, Honeycomb, CloudWatch etc.) The money is good too – up to £70k plus benefits including hybrid working (2 days per week in Manchester) and a 2pm finish every More ❯
South West London, London, United Kingdom Hybrid / WFH Options
Purview Consultancy Services Ltd
and agentic workflows Drive architectural reviews for LlamaParse/Azure Document Intelligence integration Design fault-tolerant, high-availability AI systems with automatic failover and load balancing Establish comprehensive monitoring, observability, and performance optimization strategies Mentor technical teams and establish AI engineering best practices using modern toolchains Oversee model performance evaluation using LangGraph evals and DeepEval frameworks More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Purview Consultancy Services Ltd
and agentic workflows Drive architectural reviews for LlamaParse/Azure Document Intelligence integration Design fault-tolerant, high-availability AI systems with automatic failover and load balancing Establish comprehensive monitoring, observability, and performance optimization strategies Mentor technical teams and establish AI engineering best practices using modern toolchains Oversee model performance evaluation using LangGraph evals and DeepEval frameworks More ❯
South West London, South East England, United Kingdom Hybrid / WFH Options
Purview Consultancy Services Ltd
and agentic workflows Drive architectural reviews for LlamaParse/Azure Document Intelligence integration Design fault-tolerant, high-availability AI systems with automatic failover and load balancing Establish comprehensive monitoring, observability, and performance optimization strategies Mentor technical teams and establish AI engineering best practices using modern toolchains Oversee model performance evaluation using LangGraph evals and DeepEval frameworks More ❯
data modeling (star schema), SQL, Python, and data governance tools (e.g., Purview, Unity Catalog). Experience implementing AI/ML solutions in Databricks or similar platforms. Knowledge of data observability, monitoring, and incident management (ITIL best practices). Excellent communication and stakeholder management skills. Experience in a regulated financial environment is desirable. Relevant certifications such as Azure Data Engineer Associate More ❯
improvements across tooling, processes, and culture Drive infrastructure enhancements to improve website speed, reliability, and customer experience Conduct full reviews of New Relic, Rollbar, and Cloudflare usage - ensuring effective observability, alerting, and performance Maintain high standards of cloud security and ensure compliance Ensure robust performance of Redis and Couchbase databases Manage vendor relationships, licensing, and cost optimisation Inspire a culture More ❯
Bristol, Somerset, United Kingdom Hybrid / WFH Options
Synergize Consulting Ltd
with experience in: NVIDIA technologies - GPU acceleration for AI/ML workloads and system optimisation. OpenShift - leveraging containerisation for CI/CD pipelines and production readiness. Cilium - advanced networking, observability, and security in Kubernetes clusters. MongoDB - scalable database design and integration. Podman - secure container runtime and management. Red Hat Enterprise Linux (RHEL) - enterprise-level platform experience. Personal Attributes Excellent interpersonal More ❯
Streaming Data Strategy with a comprehensive approach to data control, compliance, and security; unconstrained by their infrastructure providers. Our platform mitigates data security risks while enhancing communication, automation, and observability across data flows, enabling teams to collaborate effortlessly across the organisation. With hubs in London and New York, we're looking for people who are passionate about our mission and More ❯
a collaborative and supportive team environment through experienced, empathetic leadership Commit to continuous learning and stay current with emerging technologies and best practices Implement and maintain application monitoring and observability, proactively identifying and resolving system issues Own the full software lifecycle from system design and development through deployment, monitoring, and maintenance Person Specification Experience Essential What you'll bring to the More ❯
key part of your role will be championing continuous improvement across processes, tooling, and engineering culture, encouraging the adoption of modern practices such as DevOps, CI/CD, and observability at scale. Your influence will shape the way teams work, enabling high performance and delivering a world-class developer experience across IAG Loyalty and BA Holidays. What we're looking More ❯
applications of AI for the construction domain, pushing the boundaries of what's possible. Build core infrastructure that allows us to build and ship LLM apps quickly - this includes observability, how we work with several LLM providers + our own fine tuned models. Work with other engineers in the product and research teams to bring new models and applications to More ❯
and help shape how platform engineering is done as the team continues to scale. Tech stack AWS (Core services - EC2, RDS, S3, IAM, etc.) Configuration Management Ansible Monitoring and Observability Grafana, Prometheus Kubernetes (building and managing production clusters) Terraform (IaC provisioning) Python or Java (scripting, automation) GitHub Actions (CI/CD pipelines) What They're Looking For Experience in AWS cloud … infrastructure (ideally in a regulated or high-traffic environment) Previous experience working with Monitoring and Observability Tools Hands-on Kubernetes know-how, specifically with EKS. Solid IaC experience with Terraform. Experience with containerisation (Docker, Helm) and CI/CD (GitHub Actions or similar) Solid scripting/Automation experience with Python or Java A good communicator who enjoys working collaboratively across More ❯
cycles. Professional experience developing with functional programming languages (e.g. Elixir, Erlang, Clojure, etc.) or infrastructure-focused programming languages (Go, Rust, Ruby, etc.). Strong expertise in designing systems for observability, including effective monitoring, detailed logging, comprehensive performance testing strategies, and hands-on experience with modern observability tools such as Grafana, Prometheus, or CloudWatch to implement and manage monitoring solutions. Hands More ❯
bulk of our codebase, currently in Java (11+), and ideally Spring Boot. You will be working with SQL and large SQL databases, Docker, Kubernetes, OpenAPI specifications, and distributed system observability tooling (e.g., Datadog APM). Infrastructure automation is primarily owned by the infrastructure team, but you will be a consumer of their work; familiarity with AWS, Terraform and Docker is … Ability to communicate effectively with technical and non-technical stakeholders Modern Cloud-Native architectures and practices (high availability, high scalability, microservices, 12-factor apps, CI/CD, automation and observability) TDD, BDD and Contract testing Experience in a DevOps environment or willingness to work in one Proven delivery of well-tested, scalable, fault-tolerant and performant solutions A pragmatic, self More ❯
UKIC DV Cleared Site Reliability/DevOps Engineer London - 5 Days Onsite Up to £550 per day (Umbrella, Inside IR35) 12-Month Contract Must hold UKIC DV Clearance Are you passionate about reliability, automation, and supporting mission-critical systems? Join More ❯
easy to maintain code A strong understanding of Cloud security, networking and APIs Experience in problem-solving, able to demonstrate logical thinking and excellent troubleshooting skills Hands-on with Observability Tooling (Observability as Code and SLO-based Dynatrace Monitoring) Strong understanding and demonstrable use of source control practice and collaborative working as part of an engineering team Experience of developing More ❯
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
practices in software development and deployment Implement best practice coding in relation to Development coding standards Provides direction and technical context for more junior developers Fosters a culture of observability across the engineering team. Helps teams across engineering use operational data to improve stability and performance of their applications. Awareness of application security considerations Leads incident response across the engineering More ❯
outcomes, engaging credibly with CTOs/CIOs on strategy, modernization, and transformation Strong architectural fluency in secure, high-scale, highly available cloud environments (distributed systems, microservices evolution, data platforms, observability, cybersecurity) to lead executive-level discussions without needing to be hands-on. Deep experience with SaaS go-to-market, outcome/value selling, and business/digital transformation, including executive More ❯
Leeds, West Yorkshire, England, United Kingdom Hybrid / WFH Options
FPSG Connect
and industry security standards (e.g. OWASP CI/CD, SAMM) are adhered to across systems Managing and improving cloud security posture (Azure Defender, Prisma Cloud etc) Implementing and optimising observability platforms for holistic system monitoring Supporting and securing software delivery lifecycle, from development to deployment and ongoing operations The successful Security Engineer's essential skills will include: Demonstrated experience in More ❯
strategic way, with continuous improvement of solutions. Experience with configuration and change management, incident/problem resolution, and evolving endpoint solutions using modern infrastructure standards and practices including automation, observability, and continuous deployment. Experience working with architects and project managers to agree enterprise-wide designs and implement across central and multi-region estates. We will provide The opportunity to be More ❯
client satisfaction. Collaborating with Client Solutions and other teams to understand requirements and deliver tailored solutions. Designing and implementing scalable, future-proof architectures for new connectors and integrations. Enhancing observability with better diagnostics, logging, and tracing to support technical teams. Overseeing the development and management of the public API (REST + event streaming functionality). Producing clear, accessible technical documentation More ❯
interest in learning them. Bonus points Prior experience integrating with UK and European healthcare systems (e.g., EMIS, TPP SystmOne, Cerner Millennium, GDT, Dedalus, Maincare, Epic, etc.). Knowledge of observability tools (logging, metrics, tracing) and performance tuning. Familiarity with GDPR, NHS DSP Toolkit, and healthcare security best practices. Attitude matters more than experience! If you are motivated, collaborative, and More ❯