to design, build, and maintain the platforms and tooling that underpin our infrastructure provisioning and delivery lifecycle. You'll work collaboratively with cross-functional teams to automate infrastructure, enhance observability, and embed best practices in VMware Hypervisor and DevOps. Key Responsibilities: Build and maintain on-prem and cloud infrastructure (VMware Hypervisor, vSphere, OpenStack, AWS, GCP, Azure). Apply deep …
GitHub Actions, or GitLab CI. Solid understanding of containerization technologies (Docker, Kubernetes). Working knowledge of Python and SQL for automation and data pipeline development. Familiarity with monitoring and observability tools (Grafana, Prometheus, CloudWatch). Strong grasp of data architecture principles and ETL design patterns. Financial services or regulated industry experience (desirable). …
Wokingham, Berkshire, United Kingdom Hybrid / WFH Options
Experis
Collaborate with Agile teams to automate deployment, monitoring, and infrastructure management. Ensure platform and business application reliability and performance against strict SLAs and KPIs. Implement and maintain cloud-native observability stacks (Prometheus, Grafana, Loki, Tempo). Develop and maintain Infrastructure as Code (IaC) using tools like Kustomize or Helm. Manage CI/CD pipelines using Tekton and ArgoCD. Support and …
design and evolution of our API schemas, ensuring they meet the complex demands of a rapidly growing platform. Champion best practice in code quality, automated testing (Vitest, Playwright) and observability to deliver resilient, maintainable, and production-ready business logic. Drive DevOps excellence by collaborating on CI/CD pipelines (Jenkins, Concourse), containerisation (Docker) and Kubernetes deployments. Mentor and empower fellow …
consistency, repeatability, and auditability across environments. Develop and maintain developer tooling and golden templates (CI/CD pipelines, scaffolds, environments) to standardize best practices across teams. Design and implement observability frameworks (metrics, tracing, logging, alerting) that are easy to consume and part of the platform baseline. Eliminate repetitive tasks through automation and opinionated defaults, so teams are not blocked by … and orchestration (Docker, Kubernetes). Familiarity with CI/CD systems (GitHub Actions, GitLab CI, Jenkins, etc.). Hands-on experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation). Knowledge of observability tools (Prometheus, Grafana, ELK stack, Datadog, etc.). Solid grasp of Linux systems and networking fundamentals. Strong problem-solving and debugging skills. Your Package & Perks: A competitive salary. Flexible working …
Alto preferred), network access control (802.1x, RADIUS), or zero-trust security concepts. Exposure to infrastructure-as-code (Terraform, Ansible) and version control systems (Git). Experience with monitoring and observability tools (LogicMonitor, Grafana, Prometheus). Knowledge of hybrid cloud networking, including AWS Direct Connect or GCP Interconnect. Relevant certifications such as CCNP, AWS Advanced Networking Specialty, or Google Cloud Network …
for leading and executing the migration of data, dashboards, alerts, and configurations from Splunk systems to Elasticsearch. This role involves deep technical expertise in Splunk architecture, data ingestion, and observability tools, along with strong project management and stakeholder communication skills. Must-have skills: Splunk, ELK Stack, Kibana. Nice-to-have skills: stakeholder communication, strong project management. Responsibilities: Minimum number …
Wokingham, Berkshire, South East, United Kingdom Hybrid / WFH Options
Sanderson Government and Defence
for a sharp-minded Site Reliability Engineer to join our cloud-native mission in Azure. If you thrive in Agile teams, live for automation, and know your way around observability stacks and CI/CD pipelines - this is your playground. What you'll be doing: Automating deployment, monitoring & infrastructure with precision. Owning platform reliability, performance & SLAs. Building IaC with Helm …
ground models with enterprise data (SharePoint, Dataverse, SQL, Azure AI Search/RAG). Craft, test and version prompts; define evaluation metrics, safety rails and guardrails. Implement telemetry/observability (App Insights/Kusto), A/B tests and continuous improvement loops. Work with Security/Compliance on data access, DLP, retention and audit; follow least-privilege and secure-by …
insight, and proactive incident management. Key Responsibilities: Translate high-level monitoring non-functional requirements (NFRs) into actionable configurations across tools such as Splunk, Dynatrace, and AppDynamics. Deliver full-stack observability solutions, including application-aware network performance monitoring (NPM), synthetics, log analytics, and infrastructure metrics. Provide live support for monitoring technologies and assist with live service support, including key business events …
functional teams delivering and maintaining large-scale digital platforms, ensuring high availability, scalability, and resilience. The role requires a blend of technical depth and leadership capability, particularly in automation, observability, and mentoring team members. Key Skills & Experience: DevOps/SRE experience (5+ years) – ownership of projects, strong automation and Infrastructure-as-Code approach, incident management, and leadership of initiatives. Terraform … state management, and AWS integration. Kafka – experience with production clusters, scaling, tuning, troubleshooting, and event-driven systems. MongoDB – strong admin experience including replication, sharding, tuning, and backups. Monitoring/Observability – Prometheus, Grafana, ELK, Datadog, with strong alerting/SLO design. AWS – expertise across EC2, VPC, S3, RDS, IAM, ALB/NLB, and cost optimisation. Linux – advanced administration, performance debugging, and …
UKIC DV Cleared Site Reliability/DevOps Engineer. London - 5 Days Onsite. Up to £550 per day (Umbrella, Inside IR35). 12-Month Contract. Must hold UKIC DV Clearance. Are you passionate about reliability, automation, and supporting mission-critical systems? Join …
South West London, London, United Kingdom Hybrid / WFH Options
Purview Consultancy Services Ltd
and agentic workflows. Drive architectural reviews for LlamaParse/Azure Document Intelligence integration. Design fault-tolerant, high-availability AI systems with automatic failover and load balancing. Establish comprehensive monitoring, observability, and performance optimization strategies. Mentor technical teams and establish AI engineering best practices using modern toolchains. Oversee model performance evaluation using LangGraph evals and DeepEval frameworks …
company's customer experience (CX) vision. You will collaborate closely with other software engineers, product teams, and AI specialists to develop LLM-powered AI applications, ensuring their scalability, security, observability and performance. This role is hands-on, with a primary focus on coding, testing, and deploying AI solutions in a fast-paced, agile environment. Responsibilities: Code Development and Testing: Write …
specialism in vulnerability management. Self-starter, able to work in technical detail and motivate a diverse group of stakeholders to build sponsorship for significant and impactful change. Desired: Establishing observability platforms. Capabilities adjacent to exposure/vulnerability management (i.e. cyber security asset management, attack surface management, etc.). Pragmatic application of zero-trust philosophies. Cloud-based security (GCP, AWS and …
South West London, London, United Kingdom Hybrid / WFH Options
Purview Consultancy Services Ltd
Intelligence. Implement advanced RAG systems with text-embedding-3-large and Azure DB for Postgres. Lead hands-on development using Claude Code for rapid agentic workflow creation. Establish AI observability and monitoring using Arize Phoenix and Azure AI Foundry. Fine-tune and optimize Azure OpenAI GPT-5 models for financial document understanding. Implement comprehensive evaluation strategies using LangGraph evals and …
business processes. (LEAD) Familiarity with Microsoft Power Platform concepts, including Power Automate, Power Apps, and Dataverse. (LEAD) Experience applying Generative AI and prompting techniques. Strong understanding of AI governance, observability, and compliance frameworks. Proven ability to deliver secure, scalable, and responsible AI solutions. Excellent communication and presentation skills. Extensive experience working collaboratively with diverse colleagues and stakeholders. Knowledge of the …
experiences. Proven experience as a Business Analyst in an Agile environment. Strong knowledge of market data and market data supervision. Financial Services experience is mandatory. Strong understanding of monitoring, observability, and telemetry (metrics, logs, traces). Ability to translate technical concepts into actionable business requirements. Hands-on experience with tools such as Datadog, BigPanda, Grafana would be desirable. Excellent stakeholder management …
London, South East, England, United Kingdom Hybrid / WFH Options
Devonshire Hayes Recruitment Specialists Limited
business processes. (LEAD) Familiarity with Microsoft Power Platform concepts, including Power Automate, Power Apps, and Dataverse. (LEAD) Experience applying Generative AI and prompting techniques. Strong understanding of AI governance, observability, and compliance frameworks. Proven ability to deliver secure, scalable, and responsible AI solutions. Excellent communication and presentation skills. Extensive experience working collaboratively with diverse colleagues and stakeholders. Knowledge of the …