driven architecture. Databases & Messaging: Strong knowledge of both SQL and NoSQL databases, as well as Kafka. Tools: Familiarity with Jenkins, GitHub, and monitoring tools like Splunk or Grafana. Good to Have: Experience with reactive programming, caching mechanisms, and Agile projects. If you are a passionate and skilled developer, we encourage you to apply and join our team.
Burgess Hill, West Sussex, England, United Kingdom
Randstad Technologies
driven architecture. Databases & Messaging: Strong knowledge of both SQL and NoSQL databases, as well as Kafka. Tools: Familiarity with Jenkins, GitHub, and monitoring tools like Splunk or Grafana. Good to Have: Experience with reactive programming, caching mechanisms, and Agile projects. If you are a passionate and skilled developer, we encourage you to apply and join our team.
Burgess Hill, West Sussex, South East, United Kingdom Hybrid / WFH Options
Randstad Digital
driven architecture. Databases & Messaging: Strong knowledge of both SQL and NoSQL databases, as well as Kafka. Tools: Familiarity with Jenkins, GitHub, and monitoring tools like Splunk or Grafana. Good to Have: Experience with reactive programming, caching mechanisms, and Agile projects. If you are a passionate and skilled developer, we encourage you to apply and join our team.
with Azure DevOps, Monday.com, Teams etc. Integrating Azure SSO/RBAC into all the collaboration and development tools. Implement monitoring, logging, and alerting using tools like Prometheus, Grafana, or ELK. Manage containerised services using Docker and Kubernetes (EKS). The candidate would also be required to add features and support the existing system. Skills: Proven hands-on experience with Microsoft Azure … DevOps, GitHub Actions, etc.) Git-based workflow (PRs, merges, Jira status, CI and then CD). IaC (Terraform, ARM). Scripting (Python, PowerShell, Bash). Monitoring and alerting tools (e.g. Prometheus, Grafana, Azure Monitor). Collaboration tools (Jira and Monday.com). Integration tools like Power Automate, Slack etc. Good to have: Programming language (C#/.NET/Python
London, South East, England, United Kingdom Hybrid / WFH Options
Eligo Recruitment
ll Bring: Strong experience with GCP, Terraform, and Infrastructure-as-Code. Deep knowledge of cloud networking, security automation, and compliance standards. Proficiency in CI/CD pipelines, monitoring tools (Grafana, Datadog), and scripting. A collaborative mindset with excellent communication and mentoring skills. Why Join? Shape a next-gen AI infrastructure with autonomy and purpose. Hybrid working with regular meetups in
/CD pipelines using GitLab and ArgoCD. Design and operate containerised workloads with EKS, Fargate, and Kubernetes. Manage Kubernetes deployments using Helm charts. Implement observability solutions using OpenTelemetry (OTel), Grafana, and Splunk. Optimise infrastructure with Karpenter for autoscaling and cost efficiency. Ensure robust AWS networking (VPC, Transit Gateway, PrivateLink, Route 53) and enforce security best practices. Drive incident response, monitoring … and performance tuning. Key Technologies: AWS (EKS, Fargate, EC2, S3), Terraform, CloudFormation, GitLab, ArgoCD, Docker, Kubernetes, Helm, Cassandra, OTel, Grafana, Splunk, Karpenter, Python, Bash. Desirable: Experience with Google Cloud Platform (GCP), Apigee Hybrid, and hybrid/multi-cloud environments. Carbon60, Lorien & SRG - The Impellam Group STEM Portfolio are acting as an Employment Business in relation to this vacancy.
Wokingham, Berkshire, England, United Kingdom Hybrid / WFH Options
Opus Recruitment Solutions Ltd
architecture are key. What You’ll Be Working With: MySQL, Vitess, and Linux in production (don't worry if you haven't worked with Vitess); monitoring tools like Prometheus and Grafana; shard allocation, replication tuning, disk performance; backup, restore, and DR testing; data migrations and custom table loads for NHS tenants; zero-downtime patching and performance baselining. What You’ll Bring
London, South East, England, United Kingdom Hybrid / WFH Options
Eligo Recruitment
indexing, and capacity planning for mission-critical systems. Develop secure backup, recovery, and disaster recovery procedures. Explore multi-tenant and sharded architectures to support growth. Implement monitoring strategies using Grafana, Datadog, and CI/CD integrations. Champion database best practices, mentor teams, and standardize tooling and automation. What You’ll Bring: Extensive experience managing cloud-hosted PostgreSQL at scale. Proficiency
London, South East, England, United Kingdom Hybrid / WFH Options
Get Staffed Online Recruitment Limited
with throttling and versioning. Developing durable workflows. Writing efficient and scalable SQL queries, stored procedures, and scripts. Integrating external systems with custom data synchronisation logic. Utilising OpenTelemetry and Grafana for logs, metrics, tracing, and alerting across backend services. Contributing to technical design discussions, code reviews, and deployments. What They’re Looking For: Strong experience in C#/.NET backend
at the heart of technology delivery. Responsibilities include: Designing and enforcing SLOs, SLIs, and SLAs to ensure high reliability and performance. Building and maintaining monitoring/observability solutions (Datadog, Grafana, Azure Application Insights, Log Analytics). Managing Infrastructure as Code (Terraform, Pulumi, CloudFormation) for scalable, repeatable deployments. Automating with PowerShell, Python, or Bash to drive efficiency. Supporting Kubernetes and AKS … Required: Proven Site Reliability Engineering background. Strong Terraform skills with live environment deployment. Kubernetes/AKS expertise. Scripting in PowerShell, Python or Bash. Monitoring experience (Datadog preferred, Azure or Grafana considered). Background in web applications and distributed systems. Desirable Skills: Knowledge of Microservices Architecture. Familiarity with Kanban. Experience with Puppet or Chef. If you’re passionate about Site Reliability
API endpoints and overseeing model deployment workflows to ensure seamless integration and scalability. Key Responsibilities: Platform Operations & Monitoring • Monitor ML model endpoints and overall platform health using tools like Grafana and Domino Data Lab. • Respond to incidents and alerts, perform code fixes, manage incidents internally, and manage changes through ServiceNow. • Interface directly with Domino Data Lab support to resolve model … monitoring. • Working knowledge of core data science concepts, such as model evaluation metrics, overfitting, data drift, and feature importance. • Proficiency in AWS services (such as S3, Redshift, etc.). • Experience with Grafana for monitoring and alerting. • Good to have: hands-on experience with the Domino Data Lab platform. • Solid understanding of CI/CD pipelines, version control, containerization, and orchestration. • Ability to communicate
London (City of London), South East England, United Kingdom
HCLTech
API endpoints and overseeing model deployment workflows to ensure seamless integration and scalability. Key Responsibilities: Platform Operations & Monitoring • Monitor ML model endpoints and overall platform health using tools like Grafana and Domino Data Lab. • Respond to incidents and alerts, perform code fixes, manage incidents internally, and manage changes through ServiceNow. • Interface directly with Domino Data Lab support to resolve model … monitoring. • Working knowledge of core data science concepts, such as model evaluation metrics, overfitting, data drift, and feature importance. • Proficiency in AWS services (such as S3, Redshift, etc.). • Experience with Grafana for monitoring and alerting. • Good to have: hands-on experience with the Domino Data Lab platform. • Solid understanding of CI/CD pipelines, version control, containerization, and orchestration. • Ability to communicate