Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Arm Limited
/green & canary releases, and automated rollbacks. Proficiency with Docker, Kubernetes, and related cloud-native orchestration patterns. Proven track record building dashboards and visualizations across platforms such as Grafana, Datadog, and AWS. Experience with instrumentation tools like Prometheus and managing time-series stores such as Graphite and VictoriaMetrics. Solid understanding of networking, security, and compliance in cloud environments. Excellent written More ❯
Manage cloud infrastructure (OCI, AWS, Azure, or GCP) using Infrastructure as Code tools like Terraform or Serverless Functions. Monitor system health and performance using tools like Prometheus, Grafana, Datadog, or New Relic. Collaborate closely with development teams to automate builds, performance tests, and deployments. Ensure system security, compliance, and best practices are followed in deployment pipelines. Ensure network security with More ❯
native technologies: Experience in deploying to cloud platforms (e.g., AWS, GCP or Azure), an understanding of containerisation (e.g., Docker), infrastructure-as-code software (e.g., Terraform), and observability platforms (e.g., Datadog or Grafana). Curiosity: A hunger to learn and grow your skills. Problem solving: Strong analytical problem-solving skills and attention to detail. You have the ability to break down More ❯
Job Title: Cloud Engineer Location: Essex (hybrid, 1-2 days onsite) Industry: Financial Services/Fintech Salary: £(phone number removed) per annum Overview: We are seeking a skilled and motivated Cloud Engineer to join our dynamic IT team at a More ❯
AKS, API Management, and DevOps Pipelines, and AWS including EKS, Lambda, and CloudFormation. Infrastructure as Code and GitOps: Terraform, Bicep, Pulumi, ArgoCD, and FluxCD. Observability: Prometheus, Grafana, OpenTelemetry, and Datadog. Security and Compliance: HashiCorp Vault, Azure Key Vault, AWS KMS, OPA Gatekeeper, and Drata or similar. Interested in exploring this further? This is a high-impact role in a fast More ❯
as "git doesn't work," identifying and resolving underlying infrastructure problems. Analyze and resolve network connectivity issues between systems and services. Implement and maintain monitoring and observability tools like Datadog for proactive issue detection. Ensure robust Identity and Access Management (IAM) configurations. Minimum Qualifications: • Bachelor's degree in Engineering, Information Systems, Computer Science, or related field and 6+ years of More ❯
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
Job description: RemoteStar is looking to hire a Senior Site Reliability Engineering Manager on behalf of our client based in the UK with a fully remote work policy. About Client: The client is building the B2B marketplace for diamonds. It's More ❯
IAC) solutions. Proven experience in monitoring and observability tools to proactively manage system health. Skills and Strengths: AWS (Amazon Web Services), Auto Scaling, Fargate, Route53, observability tools (New Relic, Datadog, Splunk), scripting (Ansible, Bash, Python, Go), CI/CD. Primary Job Responsibilities: Design and support EC2/ECS/EKS/Fargate environments for high availability and fault tolerance. Implement More ❯
Cambourne, Cambridgeshire, United Kingdom Hybrid / WFH Options
Remotestar
We're assembling a team of elite founding software engineers for a startup, building the future of e-commerce in MENA, bringing together community, shopping and entertainment. Location: Remote We are looking for engineers who are passionate about creating scalable More ❯
Cambridge, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
Strong communication and stakeholder management skills. Familiarity with enterprise IT, infrastructure, or DevOps environments. Ability to lead cross-functional teams in fast-paced settings. Exposure to observability tools (e.g., Datadog, Grafana, Splunk). Knowledge of Agile or hybrid delivery methodologies. Background in SRE or ITSM practices. Change management certification. More ❯
Peterborough, Cambridgeshire, UK Hybrid / WFH Options
Few&Far
achieve optimal outcomes with their models, focusing on aspects like optimisation, scalability & efficiency You’ll work alongside teams that have joined from world-class tech businesses like NVIDIA, Amazon, Datadog, Vercel, Meta, GitHub and Uber Key Responsibilities Partner with customers to identify and address their ML deployment needs Implement and optimise ML solutions using Python, open-source tools & infrastructure Collaborate More ❯