Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
Birketts LLP
and regulations. Implement data protection measures to protect the firm's data and assets. Performance Monitoring: Monitor the performance of cloud and infrastructure systems. Ensure high availability, reliability and performance of cloud environments. Identify and resolve issues promptly to maintain high availability and reliability. Manage a team …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
reliability engineering, Kubernetes administration, or related role. Deep expertise in Kubernetes and containers. Strong understanding of cloud infrastructure, automation tools, and best practices for high availability and performance. Responsibilities: Monitor system performance and reliability. Hebbia is an enterprise-grade AI platform that empowers knowledge workers by automating complex … tasks and providing insights from various data sources. It's designed for seamless integration and high security. Experience Requirements: 4+ years of software development experience at a venture-backed startup or top technology firm. Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role. Strong expertise in managing …
certification, and live site operations - Industry AWS experience - Experience writing and debugging Infrastructure as Code (CloudFormation, Terraform, Ansible, Chef, Puppet) PREFERRED QUALIFICATIONS - Experience working with high-availability, distributed systems and services - Experience leading the design, build and deployment of complex and performant (reliable and scalable) software solutions in production …
configuring, and maintaining the servers and software stack. A successful candidate will work directly with Darktrace researchers and software engineers, ensuring optimal performance and availability for ongoing AI and HPC (high-performance computing) projects. This is a hybrid role, with compulsory attendance of 2 days a week … projects (managing access and ensuring optimal performance). Additional responsibilities include: monitoring server and application performance, identifying bottlenecks, and taking corrective actions to maintain high availability; implementing and maintaining server security, including patch management, vulnerability scanning, and intrusion detection; collaborating with network administrators, hardware engineers, and researchers to …