reliability engineering, Kubernetes administration, or related role. Deep expertise in Kubernetes and containers. Strong understanding of cloud infrastructure, automation tools, and best practices for high availability and performance. Responsibilities: Monitor system performance and reliability. Hebbia is an enterprise-grade AI platform that empowers knowledge workers by automating complex … tasks and providing insights from various data sources. It's designed for seamless integration and high security. Experience Requirements: 4+ years software development experience at a venture-backed startup or top technology firm. Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role. Strong expertise in managing …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
reliability engineering, Kubernetes administration, or related role. Deep expertise in Kubernetes and containers. Strong understanding of cloud infrastructure, automation tools, and best practices for high availability and performance. Responsibilities: Monitor system performance and reliability. Hebbia is an enterprise-grade AI platform that empowers knowledge workers by automating complex … tasks and providing insights from various data sources. It's designed for seamless integration and high security. Experience Requirements: 4+ years software development experience at a venture-backed startup or top technology firm. Proven experience as a Site Reliability Engineer, DevOps Engineer, or similar role. Strong expertise in managing …
Job Title: Senior Backend Engineer About KX: Our mission is to accelerate data and AI-driven innovation with high-performance analytics solutions, enabling our customers to transform into AI-first enterprises. KX is trusted by the world's leading organisations across financial services, aerospace and defence, life sciences, telecoms … our platform is the fastest independent time series and vector data analytics engine on the market, delivering unmatched speed and scale. Our technology powers high-performance applications across cloud, on-premise, and edge environments, enabling customers to discover richer insights and make better-informed decisions, faster. Key Responsibilities: Backend … for compute, storage, and messaging. Containerization & Orchestration: Build and run containerized applications using Docker and Kubernetes. Performance & Reliability: Monitor and enhance system performance, ensuring high availability and fault tolerance. Cross-Team Collaboration: Work closely with DevOps, frontend engineers, product managers, and stakeholders to deliver cohesive, end-to-end …
Technical Account Managers are trusted advisors and consult their customers on their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and … complex troubleshooting of Kubernetes and Docker containers Good knowledge of Regex, Lucene, PromQL Good knowledge of Linux Experience in customer-facing positions and excellent high-energy customer-facing skills Excellent communication skills in English Strong presentation skills with the ability to establish credibility with executives High availability for fast response to customers Comfortable coding in any high-level programming language (Java, Go, Python) - advantage BSc degree in Computer Science/Engineering - advantage Experience in SaaS B2B software companies - advantage This role is located in London and is a hybrid role - 2 days per week in …
Technical Account Managers are trusted advisors and consult our customers on their monitoring, security & observability journey. This role embodies the critical intersection of very high technical expertise and a focus on customer satisfaction, renewal and expansion. Technical Account Managers are senior-level roles and are expected to professionally and … complex troubleshooting of Kubernetes and Docker containers Good knowledge of RegEx, Lucene, PromQL Good knowledge of Linux Experience in customer-facing positions and excellent high-energy customer-facing skills Excellent communication skills in English Strong presentation skills with the ability to establish credibility with executives High availability for fast response to customers Comfortable coding in any high-level programming language (Java, Go, Python) - advantage BSc degree in Computer Science/Engineering - advantage Experience in SaaS B2B software companies - advantage Cultural Fit We're seeking candidates who are hungry, humble, and smart. Coralogix fosters a culture …
all of IT and drive the modernisation of our platform going forwards. Key accountabilities Build and maintain production and non-production environments to ensure high availability and cost optimization Build and maintain continuous integration and deployment pipelines to achieve fast, effective software delivery Improve performance and scalability of …
daily basis to develop existing Python-based tools and analyze data. Design and develop Python applications to meet functional and non-functional requirements, ensuring high availability and high performance. Develop a Python interface with PowerFactory using Python APIs and DPL or other techniques. Deploy the Python code into …
are eager to apply your technical expertise to the financial services industry, this is the role for you. Key Responsibilities: Design, develop, and maintain high-performance Back End services using Golang to support financial applications and services, including trading platforms, investment systems, and risk management tools. Build and deploy … requirements, and ensure that applications meet financial regulatory standards and business needs. Optimize the performance of Back End services, ensuring low-latency responses and high availability, critical for financial services. Implement CI/CD pipelines, automated testing, and monitoring systems to ensure the reliability and stability of production … outcomes. About You: Proven experience (2+ years) in Golang Back End development, with a strong focus on performance optimization and building scalable systems for high-volume, high-frequency financial applications. Strong experience working with Amazon Web Services (AWS), including EC2, S3, RDS, DynamoDB, Lambda, and other cloud-native …
cloud‑native infrastructure, champion CI/CD best practices, and ensure our GenAI services run reliably, securely, and cost‑effectively across staging, test, and high‑availability production environments. This is an early‑stage, high‑growth environment, perfect for builders who like green‑field architecture, rapid iteration, and … model (LLM) training jobs on distributed GPU clusters (Slurm, Ray, Kubeflow, or AWS SageMaker). Optimize model‑serving (Triton, vLLM, TorchServe) for low‑latency, high‑throughput inference. Cost & Performance Optimization Track cloud spend, right‑size resources, and introduce autoscaling strategies (Karpenter, Cluster‑Autoscaler, HPA/VPA). Champion FinOps … storage systems (Ceph, MinIO, S3) and artifact registries. Certifications: CKA/CKAD, AWS DevOps Engineer Professional, or equivalent. Track record in early‑stage or high‑growth tech environments. Excellent communication skills; ability to partner with researchers, backend engineers, and product stakeholders.
Southampton, Hampshire, South East, United Kingdom
FBI &TMT
projects Collaborating with stakeholders, including customers, to develop and maintain software Assisting the Software Engineering Manager with requirements management, estimation, and planning Focusing on high-level architecture and long-term technical strategy Devising and implementing innovative solutions to improve software processes and quality Integrating software with hardware to deliver … complete systems Optimising application architectures for scalability and performance Monitoring system performance and troubleshooting issues to ensure high availability and reliability Designing, implementing, and maintaining CI/CD pipelines to automate software delivery processes Job Requirements: Experience in making high-stakes decisions about architecture and technology Extensive …
Portsmouth, Hampshire, South East England, United Kingdom
FBI &TMT
projects Collaborating with stakeholders, including customers, to develop and maintain software Assisting the Software Engineering Manager with requirements management, estimation, and planning Focusing on high-level architecture and long-term technical strategy Devising and implementing innovative solutions to improve software processes and quality Integrating software with hardware to deliver … complete systems Optimising application architectures for scalability and performance Monitoring system performance and troubleshooting issues to ensure high availability and reliability Designing, implementing, and maintaining CI/CD pipelines to automate software delivery processes Job Requirements: Experience in making high-stakes decisions about architecture and technology Extensive …
Basingstoke, Hampshire, United Kingdom Hybrid / WFH Options
Bright Horizons Family Solutions, LLC
You should be familiar with agile practices and possess excellent troubleshooting skills. Bright Horizons is trusted by families and employers around the world for high-quality child care and early education, back-up care, and workplace education. We partner with some of the world's best companies to provide … Essential Functions/Responsibilities Design and implement advanced DevOps architectures to support scalable and reliable software delivery. Manage and optimize cloud infrastructure to ensure high availability and performance. Develop and enforce best practices for cloud architecture, infrastructure as code (IaC) and configuration management. Monitor, analyze, and manage technical …
Windows Optimising cloud infrastructure performance for efficiency and cost-effectiveness Utilising automation tools for streamlining processes Influencing and developing relationships through confident, engaging, professional, high-impact interactions Utilising visibility/observability solutions, e.g., Platform Monitoring, User Experience Monitoring, Application Performance Monitoring, Application Resource Monitoring Experience with Backups, High Availability and DR What you'll bring: Hands-on experience with major cloud platforms such as Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure. In-depth experience in designing and integrating IaaS, PaaS, and SaaS solutions across AWS, Azure and Google Cloud Platform. Proficiency in Data … cloud networking concepts, including VPCs, Direct Connect/ExpressRoute, VPNs, subnets, firewalls, cloud-native security, and load balancers. Experience/knowledge of providing high-quality architecture within an Agile delivery environment Knowledge and/or experience with DevOps practices: DevOps methodologies, DevOps tools, Site Reliability Engineering, Platform Engineering.
technologies (AWS), and a passion for building robust and efficient systems. You will collaborate closely with data scientists, engineers, and product teams to deliver high-quality ML solutions that directly impact our business. The mission of the Data Science Team is critical for the continuous development and success of … and implementation of MLOps best practices Cloud Infrastructure Management: Manage and optimize our AWS cloud infrastructure for machine learning, ensuring cost-effectiveness, security, and high availability CI/CD Pipeline Development: Develop and maintain robust CI/CD pipelines for continuous integration and deployment of ML models and …
Altrincham, Cheshire, United Kingdom Hybrid / WFH Options
Thermo Fisher Scientific Inc
to design and deliver the infrastructure for our SampleManager LIMS application hosted in the AWS cloud. The ideal candidate will ensure our infrastructure demonstrates high availability, scalability, fault-tolerance, quality, and security. They will also be able to support our engineering teams with automated processes to improve efficiency. … templates using CloudFormation and Terraform. Knowledge of agile software development and release management processes. Proficient in effectively prioritizing and carrying out tasks in a high-pressure scenario. Exceptional customer service orientation. Systematic approach to problem-solving and strong sense of ownership. Excellent written and verbal communication skills. Proficiency in …
Azure Arc and hybrid connectivity strategies. Monitoring & Resilience: Implement observability using Azure Monitor, Log Analytics, App Insights, and Prometheus/Grafana. Design for high availability (HA), disaster recovery (DR), and business continuity (BCP). Conduct chaos engineering to test resilience and fault tolerance. Work closely with development …
deployment, monitoring, testing and maintenance of infrastructure and applications Develop and maintain policies and procedures for security, compliance, and disaster recovery Optimise infrastructure for high availability, fault tolerance, and cost efficiency Continuously monitor and improve infrastructure and application performance Troubleshoot and resolve infrastructure and application issues Manage and …
London, South East England, United Kingdom Hybrid / WFH Options
Digital Skills ltd
deployment, monitoring, testing and maintenance of infrastructure and applications Develop and maintain policies and procedures for security, compliance, and disaster recovery Optimise infrastructure for high availability, fault tolerance, and cost efficiency Continuously monitor and improve infrastructure and application performance Troubleshoot and resolve infrastructure and application issues Manage and …
in cloud-based environments: Develop and implement robust IT architecture strategies for cloud and hybrid environments, leveraging AWS best practices. Design scalable, secure, and high-availability solutions tailored to business needs. Architect and optimize data platforms to enable efficient data collection, storage, and processing. Implement and manage cloud …
London, South East England, United Kingdom Hybrid / WFH Options
Parser
in cloud-based environments: Develop and implement robust IT architecture strategies for cloud and hybrid environments, leveraging AWS best practices. Design scalable, secure, and high-availability solutions tailored to business needs. Architect and optimize data platforms to enable efficient data collection, storage, and processing. Implement and manage cloud …
leased/colocation data centers. CE co-ordinates with AWS DCEO (Data Center Engineering Operations) teams to implement new builds, upgrades, etc., and ensures high availability and high resiliency of these control systems. Development of control panel BOMs Development of ISA data sheets for temperature, level …