Machine Learning Infrastructure Engineer to join their founding team. They are looking for people with skills directly linked to creating and managing high-performance computing (HPC) clusters across GPU/TPU chips and serving large machine learning models at scale: not 1-10 GPUs but 1000s. You … researchers, founders and advisors to develop the next generation of high-availability LLMs. ML engineering experience: Experienced in creating and managing high-performance computing clusters across GPU/TPU, preferably in PyTorch. Proficient in efficient serving of large machine learning models at scale, including quantisation and distributed computing; experience with libraries such as DeepSpeed. Strong software engineering experience in Python. Understanding of the latest AI research. Your background: Worked at a leading machine learning company. Worked at a fast-growing start-up. Please submit your CV to find out more.
Greater London, England, United Kingdom Hybrid / WFH Options
Anson McCade
/17/20) and developing low-latency code. Bachelor's degree in a quantitative field: maths, statistics, computer science, etc. Experience with distributed computing, platform development, networking, and system design. Exceptional analytical and quantitative skills. BENEFITS Competitive base salary between £140,000 and £200,000. Bonus based …
development covering both front-end and back-end development. We're looking for a software engineer who brings fresh ideas from all areas, including information retrieval, distributed computing, and large-scale system design. Any experience of networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile would …
Greater London, England, United Kingdom Hybrid / WFH Options
NetMind.AI
time, Onsite, 5 days a week; remote working arrangements may be discussed with the line manager. About Us NetMind is a cutting-edge, massively distributed computing platform designed for AI modelling and applications. Currently, we are running a start-up project within the life sciences sector, NetMind.life, with …
trading and research infrastructure. You can expect to design and maintain the company's largest compute infrastructure, which covers everything from OS and tooling to HPC for research and trading. You'll be working in a distributed computing environment with a primary focus on Linux-based systems. This …
a core focus on increasing developer productivity and overall developer experience. 💡 What You Need: Strong SWE skills - Python or Golang preferred. Expert knowledge of distributed computing technologies - Kubernetes strongly preferred. Competent Front-End chops - React & JavaScript preferred - open to similar frameworks/tech. Strong Automation & Config management tooling …
trading and research infrastructure. This team designs and maintains the firm's largest compute infrastructure, which includes operating system platforms, software development tooling, high-performance computing, networking and storage for research and trading. You’ll have the opportunity to work on a wide variety of technology initiatives in a distributed computing environment with a primary focus on Linux-based systems. This includes workload scheduling design and implementation, fleet management, clustered file system design and operation, software design and life cycle (SDLC), kernel and network performance tuning for low-latency and high-throughput applications, metrics collection and data mining …
learning models. They run on AWS and soon Azure, with plans to also add GCP and on-prem. They are adding extensive usage of distributed compute on Spark, starting with their more complex ETL and advanced analytics functions, e.g. Time Series Processing. They soon plan to integrate other approaches … including native distributed PyTorch/TensorFlow, Spark-based distributor libraries, or Horovod. TECH STACK: Python, Flask, Redis, Postgres, React, Plotly, Docker, Temporal; AWS Athena SQL, Athena & EMR Spark, ECS Fargate; Azure Synapse/Data Lake Analytics, HDInsight. KEY RESPONSIBILITIES Lead the productionisation of Monolith’s ML models and data … from junior developers to senior leaders. Laser-focused on identifying and progressing on critical path, both individually and as a team. NICE TO HAVES Distributed machine learning (Spark distributors, Horovod, native distributed on PyTorch or TensorFlow). Productionisation experience on multiple clouds. Familiarity with MLOps principles and practices, and …
Cambridge, England, United Kingdom Hybrid / WFH Options
Intrasonics, an Ipsos Company
trying to improve its workflow, and we are not afraid of investing in new ideas. If you like to design, develop, and deploy resilient distributed systems, you will fit right in. You will have the chance to work on established projects, kickstart new ones, and make a case for … tools. Experience of cloud-based services (AWS, Google Cloud). Experience of user interface and experience design. Experience with microservices architecture. Knowledge of serverless computing architectures. Knowledge of CI/CD workflows. Knowledge of issue/project tracking software, e.g. Jira. Knowledge of container and orchestration technologies (e.g. Docker … Kubernetes, LXC). Other technologies associated with multi-tier architectures and distributed computing, e.g. Redis, RabbitMQ, Prometheus, Apache Kafka, ELK, load balancers, ... What is in it for me? Ipsos UK offer an attractive basic salary and a rewards package including 25 days annual leave, a pension scheme and a …