scripts. Familiarity with ELT (Extract, Load, Transform) processes is a plus. Big Data Technologies: Familiarity with big data frameworks such as Apache Hadoop and Apache Spark, including experience with distributed computing and data processing. Cloud Platforms: Proficient in using cloud platforms (e.g., AWS, Google Cloud Platform, Microsoft Azure) for data storage, processing, and deployment of data solutions. Data More ❯
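The ELT pattern mentioned above loads raw data first and transforms it inside the warehouse afterwards, in contrast to ETL. A minimal sketch using Python's sqlite3 as a stand-in warehouse (table and column names are illustrative only):

```python
import sqlite3

# Toy ELT sketch: raw rows are Loaded verbatim first, then Transformed
# in-database with SQL -- unlike ETL, where transformation happens before
# loading. All table and column names here are illustrative.
raw_rows = [("2024-01-01", "london", "12.5"), ("2024-01-02", "london", "13.0")]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE raw_readings (day TEXT, city TEXT, temp TEXT)")
con.executemany("INSERT INTO raw_readings VALUES (?, ?, ?)", raw_rows)  # Load

# Transform step runs inside the database, after loading.
con.execute("""
    CREATE TABLE readings AS
    SELECT day, UPPER(city) AS city, CAST(temp AS REAL) AS temp_c
    FROM raw_readings
""")
print(con.execute("SELECT city, AVG(temp_c) FROM readings GROUP BY city").fetchall())
```

The raw table keeps the untyped source data intact, so the transform can be re-run or revised without re-extracting.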
PySpark, Python, SQL with at least 5 years of experience • Working experience in the Palantir Foundry platform is a must • Experience designing and implementing data analytics solutions on enterprise data platforms and distributed computing (Spark/Hive/Hadoop preferred). • Proven track record of understanding and transforming customer requirements into a best-fit design and architecture. • Demonstrated experience in end … Data Science or similar discipline. • and K8S systems is a plus. Person specification: • Knowledge of the Insurance Domain or Financial Industry is a strong plus. • Experience working in multicultural, globally distributed teams. • Self-starter with a positive attitude and a willingness to learn, who can manage their own workload. • Strong analytical and problem-solving skills. • Strong interpersonal and communication skills More ❯
A track record of actively promoting equality, diversity, and inclusion within all areas of responsibility, particularly in the Data & Analytics function. Expert proficiency in Python, R, SQL, and distributed computing frameworks (e.g., Spark, Hadoop). Advanced knowledge of data engineering tools (e.g., Airflow, Kafka, Snowflake, Databricks). Proficiency in machine learning frameworks (TensorFlow, PyTorch, Scikit-learn). More ❯
London, England, United Kingdom Hybrid / WFH Options
PhysicsX
problems Design, build and optimise machine learning models with a focus on scalability and efficiency in our application domain Transform prototype model implementations to robust and optimised implementations Implement distributed training architectures (e.g., data parallelism, parameter server, etc.) for multi-node/multi-GPU training and explore federated learning capacity using cloud (e.g., AWS, Azure, GCP) and on-premise … or PhD in computer science, machine learning, applied statistics, mathematics, physics, engineering, software engineering, or a related field, with a record of experience in any of the following: Scientific computing; High-performance computing (CPU/GPU clusters); Parallelised/distributed training for large/foundation models Ideally >1 year of experience in a data-driven role, with … exposure to: scaling and optimising ML models, training and serving foundation models at scale (federated learning a bonus); distributed computing frameworks (e.g., Spark, Dask) and high-performance computing frameworks (MPI, OpenMP, CUDA, Triton); cloud computing (on hyper-scaler platforms, e.g., AWS, Azure, GCP); building machine learning models and pipelines in Python, using common libraries and frameworks More ❯
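The data-parallel training the listing above describes can be sketched in miniature: each worker computes a gradient on its own data shard, and the shards' gradients are averaged before one shared update. The sketch below runs the "workers" sequentially in a single process on a toy linear model; real multi-GPU systems perform the same averaging with collective communication (e.g., all-reduce):

```python
# Minimal data-parallel training sketch (single process, stdlib only).
# Each "worker" owns one data shard; gradients are averaged across shards
# before a single shared weight update -- the core idea behind multi-GPU
# data parallelism.

def shard_gradient(w, shard):
    # Gradient of mean squared error 0.5*(w*x - y)^2 over one shard.
    return sum((w * x - y) * x for x, y in shard) / len(shard)

data = [(x, 3.0 * x) for x in range(1, 9)]          # true weight is 3.0
shards = [data[0:4], data[4:8]]                      # one shard per "worker"

w = 0.0
for step in range(200):
    grads = [shard_gradient(w, s) for s in shards]   # parallel in real systems
    w -= 0.01 * (sum(grads) / len(grads))            # averaged update

print(round(w, 3))  # converges to the true weight, 3.0
```

Because the averaged gradient equals the full-batch gradient (equal shard sizes here), the result matches single-worker training step for step.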
A track record of actively promoting equality, diversity, and inclusion within all areas of responsibility, particularly in the Data & Analytics function. Expert proficiency in Python, R, SQL, and distributed computing frameworks (e.g., Spark, Hadoop). Advanced knowledge of data engineering tools (e.g., Airflow, Kafka, Snowflake, Databricks). Proficiency in machine learning frameworks (TensorFlow, PyTorch, Scikit-learn). More ❯
of MLOps practices, including model training, deployment, and monitoring. Hands-on experience with Kubernetes and containerised environments. Technical Skills: Proficiency in programming languages such as Python & SQL. Experience with distributed computing frameworks such as Spark. Familiarity with version control systems (e.g., Git) and CI/CD pipelines. Soft Skills: Strong problem-solving skills and the ability to work More ❯
London, England, United Kingdom Hybrid / WFH Options
PhysicsX Ltd
problems. Design, build and optimise machine learning models with a focus on scalability and efficiency in our application domain. Transform prototype model implementations to robust and optimised implementations. Implement distributed training architectures (e.g., data parallelism, parameter server, etc.) for multi-node/multi-GPU training and explore federated learning capacity using cloud (e.g., AWS, Azure, GCP) and on-premise … or PhD in computer science, machine learning, applied statistics, mathematics, physics, engineering, software engineering, or a related field, with a record of experience in any of the following: Scientific computing; High-performance computing (CPU/GPU clusters); Parallelised/distributed training for large/foundation models. Ideally >1 year of experience in a data-driven role, with … exposure to: scaling and optimising ML models, training and serving foundation models at scale (federated learning a bonus); distributed computing frameworks (e.g., Spark, Dask) and high-performance computing frameworks (MPI, OpenMP, CUDA, Triton); cloud computing (on hyper-scaler platforms, e.g., AWS, Azure, GCP); building machine learning models and pipelines in Python, using common libraries and frameworks More ❯
London, England, United Kingdom Hybrid / WFH Options
Merantix
with a talented team to build and deploy scalable data pipelines to aggregate, prepare, and process data for use with machine learning. Your skills span across data processing and distributed systems with a software engineering base. You are excited to collaborate with ML engineers to build generative AI features in Autodesk products. You will report to Senior Manager, Autodesk … to work remotely, in an office, or a mix of both. Responsibilities Collaborate on engineering projects for product with a diverse, global team of researchers and engineers Develop scalable distributed systems to process, filter, and deploy datasets for use with machine learning Process large, unstructured, multi-modal (text, images, 3D models, code snippets, metadata) data sources into formats suitable … such as AWS, Azure, and GCP Containerization technologies, such as Docker and Kubernetes Documenting code, architectures, and experiments Linux systems and bash terminals Preferred Qualifications Hands-on experience with: Distributed computing frameworks, such as Ray Data and Spark. Databases and/or data warehousing technologies, such as Apache Hive. Data transformation via SQL and dbt. Orchestration platforms, such More ❯
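The "process, filter, deploy" pipeline described above can be illustrated with a small normalisation step that flattens heterogeneous, multi-modal records into one ML-ready schema; the field names and filtering rule here are hypothetical, not Autodesk's:

```python
import json

# Hypothetical sketch of a process-and-filter step: heterogeneous raw
# records (text, code snippets, metadata) are normalised into one flat
# schema suitable for ML training. Field names are illustrative only.
raw_records = [
    {"kind": "text", "body": "a short caption"},
    {"kind": "code", "body": "print('hi')", "lang": "python"},
    {"kind": "text", "body": ""},                      # empty -> filtered out
]

def normalise(record):
    body = record.get("body", "").strip()
    if not body:
        return None                                    # filter step
    return {"modality": record["kind"], "content": body,
            "n_tokens": len(body.split())}

dataset = [r for r in (normalise(rec) for rec in raw_records) if r is not None]
print(json.dumps(dataset, indent=2))
```

In a real pipeline the generator expression would be replaced by a distributed map over partitions (e.g., in Ray Data or Spark), but the per-record logic stays the same shape.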
London, England, United Kingdom Hybrid / WFH Options
Autodesk
and platform engineers to build and deploy scalable data pipelines to aggregate, prepare, and process data for use with machine learning . Your skills span across data processing and distributed systems with a software engineering base. You are excited to collaborate with ML engineers to build generative AI features in Autodesk products, and comfortable working at the intersection of … to work remotely, in an office, or a mix of both. Responsibilities Collaborate on engineering projects for product with a diverse, global team of researchers and engineers Develop scalable distributed systems to process, filter, and deploy datasets for use with machine learning Process large, unstructured, multi-modal (text, images, 3D models, code snippets, metadata) data sources into formats suitable … have experience in data modelling, architecture, and processing skills with varied unstructured data representations Processing unstructured data, such as 3D geometric data Large scale, data-intensive systems in production Distributed computing frameworks, such as Spark, Dask, Ray Data, etc. Cloud platforms such as AWS, Azure, or GCP Docker Documenting code, architectures, and experiments Linux systems and bash terminals More ❯
understanding customer requirements, creating consulting proposals and creating packaged Big Data service offerings. Delivery - Engagements include short on-site projects proving the use of AWS services to support new distributed computing solutions that often span private cloud and public cloud services. Engagements will include migration of existing applications and development of new applications using AWS cloud services. About … experiences, don't let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. More ❯
or similar), and containerisation (Docker) Strong knowledge of cloud platforms like Azure, AWS or GCP for deploying and managing ML models Familiarity with data engineering tools and practices, e.g., distributed computing (e.g., Spark, Ray), cloud-based data platforms (e.g., Databricks) and database management (e.g., SQL) Strong communication skills, with the ability to present technical concepts to technical and non-technical More ❯
Quant and Front Office technology teams to integrate pricing models and workflow enhancements within the ACE application. There will be exposure to a wide range of technological frameworks, including distributed computing architecture. The role will involve tasks such as: Developing and maintaining the Counterparty Credit Risk applications, leveraging in-house Python and C++ model libraries. Supporting and improving More ❯
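As a hedged illustration of the kind of calculation a counterparty credit risk application performs (not the in-house model library mentioned above), here is a toy Monte Carlo estimate of expected exposure on a single forward position; the lognormal dynamics and all parameters are illustrative:

```python
import math
import random

# Toy expected-exposure calculation: simulate the underlying at the horizon,
# mark a single forward position to market, and average the positive part.
# Dynamics, parameters, and the single-position portfolio are illustrative.
random.seed(7)

S0, vol, horizon, paths = 100.0, 0.2, 1.0, 20_000
strike = 100.0

exposures = []
for _ in range(paths):
    z = random.gauss(0.0, 1.0)
    # Driftless lognormal terminal price over the horizon.
    s_t = S0 * math.exp(-0.5 * vol**2 * horizon + vol * math.sqrt(horizon) * z)
    mtm = s_t - strike                     # mark-to-market of a toy forward
    exposures.append(max(mtm, 0.0))        # exposure is the positive part only

expected_exposure = sum(exposures) / paths
print(round(expected_exposure, 2))
```

Production systems run this over many time steps, netting sets, and risk factors, typically on the distributed computing architecture the listing mentions; the per-path logic is the same positive-part averaging.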
Strong proficiency in Python Extensive experience with cloud platforms (AWS, GCP, or Azure) Experience with: Data warehousing and lake architectures ETL/ELT pipeline development SQL and NoSQL databases Distributed computing frameworks (Spark, Kinesis, etc.) Software development best practices including CI/CD, TDD and version control. Containerisation tools like Docker or Kubernetes Experience with Infrastructure as Code More ❯
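Distributed computing frameworks such as Spark, listed above, are built around the map/reduce pattern: each partition is aggregated independently, then the partial results are merged. A word-count sketch with a thread pool standing in for a cluster:

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Word count in the map/reduce style used by frameworks like Spark.
# A thread pool stands in for a cluster: each partition is mapped to
# partial counts independently, and the reduce step merges them.
partitions = [
    "the quick brown fox",
    "the lazy dog",
    "the quick dog",
]

def map_partition(text):
    return Counter(text.split())          # local aggregation per partition

with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(map_partition, partitions))

totals = sum(partials, Counter())         # reduce: merge partial counts
print(totals.most_common(2))
```

The per-partition aggregation is what keeps the shuffle small in a real cluster: only the compact partial counts cross the network, not the raw text.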
for data augmentation, denoising, and domain adaptation to enhance model performance. 3. Model Training and Optimization: - Design and implement efficient training pipelines for large-scale generative AI models. - Leverage distributed computing resources, such as GPUs and cloud platforms, for efficient model training. - Optimize model architectures, hyperparameters, and training strategies to achieve superior performance and generalization. 4. Model Evaluation … experiences, don't let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world's most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating - that's why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. … BERT, or T5. - Familiarity with reinforcement learning techniques and their applications in generative AI. - Understanding of ethical AI principles, bias mitigation techniques, and responsible AI practices. - Experience with cloud computing platforms (e.g., AWS, GCP, Azure) and distributed computing frameworks (e.g., Apache Spark, Dask). - Strong problem-solving, analytical, and critical thinking skills. - Strong communication, collaboration, and leadership More ❯
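Optimising hyperparameters and training strategies, as described above, often starts with a plain grid search over the configuration space. A self-contained sketch in which the "model" is a one-parameter quadratic; both the search space and the loss are illustrative stand-ins:

```python
import itertools

# Toy hyperparameter grid search. The "training run" fits one weight to a
# target with gradient descent; its final loss depends on the learning rate
# and step budget. Search space and loss are illustrative stand-ins.
def train(lr, steps):
    w, target = 0.0, 5.0
    for _ in range(steps):
        w -= lr * 2 * (w - target)        # gradient step on (w - target)**2
    return (w - target) ** 2              # final loss for this config

space = {"lr": [0.01, 0.1, 0.5], "steps": [10, 50]}
grid = [dict(zip(space, vals)) for vals in itertools.product(*space.values())]

best = min(grid, key=lambda cfg: train(**cfg))
print(best)
```

Real sweeps swap `train` for a full training run and distribute the grid across workers, but the select-the-minimum structure is the same; random or Bayesian search replaces `itertools.product` when the space grows.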
and building solutions. You will collaborate with scientists, research engineers, and platform engineers to build and deploy scalable data pipelines for machine learning. Your expertise spans data processing and distributed systems, with a focus on software engineering. You will work at the intersection of research and product, building generative AI features in Autodesk products. You will report to: Manager … supports a hybrid work environment, allowing remote, in-office, or mixed work arrangements. Responsibilities Collaborate on engineering projects with a diverse, global team of researchers and engineers. Develop scalable distributed systems for processing, filtering, and deploying datasets for machine learning. Process large, unstructured, multi-modal data sources (text, images, 3D models, code, metadata) into ML-ready formats. Conduct experiments … testing, and deployment. Experience in data modeling, architecture, and processing unstructured data. Experience with processing 3D geometric data. Experience with large-scale, data-intensive systems in production. Knowledge of distributed computing frameworks (Spark, Dask, Ray). Experience with cloud platforms (AWS, Azure, GCP). Proficiency with Docker, Linux, and bash. Ability to document code, architectures, and experiments. Preferred More ❯
Birmingham, England, United Kingdom Hybrid / WFH Options
Autodesk
engineers, and platform engineers to build and deploy scalable data pipelines to aggregate, prepare, and process data for use with machine learning. Your skills span across data processing and distributed systems with a software engineering base. You are excited to collaborate with ML engineers to build generative AI features in Autodesk products, and comfortable working at the intersection of … to work remotely, in an office, or a mix of both. Responsibilities · Collaborate on engineering projects for product with a diverse, global team of researchers and engineers · Develop scalable distributed systems to process, filter, and deploy datasets for use with machine learning · Process large, unstructured, multi-modal (text, images, 3D models, code snippets, metadata) data sources into formats suitable … have experience in data modelling, architecture, and processing skills with varied unstructured data representations · Processing unstructured data, such as 3D geometric data · Large scale, data-intensive systems in production · Distributed computing frameworks, such as Spark, Dask, Ray Data etc. · Cloud platforms such as AWS, Azure, or GCP · Docker · Documenting code, architectures, and experiments · Linux systems and bash terminals More ❯
London, England, United Kingdom Hybrid / WFH Options
Cloudbeds
Fast 500 again in 2024 - but we're just getting started. How You'll Make an Impact: As a Senior Data Engineer, you'll design and implement large-scale distributed data processing systems using technologies like Apache Hadoop, Spark, and Flink. You'll build robust data pipelines and infrastructure that transform complex data into actionable insights, ensuring scalability and … platform that processes billions in bookings annually. You'll architect data lakes, warehouses, and real-time streaming platforms while implementing security measures and optimizing performance. With your expertise in distributed computing, containerization (Docker, Kubernetes), and streaming technologies (Kafka, Confluent), you'll drive innovation and evaluate new technologies to continuously improve our data ecosystem. Our Data team: We're … expected, and collective wins matter more than individual credit. What You Bring to the Team: Technical Expertise & Scalability Mindset: Deep knowledge of data architecture, ETL/ELT pipelines, and distributed systems, with the ability to design scalable, high-performance solutions. Problem-Solving & Ownership: A proactive approach to diagnosing issues, improving infrastructure, and taking full ownership from concept to production. More ❯
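Real-time streaming platforms such as Flink or Kafka Streams, mentioned above, aggregate events into time windows. A tumbling-window sketch over an in-memory event list; the window size and event schema are illustrative:

```python
from collections import defaultdict

# Tumbling-window aggregation, the basic pattern behind streaming engines
# such as Flink or Kafka Streams, sketched over an in-memory event list.
# Events are (epoch_seconds, bookings) pairs; window size is illustrative.
events = [(3, 1), (7, 2), (12, 1), (14, 3), (21, 2)]
WINDOW = 10  # seconds per tumbling window

windows = defaultdict(int)
for ts, bookings in events:
    windows[ts // WINDOW * WINDOW] += bookings   # assign event to its window

print(sorted(windows.items()))  # one (window_start, total) pair per window
```

A streaming engine adds what this sketch omits: unbounded input, out-of-order events handled via watermarks, and windows emitted incrementally rather than after all data has arrived.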
London, England, United Kingdom Hybrid / WFH Options
PhysicsX
concepts and best practices (e.g., versioning, testing, CI/CD, API design, MLOps) Building machine learning models and pipelines in Python, using common libraries and frameworks (e.g., TensorFlow, MLFlow) Distributed computing frameworks (e.g., Spark, Dask) Cloud platforms (e.g., AWS, Azure, GCP) and high-performance computing (HPC) Containerization and orchestration (Docker, Kubernetes) Strong problem-solving skills and the ability to More ❯
maintenance and support Collaborate with cross-functional teams and demonstrate great communication skills We're excited if you have 5+ years of experience in delivering multi-tier, highly scalable, distributed web applications Deep understanding of software architecture, object-oriented design principles, and data structures Extensive experience in developing microservices using Java, Python Experience in distributed computing frameworks More ❯
language (Python, Java, or Scala) Experience with cloud platforms (AWS, GCP, or Azure) Experience with data warehousing and lake architectures ETL/ELT pipeline development SQL and NoSQL databases Distributed computing frameworks (Spark, Kinesis, etc.) Software development best practices including CI/CD, TDD, and version control Strong understanding of data modelling and system architecture Excellent problem-solving More ❯