Machine Learning Infrastructure Engineer [UAE Based]
Job Title: ML Infrastructure Senior Engineer
Location: Abu Dhabi, United Arab Emirates [Full relocation package provided]
Job Overview
We are seeking a skilled ML Infrastructure Engineer to join our growing AI/ML platform team. This role is ideal for someone who is passionate about large-scale machine learning systems and has hands-on experience deploying LLMs/SLMs using advanced inference engines like vLLM. You will play a critical role in designing, deploying, optimizing, and managing ML models and the infrastructure around them—for inference, fine-tuning, and continued pre-training.
Key Responsibilities
· Deploy large or small language models (LLMs/SLMs) at scale using inference engines (e.g., vLLM, Triton).
· Collaborate with research and data science teams to fine-tune models or build automated fine-tuning pipelines.
· Extend inference-level capabilities by integrating advanced features such as multi-modality, real-time inference, model quantization, and tool-calling.
· Evaluate and recommend optimal hardware configurations (GPU, CPU, RAM) based on model size and workload patterns.
· Build, test, and optimize LLM inference pipelines for consistent model deployment.
· Implement and maintain infrastructure-as-code to manage scalable, secure, and elastic cloud-based ML environments.
· Ensure seamless orchestration of the MLOps lifecycle, including experiment tracking, model registry, deployment automation, and monitoring.
· Manage ML model lifecycle on AWS (preferred) or other cloud platforms.
· Understand LLM architecture fundamentals to design efficient scalability strategies for both inference and fine-tuning processes.
Required Skills
Core Skills:
· Proven experience deploying LLMs or SLMs using inference engines like vLLM, TGI, or similar.
· Experience in fine-tuning language models or creating automated pipelines for model training and evaluation.
· Deep understanding of LLM architecture fundamentals (e.g., attention mechanisms, transformer layers) and how they influence infrastructure scalability and optimization.
· Strong understanding of hardware-resource alignment for ML inference and training.
Technical Proficiency:
· Programming experience in Python and C/C++, especially for inference optimization.
· Solid understanding of the end-to-end MLOps lifecycle and related tools.
· Experience with containerization, image building, and deployment (e.g., Docker; Kubernetes is a plus).
Cloud & Infrastructure:
· Hands-on experience with AWS services for ML workloads (SageMaker, EC2, EKS, etc.) or equivalent services in Azure/GCP.
· Ability to manage cloud infrastructure to ensure high availability, scalability, and cost efficiency.
Nice-to-Have
· Experience with ML orchestration platforms like MLflow, SageMaker Pipelines, Kubeflow, or similar.
· Familiarity with model quantization, pruning, or other performance optimization techniques.
· Exposure to distributed training frameworks like Unsloth, DeepSpeed, Accelerate, or FSDP.
- Company
- AI71
- Location
- London, UK
- Posted