Senior Machine Learning Engineer

We are seeking a hands-on Senior Machine Learning Engineer to design, build, and deploy scalable machine learning systems in production. You will be responsible for the end-to-end lifecycle of ML solutions—from data preparation and model training to deployment, monitoring, and continuous improvement. Working closely with data engineers, architects, and product teams, you will transform business problems into reliable, high-performance ML applications on AWS.

Key Responsibilities

Model Development & Training

· Design, train, and evaluate machine learning models for classification, regression, forecasting, recommendation, and anomaly detection use cases.

· Implement feature engineering pipelines and manage model experimentation using Scikit-Learn, XGBoost, LightGBM, or deep learning frameworks (TensorFlow / PyTorch where applicable).

· Optimise model performance through hyperparameter tuning, cross-validation, and robust evaluation frameworks.

ML Pipelines & Productionisation

· Build end-to-end ML pipelines using Python, ensuring reproducibility and scalability across training and inference.

· Deploy models to production using Amazon SageMaker (Training Jobs, Batch Transform, and Real-Time Endpoints).

· Implement model versioning, lineage tracking, and rollback strategies for safe releases.

Data Engineering & Feature Management

· Collaborate with data engineering teams to ingest, clean, and prepare large-scale datasets using AWS services and/or Databricks (PySpark).

· Design feature stores and manage feature reuse for training and inference consistency.

· Handle data quality validation and drift detection over time.

Performance & Reliability Engineering

· Optimise model inference latency, throughput, and infrastructure cost through instance right-sizing, batch inference, and caching strategies.

· Implement monitoring for model performance, data drift, and prediction quality using CloudWatch or custom metrics.

· Troubleshoot production issues and continuously improve system robustness.

Code Quality & MLOps

· Write clean, modular, testable Python code following software engineering best practices.

· Build CI/CD pipelines for ML workflows, including automated training, testing, and deployment.

· Define ML metrics (accuracy, ROC-AUC, F1, RMSE, etc.) and track them across model versions.

Technical Skills & Requirements

Programming & Frameworks

· Expert proficiency in Python (NumPy, Pandas, Scikit-Learn, Pydantic).

· Strong experience with API development using FastAPI or Flask for ML inference services.

· Familiarity with async processing and distributed workloads where needed.

AWS & MLOps Stack

· Hands-on experience with Amazon SageMaker (training, tuning, batch & real-time endpoints).

· Experience with AWS data services (S3, Glue, Athena, Redshift, or EMR).

· Exposure to containerisation (Docker) and orchestration (Step Functions, Airflow, or similar).

Machine Learning Expertise

· Strong grounding in supervised and unsupervised learning techniques.

· Practical experience with feature selection, model interpretability, and bias mitigation.

· Experience evaluating and deploying models in real-world, noisy data environments.

Advantageous Skills

Advanced ML & Data Systems

· Experience with Databricks and PySpark for large-scale data processing.

· Knowledge of time-series forecasting, recommender systems, or NLP (non-GenAI).

· Familiarity with classical optimisation and statistical modelling techniques.

Integration & Enterprise Systems

· Experience integrating ML systems with enterprise platforms and workflows (e.g., triggering inference from business events or APIs).

· Exposure to workflow automation tools (e.g., Workato) or event-driven architectures is a plus.

Ideal Candidate Profile

· Strong software engineering mindset applied to ML systems.

· Proven experience taking ML models from experimentation to production at scale.

· Ability to collaborate with architects, data scientists, and business stakeholders.

· Passion for building reliable, maintainable, and measurable ML solutions.

Job Details

Company
HCLTech
Location
City of London, London, United Kingdom
Posted