MLOps Engineer

Role Summary

We are seeking a highly skilled MLOps Engineer to focus on the deployment, monitoring, and maintenance of machine learning models in production environments. This role is platform-focused and does not involve model development or end-user support. The successful candidate will ensure the reliability, scalability, and performance of ML platforms while managing API endpoints and deployment workflows.

Key Responsibilities

Platform Operations & Monitoring

Monitor ML model endpoints and platform health using tools such as Grafana and Domino Data Lab
Respond to incidents and alerts; perform code fixes and manage changes via ServiceNow
Liaise with Domino Data Lab support to resolve platform-related issues

Model Deployment

Deploy and maintain ML models in production environments
Ensure models integrate seamlessly into automated pipelines
Maintain reliability, version control, and governance standards

Pipeline Maintenance

Collaborate with Data Scientists and Engineers for smooth production handoff
Maintain and optimize ML pipelines for stability and scalability
Improve performance, resource usage, and automation

Automation & Tooling

Implement automation for deployment and monitoring
Contribute to continuous platform improvements

Required Skills & Experience

Strong Python programming experience
Proven experience deploying and monitoring ML models in production
Understanding of model evaluation metrics, data drift, overfitting, and feature importance
Experience with AWS services (S3, Redshift, etc.)
Hands-on experience with Grafana for monitoring
Familiarity with Domino Data Lab (desirable)
Strong knowledge of CI/CD, version control, Docker, and Kubernetes
Excellent troubleshooting and incident management skills
Strong stakeholder communication skills
