AI Operations Lead

Job Specification: AI Operations Lead (Role: AIOps Lead)

We are seeking an experienced and highly skilled AI Operations (AIOps) Lead to drive agentic automation and AIOps implementation, along with the operationalization, governance, monitoring, and continuous improvement of enterprise AI solutions. This role requires a proven specialist capable of establishing scalable AI operating models while providing hands-on leadership to ensure AI systems deliver reliable, secure, and measurable business outcomes.

Key Responsibilities & Requirements:

  • Provide expert leadership for the operational management and continuous improvement of AI, Machine Learning, and Generative AI solutions across the organization.
  • Drive agentic automation and AIOps implementation by providing oversight, resolving blockers, and ensuring smooth execution.
  • Design and implement the solution, perform code reviews, and lead the team, working with Google ADK (agentic framework), LLMs (Google 2.0 Flash or 1.5), and LangGraph.
  • Drive the team to formalize engineering and integration approaches (enterprise changes, impacts, and documentation standards).
  • Establish a feasibility- and checklist-based transition/adoption approach, with automated verification where possible.
  • Formalize and package adoption standards for federated adoption of AI and agentic interventions.
  • Run training and support incidents and escalations associated with adoptions and integrations.
  • For first-time cases, establish and package all materials associated with ad
  • Develop and implement enterprise-wide AIOps frameworks, operating models, standards, and best practices to ensure scalable and sustainable AI adoption.
  • Act as a hands-on contributor, working directly with program leadership, AI architects, data scientists, and engineering teams to support AI initiatives throughout their lifecycle.
  • Establish monitoring, observability, and performance management capabilities for AI models, services, and AI-powered applications.
  • Define and manage processes for model deployment, versioning, validation, retraining, and lifecycle management.
  • Ensure AI solutions meet operational requirements related to reliability, scalability, security, compliance, and business continuity.
  • Develop and track key performance indicators (KPIs) and service metrics related to AI adoption, model performance, operational efficiency, and business value realization.
  • Lead incident management, root-cause analysis, and remediation efforts for AI-related production issues.
  • Collaborate with data engineering, platform, security, and infrastructure teams to optimize AI platform operations and service delivery.
  • Drive the implementation of MLOps and LLMOps practices to support efficient deployment, monitoring, and governance of AI solutions.
  • Establish governance processes to ensure compliance with Responsible AI principles, organizational policies, and regulatory requirements.
  • Identify and mitigate operational, technical, security, and governance risks associated with AI deployments.
  • Support the development and execution of change management and adoption strategies to maximize the value of AI investments.
  • Translate operational insights and performance data into actionable recommendations for improving AI effectiveness and business outcomes.
  • Demonstrate strong stakeholder management and communication skills, particularly when engaging with senior leadership and cross-functional teams.
  • Operate effectively within a fast-paced, dynamic environment, delivering measurable outcomes and driving continuous operational excellence.

Preferred Qualifications:

  • Extensive experience in AI/ML operations, platform engineering, MLOps, DevOps, or enterprise technology operations.
  • Strong understanding of Machine Learning, Generative AI, Large Language Models (LLMs), Retrieval-Augmented Generation (RAG), and AI platform ecosystems.
  • Experience implementing and managing MLOps, LLMOps, model governance, and AI monitoring frameworks in enterprise environments.
  • Proven expertise with cloud-based AI and data platforms, automation tools, monitoring solutions, and CI/CD pipelines.
  • Strong analytical and problem-solving skills with the ability to translate operational data into strategic improvements.
  • Demonstrated experience leading large-scale AI transformation or operational excellence initiatives.
  • Excellent communication, stakeholder engagement, and leadership capabilities.

This role is ideal for a specialist who can bridge AI strategy and day-to-day operations, ensuring that enterprise AI solutions remain reliable, governed, scalable, and aligned with business objectives.

Job Details

Company: Mphasis
Location: City of London, London, United Kingdom
Posted