Senior Data Scientist
Role: Senior Data Scientist
Location: London, UK (Hybrid)
Employment type: Contract
Accountabilities
- The Data Scientist has full-stack accountabilities across the full value chain of building an industrialized data-science software product:
- Understanding a business problem and its component processes end to end, and identifying opportunities to improve decisions using decision-support tooling
- Efficiently conducting analyses and producing visualizations to identify valuable opportunities for decision support and to determine trade-offs between different potential feature implementations
- Prototyping advanced machine learning and optimization models to prove the value of a use case and approach (in Python)
- Delivering features to industrialize machine learning and optimization models in Python using best-practice software principles (e.g., strict typing, classes, testing)
- Building automated, robust data-cleaning pipelines that follow software best practices (in Python)
- Implementing integrations between the core algorithm (machine-learning or optimization) and a workflow orchestration paradigm such as Dagster
- Implementing software in a cloud-based deployment pipeline with Continuous Integration / Continuous Deployment (CI/CD) principles
- Building logging, error handling, and automated tests (e.g., unit tests, regression tests) to ensure the robustness of operationally critical decision-support products
- Delivering features to harden an algorithm against edge cases in operations and in the data
- Conducting analysis to quantify the adoption of, and value captured by, a decision-support product
- Engaging with business stakeholders to collect requirements and get feedback
- Contributing to conversations on feature prioritization and roadmap, with an understanding of the trade-off between speed and long-term value
- Understanding and integrating the product into existing business processes, and contributing to the development and adoption of new business processes leveraging a decision-support product
- Communicating feature and modeling approaches, trade-offs, and results to the internal team and business stakeholders
- The Data Scientist is also accountable for ways of working fit for an Agile cross-functional development squad, including:
- Following Git best practices for version control
- Contributing to and reviewing pull requests and product/technical documentation
- Giving input on prioritization, team process improvements, and technology choices
- Working independently and providing predictable delivery timelines
Skills/capabilities
- Strong knowledge of machine learning and optimization techniques, including supervised learning (regression, tree methods, etc.), unsupervised learning (clustering), and operations research (linear and mixed-integer programming, heuristics)
- Fluent in Python (required) and other programming languages (preferred), with strong skills in applying DS, ML, and OR packages (scikit-learn, pandas, numpy, Gurobi, etc.) to solve real-life problems and visualize the outcomes (e.g. seaborn)
- Proficient in working with cloud platforms (AWS preferred), code versioning (Git), and experiment tracking (e.g. MLflow)
- Experience with cloud-based ML tools (e.g. SageMaker), data and model versioning (e.g. DVC), CI/CD (e.g. GitHub Actions), workflow orchestration (e.g. Airflow/Dagster), and containerized solutions (e.g. Docker, ECS) (nice to have)
- Experience in code testing (unit, integration, end-to-end tests)
- Strong data engineering skills in SQL and Python
- Proficient in the use of Microsoft Office, including advanced Excel and PowerPoint skills
- Advanced analytical skills, including the ability to apply a range of data science and analytic techniques to quickly generate accurate business insights
- Understanding of the trade-offs of different data science, machine learning, and optimization approaches, and ability to intelligently select which are the best candidates to solve a particular business problem
- Able to structure business and technical problems, identify trade-offs, and propose solutions
- Communication of advanced technical concepts to audiences with varying levels of technical skills
- Managing priorities and timelines to deliver features in a timely manner that meet business requirements
- Collaborative team-working, giving and receiving feedback, and always seeking to improve team processes
Qualifications/experience
- Master’s degree or higher in data science, ML, or operational research, or 2+ years of highly relevant industry experience (required)
- 0-2 years working on production ML or optimization software products at scale (required)
- Experience in developing industrialized software, especially data science or machine learning software products (preferred)
- Experience in relevant business domains (transportation, airlines, operations, network problems) (preferred)