SQL across structured, semi-structured and unstructured data. Ability to use dimension reduction techniques (PCA, encoders etc.) Excellent familiarity with elastic net logistic regression, randomforest and XGBoost ensembles to work on supervised problems with structured, tabular data. We currently use Scikit-learn, and we're open to more »
to code or have programming experience, especially in Python. Some experience with theoretical concepts of statistical learning (e.g. hypothesis testing, Bayesian Inference, Regression, SVM, Random Forests, Neural Networks, Natural Language Processing, optimisation). Experience with some coding libraries frequently used in data science. The ability to communicate effectively. Experience more »
parameter estimation, factors selection, PCA, hypothesis testing, time series, queuing theory, survival analysis, clustering, linear programming. Experience with machine learning methods, such as regularization, random forests, neural networks and deep learning. Ability to write algorithms and implement pipelines in Python. Knowledge of Scala, R, is a plus. Experienced in more »
related analytical background. Proficiency in programming languages such as SAS, R, Matlab, or Python. Experience with predictive modelling techniques like Logistic Regression, Decision Trees, Random Forests, etc. Familiarity with Emblem and Radar. Strong communication skills to convey statistical concepts to non-statistical audiences. Self-motivated with excellent planning and more »
Solid understanding of computer science fundamentals, including data structures, algorithms, data modeling, and software architecture. Proficiency in classical machine learning algorithms (e.g., Logistic Regression, RandomForest, XGBoost) and modern deep learning algorithms (e.g., BERT, LSTM). Strong knowledge of SQL and Python's data analysis ecosystem (Jupyter, Pandas more »
Possess vast experience and expertise of working with probability and statistics, inclusive of machine learning, experimental design, and optimisation Experience using Gradient Boosting Machines, RandomForest, Neural Network or similar algorithms Proven and successful track record of leading high-performing data analyst teams through the successful performance of more »