Machine Learning

Brief summary of the course

Observing the world and compressing these observations into compact rules have been of great importance to humankind for ages. Nowadays we collect and generate so much data that no human can analyze it all. Machine learning is the field of science concerned with designing computer algorithms that learn important patterns directly from large volumes of data without being explicitly programmed to do so. In this course, we are going to look into the principles and techniques at the core of machine learning. Topics will include the notions of supervised and unsupervised learning; classification, regression, clustering, and dimensionality reduction methods; the deceptive effects of overfitting; and ways to estimate models’ generalization power. To make the learning process interactive and to gain practical skills, we will practice implementing many of the mentioned algorithms in Python using Google Colaboratory, an interactive environment.

Course topics

Part 1. Supervised Learning

  • Nearest Neighbour Classifier/K-Nearest Neighbour Classifier
  • Linear Regression
  • Decision Trees
  • Overfitting
  • Train-val-test split
  • Cross-validation algorithm
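
To give a flavour of the exercises in this part, here is a minimal sketch of a k-nearest-neighbour classifier evaluated on a held-out test split. It is an illustration only, not the course's reference implementation: the function names (`knn_predict`, `make_toy_data`, `split_train_test`, `accuracy`), the synthetic two-blob data set, and the 80/20 split are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    neighbours = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class=50, seed=0):
    """Two Gaussian blobs in 2D, labelled 0 and 1 (synthetic data, for illustration)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(label)
    return points, labels


def split_train_test(points, labels, test_fraction=0.2, seed=0):
    """Shuffle the indices and hold out a fraction of the data for testing."""
    indices = list(range(len(points)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [points[i] for i in train_idx], [labels[i] for i in train_idx]
    test = [points[i] for i in test_idx], [labels[i] for i in test_idx]
    return train, test


def accuracy(train, test, k=3):
    """Fraction of held-out points whose predicted label matches the true one."""
    (train_x, train_y), (test_x, test_y) = train, test
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_y)


points, labels = make_toy_data()
train, test = split_train_test(points, labels)
print(f"held-out accuracy with k=3: {accuracy(train, test):.2f}")
```

Measuring accuracy only on points the classifier never saw during training is what makes the number an estimate of generalization rather than of memorization.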

Part 2. Unsupervised Learning

  • Principal component analysis
  • UMAP / t-SNE
  • K-means clustering
  • Hierarchical clustering
  • DBSCAN
  • Methods for estimating number of clusters
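
As an example of the clustering methods listed above, the following is a minimal sketch of K-means in its basic form (Lloyd's algorithm), assuming the initial centroids are supplied by the caller. The function name `kmeans`, its parameters, and the four-point data set are hypothetical choices for illustration; practical implementations add smarter initialisation and multiple restarts.

```python
import math


def kmeans(points, initial_centroids, n_iterations=10):
    """Alternate assignment and update steps until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = []
    for _ in range(n_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, assignments = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)    # [(0.0, 0.5), (10.0, 10.5)]
print(assignments)  # [0, 0, 1, 1]
```

The number of clusters is fixed by how many initial centroids are passed in, which is why methods for estimating that number form a topic of their own.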

Part 3. Deep Learning

  • Artificial Neuron
  • Feedforward path
  • Backpropagation algorithm
  • Basics of Convolutional Neural Networks

Part 4. Regularisation

  • L1 & L2 regularisation
  • LASSO regression
  • Ridge regression
  • Weight decay
  • Dropout
  • Data Augmentation

Part 5. Ensemble methods

  • Basic ensembling (averaging, majority vote)
  • Bagging
  • Random Forest
  • Boosting
  • XGBoost
  • Stacking & Blending

Part 6. Performance metrics

  • Accuracy, precision, recall and F1-score
  • Confusion matrix
  • ROC & AUC
  • MSE & RMSE
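
The classification and regression metrics above can be computed directly from their definitions. The sketch below is one possible way to do so for a binary problem; the helper names (`confusion_counts`, `classification_metrics`, `mse`, `rmse`), the convention of returning 0.0 when a denominator is zero, and the example labels are assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives and true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def mse(y_true, y_pred):
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, expressed in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(confusion_counts(y_true, y_pred))        # (3, 1, 1, 3)
print(classification_metrics(y_true, y_pred))  # every metric equals 0.75 here
print(mse([0, 0], [3, 4]))                     # 12.5
```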

Prerequisites

Lecture sample