Optimizing Internet with Data Science

Course Description

This course dives into the design of some of the most critical infrastructures for content delivery and distribution on the World Wide Web, and into approaches for optimizing them through data analysis and mining. It includes practical examples of designing and implementing supervised learning algorithms for smart caching, predictive content prefetching and dynamic information retrieval, illustrated with cases from Internet giants (BBC iPlayer and Skyscanner). Students are expected to be familiar with the basics of supervised machine learning and comfortable programming in Python.

Course topics

  • content delivery and distribution — insights into the design of high-capacity Internet systems
  • smart caching, predictive prefetching, dynamic information retrieval — data mining techniques for optimizing critical internet infrastructure
  • industrial experience — examples from BBC and Skyscanner

 

Lecturer

Dr. Dmytro Karamshuk

Senior Data Scientist at Skyscanner

Dima (@karamshuk) is a Senior Data Scientist at Skyscanner, where he focuses on applying data mining and machine learning techniques to optimize content caching and distribution. Prior to Skyscanner, Dima was with King’s College London, where he worked on the analysis of BBC iPlayer (a joint project with the BBC) and various social media websites (Twitter, Pinterest, Foursquare, etc.). He is an active contributor to the computer networking (Infocom, ComMag, etc.) and data mining (KDD, WWW, etc.) communities. Dima’s work has been featured in New Scientist, BBC News and other media outlets. He also co-founded stanfy.com and formerly served as its CEO.

Fields of interest: data mining, content delivery and distribution, social media

Contacts

http://karamshuk.github.io
http://twitter.com/karamshuk