Tallinn, Estonia

Introduction to Machine Learning with R

when 15 July 2024 - 24 July 2024
language English
duration 2 weeks
credits 4 EC
fee EUR 400

This is a 32-hour hands-on course for statistical data analysis using R. The main goal of this course is to empower participants to use R for data analysis and machine learning applications.

The course introduces students to the statistical programming language R and the use of RStudio. We will cover the concepts of data manipulation and data preparation as well as uni- and bivariate statistics in R.

We will cover the so-called grammar of graphics in R with the ggplot2 package to create publication-ready data visualizations. We will also discuss how to compute basic descriptive statistics (such as mean, standard deviation, correlation) in R to describe your data.

Our main focus will be the discussion of a selection of machine learning algorithms and their implementation in R. For example, we will model the factors that influenced the survival of the Titanic passengers, predict customer churn for a telecommunications company, and classify traffic signs based on images.

The course is designed to give a robust theoretical understanding of the methods and allow students to use the algorithms with real-world data sets.

Course leader

Dr. Daniel Hoppe is a Professor of Business Administration, with a focus on Retail Management and e-Commerce, at Cooperative University Gera-Eisenach.

Target group

Generally, anyone interested in learning the statistical programming language R for data analysis and the application of machine learning algorithms is welcome to apply. Specifically:
* aspiring Bachelor students (after successfully passing the statistics course)
* Master's students / PhD students.

No previous knowledge of R is required. However, basic statistical knowledge (descriptive and analytical statistics) is recommended.
Students should bring their own laptop (Windows or Mac) with R and RStudio installed. Details on how to install R and RStudio will be provided.

Course aim

- Introduction to the statistical programming language R and the use of RStudio
- Data manipulation and data preparation in R
- Uni- and bivariate statistics in R
- Grammar of graphics in R (with ggplot2) to produce publication-ready graphics
- Theoretical understanding of selected machine learning algorithms (e.g. logistic regression, decision trees and random forests, k-nearest-neighbours, hierarchical cluster analysis)
- Practical application of selected machine learning algorithms in R

Credits info

4 EC
Assessment criteria: a written assignment (10 to 15 pages of text plus R code) in which uni- and bivariate statistics, graphical visualization and at least one machine learning algorithm are applied to a data set.

Fee info

EUR 400: Early-Bird Course Fee until 31 March 2024.
EUR 450: Regular Course Fee after 31 March 2024.

NB! Accommodation, cultural programme and meals are not included in the price.

Register for this course
on course website