Statistics:

Multivariate Statistics 1

In this course, participants will explore unsupervised learning techniques, including PCA for dimension reduction and various clustering methods such as k-means, hierarchical clustering, and DBSCAN. The course provides both theoretical insights into multivariate analysis and practical exercises using R, helping participants apply these techniques to their own datasets.

Participants will learn when and how to apply unsupervised learning methods: PCA for dimension reduction, and clustering techniques such as k-means, hierarchical clustering, and hybrid approaches that combine the two. The course also covers rotation techniques applied after dimension reduction, as well as mixture models, heatmaps, and more advanced clustering methods (DBSCAN, Louvain). The content is designed to provide a foundational understanding of the theory behind multivariate analysis. Each topic is accompanied by hands-on exercises using the statistical software R. Participants are encouraged to ask questions and seek advice on analyzing their own datasets.

Topics:

This course on multivariate statistics covers two different topics:

  • Dimension reduction methods: This first chapter focuses on principal component analysis (PCA): what happens "under the hood", how many principal components to retain, and how to visualize and interpret the results. The lecture also gives a brief overview of rotation techniques and of other unsupervised multivariate methods (e.g., for categorical variables or for data structured into groups).
  • Cluster analysis: This second chapter describes the dissimilarity and distance measures that can be used to define clusters. It focuses on the two most frequently used clustering methods, k-means and hierarchical clustering, and on the combination of these two methods into hybrid algorithms. The chapter also covers the theory and application of mixture models, as well as the R commands for producing heatmaps alongside the result of a clustering algorithm. Finally, two other clustering methods, DBSCAN and the Louvain method for community detection, are introduced at the end of the lecture.

Methods:

Each day consists of blocks that first cover the theory behind the methods and then their application in R. Theoretical lessons are followed by hands-on examples with best-practice solutions.

Learning goals

Understand and Apply Principal Component Analysis (PCA)

  • Describe the principles of PCA, how to determine the number of components to retain, and how to interpret them.
  • Apply rotation techniques to make the components easier to interpret.
  • Use hands-on exercises to confidently apply PCA to real-world data using R.

Understand and Apply Clustering Methods

  • Choose appropriate dissimilarity measures for your data.
  • Explore different clustering techniques, such as k-means, hierarchical clustering, and some hybrid approaches.
  • Execute clustering methods in R using real-world data.

Explore Advanced Clustering Approaches

  • Understand mixture models, DBSCAN, and Louvain clustering to identify complex data structures and communities.
  • Create heatmaps to visualize clustering outcomes and enhance interpretability of multivariate analyses.

Course date

Register now: May 06–23, 2025

For more information on how to register, please follow the link on the course date.

Prerequisites

Programming skills with R (as taught in the course “Introduction to R”) and basic knowledge of statistics (as taught in the course “Introduction to Statistics”).

Target group

This course is open to researchers at all career stages and to anyone interested in learning about the subject.

This course is free of charge.
