Multivariate Statistics

2019-2020

Course Objective

Upon completing this course, students have a thorough knowledge of
multivariate distributional theory and its relevance for econometrics
and data science, as well as of core dimension reduction and
classification methods for multivariate data.
Students are able to operationalize these techniques on real or
simulated data using their own code or scripts in Python or R, as well as
standard libraries available for these tasks in either of these
languages.

Course Content

This course introduces the theory and applications for
analyzing multi-dimensional data. Topics include multivariate
distributions, Gaussian models, fat-tailed multivariate distributions,
copulas, mixture models, multivariate inference, dimension reduction
methods such as principal components and factor models, and
classification and clustering methods. Course content is subject to
change in order to keep it up to date with new developments in
data science.

Teaching Methods

There are two 2-hour lectures and one 2-hour tutorial, possibly
complemented by a 2-hour mix of tutorial and computer lab.

Method of Assessment

Written exam plus assignments.

Literature

Härdle, W.K., and L. Simar (2015): Applied Multivariate Statistical
Analysis. Springer, 4th ed. Available for free via the VU library
(search for "Hardle Simar" in the catalogue). There is an accompanying
book available for free in the library with the solutions to all
exercises in the book (search for "Hardle Hlavka" in the catalogue).

Recommended background knowledge

This may be one of your first courses that builds extensively on
previous coursework. We will use:
- linear algebra skills (as we deal with multivariate data, we will
abundantly use matrix notation and concepts);
- statistics + introduction to data science (for standard concepts such
as distributions, densities, expectations, conditional distributions,
standard univariate distributions (normal, Student t, chi-squared, F),
maximum likelihood estimation);
- analysis / calculus (for limits, integration, multivariate change of
variables in integrands);
- numerical analysis (for simulation, random number generation,
optimization, inverse cdf technique);
- econometrics 1 (for linear regression analogy, maximum likelihood,
matrix notation in the linear regression model, some multivariate normal
distribution theory).
We will briefly brush up on this knowledge where needed, but students are
themselves responsible for revisiting their previous course material in
case they have forgotten relevant parts.

General Information

Course Code E_EOR2_MS
Credits 6 EC
Period P4
Course Level 200
Language of Tuition English
Faculty School of Business and Economics
Course Coordinator prof. dr. A. Lucas
Examiner prof. dr. A. Lucas
Teaching Staff prof. dr. A. Lucas

Practical Information

You need to register for this course yourself.

Last-minute registration is available for this course.

Teaching Methods Study Group, Lecture
Target audiences

This course is also available as: