Statistical Data Analysis


Course Objective

After the student has followed the course, she/he should be able to
⦁ find, with the help of QQ-plots, symplots, histograms, box plots,
goodness-of-fit tests, etc., a suitable model distribution for a dataset
at hand, e.g., a normal or exponential distribution, and estimate the
unknown (e.g. location and/or scale) parameters,
⦁ describe the data distribution quantitatively and qualitatively (e.g.,
symmetry, presence of outliers) with the help of the computer software
R and estimate the underlying density function,
⦁ decide, by taking characteristics of the dataset into account, which
statistical method is preferred (e.g. to use a nonparametric test
statistic, or to make a trade-off between robustness and efficiency of
an estimator) to draw conclusions on the population underlying the data,
for example with the help of hypothesis tests and confidence intervals,
⦁ apply tests for location parameters, stochastic order, or equality in
distribution in two-sample problems, and be able to assess the
asymptotic relative efficiency of tests,
⦁ apply (again in R) resampling methods such as the bootstrap or random
permutation to find characteristics of a statistic, even if no model
assumptions are made,
⦁ analyse, with the help of rank-based correlation tests, chi-square
tests for contingency tables, or multiple linear regression, the
relationship between two or more variables in a given dataset. In the
context of multiple linear regression, the student should be able to
identify influential observations and select variables for the linear
regression model.

In all the above, the student should be able to present the findings

Course Content

This is an advanced-level statistical data analysis course that builds
on an introductory course on statistics, e.g. Statistics (Algemene
Statistiek). The course introduces the students to several widely used
statistical models and methods, and the students are taught how to apply
these tools to real data with the use of the statistical software
package R. The following subjects are covered:
- summarizing data;
- investigating the distribution of data;
- density estimation;
- nonparametric methods;
- bootstrap;
- two-sample problems;
- contingency tables;
- multiple linear regression.
The course is a combination of theory (in the lectures) and practice (in
the computer classes) in such a way that the theory is explicitly linked
to the practice of statistical data analysis.

Teaching Methods

Lectures (13x2h; once per week), computer classes (13x2h; once per week).
Attendance is not mandatory but strongly recommended.

Method of Assessment

Homework assignments in R and two written exams.
The average of the assignment grades makes up 50% of the final grade;
the exam grade makes up the other 50%. Both of these grades must be at
least 5.5; otherwise, the course is failed.
The exam grade equals either the average of the two partial exams
written during the semester (provided each is passed with a grade of at
least 4.0) or the resit exam grade.
If the resit exam is written, the homework assignment grades still count
towards the final course grade as explained above.
If both partial exams are passed with an average of at least 5.5, taking
the resit exam is no longer possible.
There is no resit possibility for the assignments.
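
The grading rules above can be illustrated with a small sketch. It is not an official grade calculator: the function names, the equal weighting of the two partial exams, the absence of rounding, and the reading that a written resit replaces the partial exams are all assumptions made for illustration.

```python
from statistics import mean

PASS_MARK = 5.5        # minimum for both the assignment average and the exam grade
PARTIAL_MINIMUM = 4.0  # minimum per partial exam for the partial average to count


def exam_grade(partial_1, partial_2, resit=None):
    """Return the exam grade, or None if no valid exam grade exists yet."""
    if resit is not None:
        # Assumption: once written, the resit grade replaces the partial exams.
        return resit
    if min(partial_1, partial_2) >= PARTIAL_MINIMUM:
        # Assumption: both partial exams are weighted equally.
        return (partial_1 + partial_2) / 2
    return None


def final_grade(assignments, partial_1, partial_2, resit=None):
    """Return a pair (final grade or None, whether the course is passed)."""
    assignment_avg = mean(assignments)
    exam = exam_grade(partial_1, partial_2, resit)
    if exam is None:
        return None, False
    final = 0.5 * assignment_avg + 0.5 * exam
    # Both components must reach the pass mark, not only the weighted result.
    passed = assignment_avg >= PASS_MARK and exam >= PASS_MARK
    return final, passed


print(final_grade([7.0, 8.0], 6.0, 7.0))             # partial exams count
print(final_grade([7.0, 8.0], 3.5, 9.0))             # one partial below 4.0
print(final_grade([7.0, 8.0], 3.5, 9.0, resit=6.0))  # resit replaces partials
print(final_grade([5.0, 5.0], 8.0, 8.0))             # assignments below 5.5
```

In the last call the weighted result is 6.5, yet the course is failed because the assignment average is below 5.5.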


Literature

Lecture notes.

Target Audience

2BA, 2W, 2W-B, 3W, 3W-B, 3Ect.

Additional Information

Language of tuition: English

Recommended background knowledge

The required knowledge has been obtained if the students have previously
passed the VU courses Statistics (X_400004) and Probability Theory
(X_400622) or equivalent courses.

General Information

Course Code X_401029
Credits 6 EC
Period P4+5
Course Level 300
Language of Tuition English
Faculty Faculty of Science
Course Coordinator dr. D. Dobler
Examiner dr. D. Dobler
Teaching Staff dr. D. Dobler

Practical Information

You need to register for this course yourself

Last-minute registration is available for this course.

Teaching Methods Seminar, Lecture