Course Objective
After following the course, the student should be able to
⦁ find, with the help of QQ-plots, symplots, histograms, box plots,
goodness-of-fit tests, etc., a suitable model distribution for a dataset
at hand, e.g., a normal or exponential distribution, and to estimate the
unknown (e.g. location and/or scale) parameters,
⦁ describe the data distribution quantitatively and qualitatively (e.g.,
symmetry, presence of outliers) with the help of the computer software
R and estimate the underlying density function,
⦁ decide, by taking characteristics of the dataset into account, which
statistical method is preferred (e.g. to use a nonparametric test
statistic, or to make a trade-off between robustness and efficiency of
an estimator) to draw conclusions on the population underlying the data,
for example with the help of hypothesis tests and confidence intervals,
⦁ apply tests for location parameters, stochastic order, or equality in
distribution in two-sample problems, and be able to assess the
asymptotic relative efficiency of tests,
⦁ apply (again in R) resampling methods such as the bootstrap or random
permutation to find characteristics of a statistic, even if no model
assumptions are made,
⦁ analyse, with the help of rank-based correlation tests, chi-square
tests for contingency tables, or multiple linear regression, the
relationship between two or more variables in a given dataset. In the
context of multiple linear regression, the student should be able to
identify influential observations and select variables for the linear model.
In all the above, the student should be able to present the findings
Course Content
This is an advanced-level statistical data analysis course that builds
on an introductory course on statistics, e.g. Statistics (Algemene
Statistiek). The course introduces the students to several widely used
statistical models and methods, and the students are taught how to apply
these tools to real data with the use of the statistical software
package R. The following subjects are covered:
- summarizing data;
- investigating the distribution of data;
- density estimation;
- nonparametric methods;
- two-sample problems;
- contingency tables;
- multiple linear regression.
The course is a combination of theory (in the lectures) and practice (in
the computer classes) in such a way that the theory is explicitly linked
to the practice of statistical data analysis.
Teaching Methods
Lectures (13x2h; once per week), computer classes (13x2h; once per week).
Attendance is not mandatory but strongly recommended.
Method of Assessment
Homework assignments in R and two written exams.
The final grade has two components, each weighted 50%: the average of
the assignment grades and the exam grade. Both components must be at
least 5.5; otherwise, the course is failed.
The exam grade equals either the average of the two partial exams
written during the semester (provided each is graded at least 4.0) or
the resit exam grade.
If the resit exam is taken, the homework assignment grades still count
towards the final course grade as explained above.
If both partial exams are passed with an average of at least 5.5, taking
the resit exam is no longer possible.
There is no resit for the assignments.
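The grading rules above can be sketched as a short calculation. This is only an illustration, not an official grade calculator: the function names are hypothetical, and it assumes that a resit grade replaces the partial-exam average, that no rounding is applied, and that a failed course still reports the unrounded weighted grade. None of these details is stated in the course description.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

PASS_MARK = 5.5        # minimum for the assignment average and for the exam grade
PARTIAL_MINIMUM = 4.0  # minimum per partial exam for their average to count


def exam_grade(partial_1: float, partial_2: float,
               resit: Optional[float] = None) -> Optional[float]:
    """Return the exam grade, or None if no valid exam grade exists."""
    partials_count = min(partial_1, partial_2) >= PARTIAL_MINIMUM
    partial_average = mean([partial_1, partial_2])
    if partials_count and partial_average >= PASS_MARK:
        # The exam is already passed via the partials, so no resit is possible.
        if resit is not None:
            raise ValueError("resit not possible: partial exams already passed")
        return partial_average
    if resit is not None:
        # Assumption: the resit grade replaces the partial-exam average.
        return resit
    return partial_average if partials_count else None


def final_grade(assignments: Sequence[float], partial_1: float, partial_2: float,
                resit: Optional[float] = None) -> Tuple[Optional[float], bool]:
    """Return (final grade, passed) under the assumptions stated above."""
    assignment_average = mean(assignments)
    exam = exam_grade(partial_1, partial_2, resit)
    if exam is None:
        return None, False
    grade = 0.5 * assignment_average + 0.5 * exam
    passed = assignment_average >= PASS_MARK and exam >= PASS_MARK
    return grade, passed
```

For example, `final_grade([7.0, 8.0], 6.0, 7.0)` combines an assignment average of 7.5 with an exam grade of 6.5, while `final_grade([5.0, 5.0], 8.0, 8.0)` is reported as failed because the assignment average is below 5.5.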
Target Audience
2BA, 2W, 2W-B, 3W, 3W-B, 3Ect.
Additional Information
Language of tuition: English
Recommended background knowledge
The required knowledge has been obtained if students have previously
passed the VU courses Statistics (X_400004) and Probability Theory
(X_400622) or equivalent courses.
Faculty: Faculty of Science
Course Coordinator: dr. D. Dobler
Examiner: dr. D. Dobler
You need to register for this course yourself.
Last-minute registration is available for this course.
Teaching Methods: Seminar, Lecture