Data Mining Techniques

2019-2020

Course Objective

The aim of the course is that students acquire data mining knowledge and
skills that they can apply in a business environment. More precisely,
the following learning goals are distinguished:
1. Understand the data mining process
2. Understand exploratory data analysis
3. Understand feature engineering
4. Understand classification techniques
5. Understand regression techniques
6. Understand association rules and recommender systems
7. Understand current state of the art in data mining
8. Understand ethical concerns and evaluation of results
9. Learn to argue for choices (rationale)

The course is focused on the Dublin descriptors Applying Knowledge and
Understanding (since the aim is to improve practical skills); Making
Judgements (which technique is appropriate, how best to apply data
mining to a specific case); Communication Skills (how to report on your
approach, choices, and results); and Learning Skills (being able to find
new relevant techniques, assess their suitability, etc.).

How the aims are to be achieved: Students will acquire knowledge and
skills mainly through the following: an overview of the most common data
mining algorithms and techniques (in lectures), a survey of typical and
interesting data mining applications, and practical assignments to gain
hands-on experience. The application of skills in a business
environment will be simulated through the course assignments.

Course Content

The course is intended to introduce Data Mining Techniques to students
who are new to the field as well as to more experienced students. The
main aim is to gain a more practical perspective on Data Mining
Techniques/Machine Learning. Lectures will cover the basics for those
new to the field (a general introduction to data mining and classical
algorithms such as decision trees, association rules, neural networks,
ensemble learning, etc.) and, on top of that, will discuss advanced
topics including deep learning, recommender systems, big data
infrastructures, and text mining. A number of successful applications in
the area will also be discussed. In addition to lectures, there will be
an extensive practical part, where students will experiment with various
data mining algorithms and data sets. The grade for the course will be
based on these practical assignments (i.e., there will be no final
examination).

Teaching Methods

Lectures (h) and practical sessions (pra). Lectures are planned to be
interactive, for example through short questions during the lecture.

Method of Assessment

Practical assignments (i.e., there is no exam). There will be two
assignments, done in groups of three. For the first assignment there is
a choice between a basic assignment (suited to those new to the domain)
and a more advanced one (for students with more experience). The second
assignment is the same for all and will involve an in-class Kaggle
competition. It is also possible to get a grade without doing these
assignments by doing a real research project instead (which will most
likely involve more work, but can also be more rewarding). For the
regular assignments, the first assignment counts for 40% and the second
for 60%. The grades for both assignments need to be sufficient to pass
the course.

Literature

Optional: Ian H. Witten, Eibe Frank, Mark A. Hall, and Christopher J.
Pal. Data Mining: Practical Machine Learning Tools and Techniques
(Fourth Edition). Morgan Kaufmann, 2016. ISBN 978-0128042915.

Target Audience

mBA, mCS, mAI, mBio

Recommended background knowledge

Statistics.

General Information

Course Code: X_400108
Credits: 6 EC
Period: P5
Course Level: 500
Language of Tuition: English
Faculty: Faculty of Science
Course Coordinator: dr. M. Hoogendoorn
Examiner: dr. M. Hoogendoorn
Teaching Staff: dr. M. Hoogendoorn

Practical Information

You need to register for this course yourself.

Last-minute registration is available for this course.

Teaching Methods: Lecture, Computer lab