Text Mining

This course is taught in English. Descriptions may therefore be available only in English.

Course objective

You will get acquainted with the possibilities and problems of automatic
analysis of natural language by computers. Students will obtain
practical knowledge; they will learn to use existing technology and
experience the obstacles and options of the domain. They will learn
about the theories behind language technology and its connection to
artificial intelligence, linguistics and the Semantic Web. The students will
choose their own project, in which they apply what they have learned,
evaluate the results and communicate their findings in a report.

Course content

It is estimated that about 80% of knowledge is captured in language:
think of news, wikis, social media and handbooks. Searching for
information is also largely done through language. The amount of
information is too large for humans to oversee, which is why
technologies are developed to access and use this information more
effectively.

Text Mining is a promising research domain whose goal it is to extract
structured information from unstructured natural language. This is a big
challenge as human language is a rich and complex medium that is to be
understood in the context of social human interaction. Therefore,
language technology analyses language on different levels: the
grammatical level (e.g. word types and syntax), and the semantic level
(e.g. entities, events, opinions). During the course you will learn how
this information is coded in text and how you can extract and present it
using computers.
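
As a rough illustration of these levels, the sketch below uses only the
Python standard library. It is a toy example, not the method taught in the
course: the function names, the tiny opinion lexicon and the sample
sentences are all made up for illustration, and the "entity" step is a naive
capitalization heuristic rather than real named-entity recognition.

```python
import re

# Hypothetical toy lexicon for opinion words (an assumption for this sketch);
# a real system would use a trained model or a curated resource.
POSITIVE = {"good", "great", "promising"}
NEGATIVE = {"bad", "poor", "disappointing"}


def tokenize(text):
    """Split text into word tokens (a very rough first processing step)."""
    return re.findall(r"[A-Za-z]+(?:'[a-z]+)?", text)


def candidate_entities(text):
    """Collect runs of capitalized words that do not start a sentence."""
    entities = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        run = []
        for i, tok in enumerate(tokenize(sentence)):
            if i > 0 and tok[0].isupper():
                run.append(tok)
            else:
                if run:
                    entities.append(" ".join(run))
                run = []
        if run:
            entities.append(" ".join(run))
    return entities


def opinion_score(text):
    """Count positive minus negative lexicon words in the text."""
    words = [tok.lower() for tok in tokenize(text)]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def analyse(text):
    """Turn unstructured text into a small structured record."""
    return {
        "tokens": len(tokenize(text)),
        "entities": candidate_entities(text),
        "opinion": opinion_score(text),
    }


if __name__ == "__main__":
    sample = "The labs use Python. Students in Amsterdam found Text Mining great."
    print(analyse(sample))
```

The heuristic skips the first word of each sentence because it is
capitalized regardless of whether it names anything, which also shows how
quickly simple rules run into the ambiguities of natural language.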


Lectures (2 hours/week) and labs (2 hours/week).


Assignments and exam:
50% final assignment (group);
50% exam.

To pass the course, none of the grades may be lower than 5.5.
Attendance at the final assignment presentation session is mandatory and
all but one of the practical assignments need to be passed.

Required prior knowledge



Will be announced on Canvas.



Additional information

This module is compulsory in the third year of Lifestyle Informatics and
elective for Informatie, Multimedia & Management and Business Analytics.
The course can only be completed if the grades for the test components
(assignment + exam) are at least 5.5 each, with a weighted average of more
than 6, and all but one of the practical assignments were passed.
Attendance at the event where students present their final assignments is
compulsory.

This course is also interesting to students from other faculties as many
fields deal with text and can benefit from automated text analysis (e.g.
digital humanities, financial domain). Specific prior knowledge is not
required, but affinity with computers is needed as the lab sessions and
assignment require some Python programming.

Recommended prior knowledge

Information Retrieval and Python


