Text Mining

2018-2019
This course is taught in English. Descriptions may therefore be available only in English.

Course objective

You will become acquainted with the possibilities and problems of
automatic analysis of natural language by computers. You will gain
practical knowledge: you will learn to use existing technology and
experience the obstacles and options of the domain. You will learn
about the theories behind language technology and its connection to
artificial intelligence, linguistics and the Semantic Web. You will
choose your own project, in which you apply the technologies you have
learned, evaluate the results and communicate your findings in a report.

Course content

It is estimated that about 80% of knowledge is captured in language:
think of news, wikis, social media and handbooks. Searching for
information is also largely done through language. The amount of
information is too large for humans to oversee, which is why
technologies are developed to access and use this information more
efficiently.

Text Mining is a promising research domain whose goal is to extract
structured information from unstructured natural language. This is a
major challenge, as human language is a rich and complex medium that
must be understood in the context of human social interaction. Language
technology therefore analyses language at different levels: the
grammatical level (e.g. word types and syntax) and the semantic level
(e.g. entities, events, opinions). During the course you will learn how
this information is encoded in text and how you can extract and present
it using computers.
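
As a rough illustration of what extracting structured information from
unstructured text can look like, the sketch below counts candidate
entities in a short text. It is not course material: the function names,
the sample text and the heuristic (a capitalized word that does not open
a sentence is treated as a candidate entity) are assumptions made for
this example, and it uses only the Python standard library rather than
any toolkit used in the labs.

```python
import re
from collections import Counter


def split_sentences(text):
    """Split text after sentence-final punctuation (a crude approximation)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_tokens(sentence):
    """Return the alphabetic word tokens of a sentence."""
    return re.findall(r"[A-Za-z]+", sentence)


def candidate_entities(text):
    """Count capitalized words that do not open a sentence.

    This is a naive heuristic for illustration only, not a real
    named-entity recognizer.
    """
    counts = Counter()
    for sentence in split_sentences(text):
        # Skip the first token: it is capitalized regardless of its type.
        for token in word_tokens(sentence)[1:]:
            if token[0].isupper():
                counts[token] += 1
    return counts


# Hypothetical sample text, made up for this example.
SAMPLE = ("The course is taught in Amsterdam. "
          "Students in Amsterdam write Python programs.")

if __name__ == "__main__":
    for entity, count in candidate_entities(SAMPLE).most_common():
        print(entity, count)
```

The result is a small structured table (entity, frequency) derived from
free text. A heuristic like this misses entities at the start of a
sentence and cannot tell a place from a programming language.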

Teaching methods

Lectures (2 hours/week) and labs (2 hours/week).

Assessment

Assignments and exam:
50% final assignment (group);
50% exam.

To pass the course, none of the grades may be lower than 5.5.
Attendance at the final assignment presentation session is mandatory,
and all but one of the practical assignments must be passed.

Required prior knowledge

None

Literature

Will be announced on Canvas.

Target audience

3IMM, 3LI, BA

Additional information

This module is compulsory in the third year of Lifestyle Informatics and
elective for Informatie, Multimedia & Management and Business Analytics.
The course can only be completed if the grades for the test components
(assignment + exam) are at least 5.5 each, with a weighted average of
more than 6, and all but one of the practical assignments were passed.
Attendance at the final event where students present their final
assignments is compulsory.

This course is also interesting to students from other faculties as many
fields deal with text and can benefit from automated text analysis (e.g.
digital humanities, financial domain). Specific prior knowledge is not
required, but affinity with computers is needed as the lab sessions and
assignment require some Python programming.

Recommended prior knowledge

Information Retrieval and Python

General information

Course code L_PABAALG002
Credits 6 EC
Period P4
Course level 300
Language of instruction English
Faculty Faculteit der Geesteswetenschappen (Faculty of Humanities)
Course coordinator prof. dr. P.T.J.M. Vossen
Examiner prof. dr. P.T.J.M. Vossen
Lecturers dr. H.D. van der Vliet
dr. E. Maks
prof. dr. P.T.J.M. Vossen

Practical information

You must register for this course yourself.

Last-minute registration is available for this course.

Teaching formats Seminar, Lecture
Target audiences

This course is also available as: