Interpreting Information in Text by Humans and Machines

2019-2020
This course is taught in English. Descriptions may therefore be available in English only.

Course objective

In this course, students are trained in systematic text analysis. In
particular, we explore the process of identifying and annotating
information in historical and contemporary texts, such as novels,
lyrics, letters, newspaper articles, movie scripts, blogs and
other social media texts, using manual and automatic methods. Students
learn what this process implies for the theoretical models and concepts
they are familiar with from their own discipline. They work on a
research project of their choice and annotate the texts they select in
an interdisciplinary context, using different tools and
methods. They apply expert and crowd annotation, develop
codebooks and compare the results. Finally, they use
text mining techniques to analyze text and reflect on the
performance of the automatic annotation. We focus on high-level
semantic annotations of, for example, (historical) events, entities and
emotions, which are of interest to a broad range of humanities,
social science and computer science students. Students present their
findings in a research paper.

Course content

This module addresses the process of systematic text analysis through
human annotation and automatic analysis using text mining techniques.
Annotations make information that is
implicit in data explicit, allowing researchers to explore their data,
identify patterns and answer various research questions in a
methodologically sound way.
Annotation requires some type of interpretation model and
results in an analysis that can be compared across annotators. As such,
annotation can be seen as an important step towards the formalization of
the humanities and social sciences as disciplines. The degree to which
annotators agree or disagree (the so-called inter-annotator agreement)
tells us something about the reproducibility of the interpretation
process, the maturity of theoretical notions and the criteria used to
apply them to real data.
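
One common way to quantify inter-annotator agreement is Cohen's kappa, which corrects the observed agreement between two annotators for the agreement expected by chance. The course description does not say which agreement measure is used, so the following is only a minimal sketch under that assumption. The function name `cohen_kappa` and the emotion labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items that received the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: based on how often each annotator uses each label.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical emotion labels from two annotators for the same six sentences.
annotator_1 = ["joy", "anger", "joy", "fear", "joy", "anger"]
annotator_2 = ["joy", "anger", "fear", "fear", "joy", "joy"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

For these made-up labels the annotators agree on four of six items, and kappa comes to roughly 0.48. A value of 1 means perfect agreement, and a value near 0 means agreement no better than chance.
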
Annotated data can be used to evaluate text mining techniques that
automatically identify the same or similar information. How do these
techniques work? Can a machine do better than humans? Is it possible to
use the automatic annotations to extract useful information from the
text and to answer the research questions?
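
Evaluating automatic annotations against human annotations is often done with precision, recall and F1. The sketch below assumes exact-match scoring, where an automatic annotation counts as correct only if its span and label are identical to a human one. The function name, the offsets and the labels are hypothetical and not taken from the course material.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted annotations against gold annotations by exact match."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical entity annotations as (start offset, end offset, label).
gold = {(0, 9, "PERSON"), (24, 33, "LOCATION"), (40, 44, "DATE")}
predicted = {
    (0, 9, "PERSON"),
    (24, 33, "PERSON"),    # right span, wrong label: counts as an error
    (40, 44, "DATE"),
    (50, 55, "LOCATION"),  # span that no human annotated
}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

In this made-up case two of the four automatic annotations are correct (precision 0.5) and two of the three human annotations are found (recall about 0.67), giving an F1 of about 0.57.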

Humanities scholars and social scientists learn
to represent their interpretation of texts in a data structure. Computer
science students learn how text mining
technologies can be applied in the humanities and social sciences.
Annotators with different backgrounds produce
different types of annotations. Linguists, (cultural) historians,
social scientists and literary scholars consider sources and
data differently and consequently arrive at different
annotations of the same source or data.
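
Representing an interpretation in a data structure can be as simple as storing, for each annotation, the character span it covers, the label assigned and who assigned it. The following is an illustrative sketch only: the `Annotation` class, its fields, the example sentence and the labels are all hypothetical and do not describe the tools used in the course.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int      # character offset where the span begins
    end: int        # character offset just past the end of the span
    label: str      # the interpretation assigned to the span
    annotator: str  # who made the annotation

    def text(self, source: str) -> str:
        """Return the annotated stretch of the source text."""
        return source[self.start:self.end]


source = "Anne Frank wrote her diary in Amsterdam."

# Annotators with different backgrounds may mark different things.
annotations = [
    Annotation(0, 10, "PERSON", "historian"),
    Annotation(30, 39, "LOCATION", "historian"),
    Annotation(21, 26, "DOCUMENT", "linguist"),
]

# Group the annotated text and labels by annotator.
by_annotator = defaultdict(list)
for ann in annotations:
    by_annotator[ann.annotator].append((ann.text(source), ann.label))

for annotator, spans in by_annotator.items():
    print(annotator, spans)
```

Keeping the annotator on each record makes it possible to compare the annotations that people from different disciplines produce for the same source.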

Teaching method

Lecture, Seminar (2 hrs a week each)

Assessment

Weekly assignments and a final research paper.

Required prior knowledge

Python (basics)

Target audience

Third-year bachelor students, in particular in Humanities, Social
Sciences and Computer Science, and students in the minor Digital
Humanities and Social Analytics.

Alternative registration procedure

This module is taught at the VU. Module registration at the VU is required.

General information

Course code L_PABAALG005
Credits 6 EC
Period P2
Course level 200
Language of instruction English
Faculty Faculty of Humanities
Course coordinator dr. E. Maks
Examiner dr. E. Maks
Lecturers dr. E. Maks

Practical information

You must register for this course yourself.

Last-minute registration is available for this course.

Teaching formats Seminar, Lecture
Target groups

This course is also open as: