Project Big Data

2019-2020
This course is taught in English. Descriptions may therefore be available in English only.

Course objective

After completing this course:
1. the student can transform and explore data with the command line
2. the student can extract data with regular expressions
3. the student can import and process static and streaming data in
Python
4. the student can store and retrieve semi-structured data in and from
a database
5. the student can parallelize tasks via MapReduce, threads and/or
queues in Python
6. the student can create appropriate and well-formatted visualizations
and tables
7. the student can address a research question and report on their
findings

Course content

This course integrates various aspects of data science and teaches the
fundamentals of working with big data (including an introduction to
Hadoop). Topics include visualization of data; preparing data for
processing (machine learning or data mining); storing unstructured data;
and scaling techniques for working with large volumes of data. Python is
used throughout this hands-on course.

Teaching methods

Lectures and Q&A sessions

Method of assessment

Hand-in assignments, presentation, and a report.
Assignment week 1: 15%
Assignment week 2: 15%
Assignment week 3: 15%
Report: 35%
Presentation: 20%

The weighted average must be 5.5 or higher.

Required prior knowledge

Programming experience in any language

Literature

Slides

Target audience

2BA

General information

Course code X_400645
Credits 6 EC
Period P6
Course level 300
Language of instruction English
Faculty Faculteit der Bètawetenschappen
Course coordinator prof. dr. S. Bhulai
Examiner prof. dr. S. Bhulai
Lecturers

Practical information

You must register for this course yourself.

Last-minute registration is available for this course.

Teaching formats Lecture, Tutorial
Target audiences

This course is also available as: