Algorithms in Sequence Analysis

2019-2020
This course is taught in English. Descriptions may therefore be available only in English.

Course objective

Have you ever wondered how we can track a gene across 3 billion years of
evolution? Sequence alignment can be used to compare genes from humans
and bacteria, using a dynamic programming algorithm. In this course we
focus on algorithms for biological sequences that can be applied to real
scientific problems in biology.

Students will obtain in-depth knowledge about the theory of sequence
analysis methods. They will also develop understanding and skills to
apply the algorithms to protein and DNA sequences. We would like to
stress that no biological knowledge is required to enter this course.

Goals
- At the end of the course, the student will be aware of the major
issues, methodology and available algorithms in sequence analysis.
- At the end of the course, the student will have hands-on experience in
tackling biological problems using sequence analysis algorithms and
applying the general statistical framework of Hidden Markov Models.
- At the end of the course, the student will be able to implement
several of the most important algorithms in sequence analysis.

Course content

Theory:
- Dynamic programming, database searching, pairwise and multiple
alignment, probabilistic methods including hidden Markov models, pattern
matching, entropy measures, evolutionary models, and phylogeny.
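
To give a flavour of how dynamic programming applies to pairwise alignment, the sketch below fills a score matrix for global alignment (in the style of Needleman-Wunsch) and traces back one optimal alignment. It is a minimal illustration, not course material: the function name and the match, mismatch and gap scores are assumptions chosen for readability.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b using a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The default scores are
    illustrative assumptions, not values prescribed by the course.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

The matrix takes time and memory proportional to the product of the two sequence lengths, which is why this approach suits pairwise comparison of genes rather than naive whole-database search.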

Practical:
- Programming (in Python) an alignment algorithm based on dynamic
programming
- Aligning sequencing data from tumors to the human genome and analysing
structural variants
- Programming (in Python) an implementation of Hidden Markov Models and
using it to predict protein domain structure
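
As a rough idea of what a Hidden Markov Model implementation involves, the sketch below decodes the most probable state path with the Viterbi algorithm in log space. Everything in the toy model is hypothetical: the two states ("D" for domain, "L" for linker), the two-letter alphabet ("h" for hydrophobic, "p" for polar) and all probabilities are invented for illustration and are not taken from the course assignments.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its log probability).

    Assumes obs is non-empty and that every probability used is
    non-zero, so that math.log is always defined.
    """
    # best[t][s]: log probability of the best path ending in state s at time t.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states,
                       key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]))
            best[t][s] = (best[t - 1][prev]
                          + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][obs[t]]))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical toy model: "D" prefers hydrophobic residues, "L" polar ones.
STATES = ("D", "L")
START_P = {"D": 0.5, "L": 0.5}
TRANS_P = {"D": {"D": 0.9, "L": 0.1}, "L": {"D": 0.1, "L": 0.9}}
EMIT_P = {"D": {"h": 0.9, "p": 0.1}, "L": {"h": 0.1, "p": 0.9}}

if __name__ == "__main__":
    path, logp = viterbi("hhhppp", STATES, START_P, TRANS_P, EMIT_P)
    print("".join(path), logp)
```

Working with log probabilities avoids numerical underflow on long sequences, at the cost of requiring non-zero parameters (or explicit handling of zeros, which this sketch omits).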

Teaching methods

13 Lectures: 2 two-hour lectures per week
13 Computer practicals and associated assignments: 2 two-hour hands-on
sessions per week

Assessment

The final grade for this course will consist of 50% practical work (see
above) and 50% theoretical assessment.
The theoretical assessment will be an oral and/or written exam
(depending on the number of students).

Required prior knowledge

A Bachelor's degree in any science discipline (including medicine).
Basic programming skills (Python) and an interest in biological
problems.

Literature

Course material on bb.vu.nl
Books: Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G. Biological
Sequence Analysis. Cambridge University Press, 1998, 350 pp., ISBN
0521629713.
Recommended reading: Zvelebil, M., Baum, J.O. Understanding
Bioinformatics. Garland Science, 2008, ISBN-10: 0-8153-4024-9.

Target audience

mAI, mBio, mCS

Additional information

BYOD policy (Bring Your Own Device)
We expect students in this course to use their own laptop. This laptop
should at the very least support an SSH client, for remote shell access
to the VU Linux servers. Ideally, this laptop supports a command line
shell, Python 3 and a text editor with syntax highlighting -- either
standalone (e.g. Atom or Sublime Text) or as part of a simple IDE (e.g.
Spyder). As such, we recommend the Anaconda Python distribution
regardless of operating system, along with PuTTY or PowerShell for
Windows users specifically.

If you are considering purchasing new hardware, we recommend the
following:
- Processor: Intel i5 / AMD Ryzen 5 or above
- Memory: At least 4GB RAM
- Storage: At least 512GB hard disk space
- Operating System: Ubuntu 16.04

The course is taught in English.

General information

Course code X_405050
Credits 6 EC
Period P2
Course level 600
Language of instruction English
Faculty Faculteit der Bètawetenschappen (Faculty of Science)
Course coordinator prof. dr. J. Heringa
Examiner prof. dr. J. Heringa
Lecturers prof. dr. J. Heringa

Practical information

You must register for this course yourself.

Last-minute registration is available for this course.

Teaching formats Lecture, Computer practical
Target groups

This course is also available as: