Algorithms in Sequence Analysis

This course is taught in English. Descriptions may therefore be available in English only.

Course objective

Have you ever wondered how we can track a gene across 3 billion years of
evolution? Sequence alignment can be used to compare genes from humans
and bacteria, using a dynamic programming algorithm. In this course we
focus on algorithms for biological sequences that can be applied to real
scientific problems in biology.
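
As a rough illustration of the dynamic programming idea mentioned above, the sketch below computes a global alignment in the style of the Needleman-Wunsch algorithm. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for this example, not the scheme used in the course.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    The scoring parameters are illustrative assumptions.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table row by row from the three possible predecessors.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The table has (len(a) + 1) x (len(b) + 1) cells, so both time and memory grow with the product of the two sequence lengths.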

Students will obtain in-depth knowledge about the theory of sequence
analysis methods. They will also develop understanding and skills to
apply the algorithms to protein and DNA sequences. We would like to
stress that no biological knowledge is required to enter this course.

- At the end of the course, the student will be aware of the major
issues, methodology and available algorithms in sequence analysis.
- At the end of the course, the student will have hands-on experience in
tackling biological problems using sequence analysis algorithms and
applying the general statistical framework of Hidden Markov Models.
- At the end of the course, the student will be able to implement
several of the most important algorithms in sequence analysis.

Course content

- Dynamic programming, database searching, pairwise and multiple
alignment, probabilistic methods including hidden Markov models, pattern
matching, entropy measures, evolutionary models, and phylogeny.
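
To give a flavour of the entropy measures listed above, here is a minimal sketch that computes the Shannon entropy (in bits) of a single alignment column. The function name and the toy columns are assumptions made for illustration; low entropy indicates a conserved column, high entropy a variable one.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # Sum p * log2(1/p) over the observed symbols only.
    return sum(
        (c / total) * math.log2(total / c) for c in counts.values()
    )


if __name__ == "__main__":
    for col in ("AAAA", "AACC", "ACGT"):
        print(col, column_entropy(col))
```

This sketch treats every character, including any gap symbol, as an ordinary symbol; real analyses often handle gaps and small-sample effects separately.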

- Programming (in Python) an alignment algorithm based on dynamic
programming
- Aligning sequencing data from tumors to the human genome and analysing
structural variants
- Programming (in Python) an implementation of Hidden Markov Models and
using it to predict protein domain structure
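
As a hedged illustration of the kind of Hidden Markov Model code the practical work involves, the sketch below implements Viterbi decoding in log space. The two-state model ("D" for domain, "L" for linker), the two-letter alphabet ("H" for hydrophobic, "P" for polar) and all probabilities are hypothetical toy values, not the model used in the course.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for the observations.

    Works in log space to avoid underflow. Assumes a non-empty
    observation sequence and that every probability used is non-zero.
    """
    # best[t][s] = log-probability of the best path ending in s at time t
    best = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    back = [{}]  # back[t][s] = predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(
                states,
                key=lambda p: best[t - 1][p] + math.log(trans_p[p][s]),
            )
            best[t][s] = (
                best[t - 1][prev]
                + math.log(trans_p[prev][s])
                + math.log(emit_p[s][observations[t]])
            )
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: all numbers are made up for illustration.
STATES = ("D", "L")
START = {"D": 0.5, "L": 0.5}
TRANS = {"D": {"D": 0.9, "L": 0.1}, "L": {"D": 0.1, "L": 0.9}}
EMIT = {"D": {"H": 0.8, "P": 0.2}, "L": {"H": 0.2, "P": 0.8}}

if __name__ == "__main__":
    print("".join(viterbi("HHHPPP", STATES, START, TRANS, EMIT)))
```

The running time is proportional to the sequence length times the square of the number of states.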


Teaching format

13 lectures: two 2-hour lectures per week
13 computer practicals with associated assignments: two 2-hour hands-on
sessions per week


Assessment

The final grade for this course consists of 50% practical work (see
above) and 50% theoretical assessment.
The theoretical assessment is an oral and/or written exam, depending on
the number of students.

Required prior knowledge

Bachelor in any science discipline (including medicine).
Basic programming skills (Python) and an interest in biological


Literature

Course material on
Books: Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G. Biological
Sequence Analysis. Cambridge University Press, 1998, 350 pp., ISBN
Recommended reading: Zvelebil, M., Baum, J.O. Understanding
Bioinformatics. Garland Science, 2008, ISBN-10: 0-8153-4024-9


Target audience

mAI, mBio, mCS

Additional information

BYOD policy (Bring Your Own Device)
We expect students in this course to use their own laptop. This laptop
should at the very least support an SSH client, for remote shell access
to the VU Linux servers. Ideally, this laptop supports a command line
shell, Python 3 and a text editor with syntax highlighting -- either
standalone (e.g. Atom or Sublime Text) or as part of a simple IDE (e.g.
Spyder). We therefore recommend the Anaconda Python distribution
regardless of operating system, along with PuTTY or PowerShell for
Windows users specifically.

If you are considering purchasing new hardware, we recommend the
following:
- Processor: Intel i5 / AMD Ryzen 5 or above
- Memory: at least 4 GB RAM
- Storage: at least 512 GB hard disk space
- Operating system: Ubuntu 16.04

The course is taught in English.

General information

Course code: X_405050
Credits: 6 EC
Period: P2
Course level: 600
Language of instruction: English
Faculty: Faculty of Science
Course coordinator: prof. dr. J. Heringa
Examiner: prof. dr. J. Heringa
Lecturers: prof. dr. J. Heringa

Practical information

You need to register for this course yourself.

Last-minute registration is available for this course.

Teaching methods: Lecture, Computer practical

This course is also available as: