Yufeng Wu

ywu@engr.uconn.edu

Please read this carefully, since I will not cover much of it in class.

This course covers important algorithmic results in bioinformatics

and computational biology. I will cover both established topics

(e.g., exact string matching, sequence alignment, and phylogeny) and

the latest developments in, e.g., population genetics and systems biology.

The goal is to present an overall picture of algorithmic aspects

of bioinformatics and inspire students to pursue research in this

fast-developing field.

This course is lecture-based. Homework will be assigned on the major

subjects covered in the class. Students are required to read and

present a research paper in algorithmic bioinformatics. There will

be a (possibly take-home) final exam. There is no required programming,

although some optional programming projects are possible.

In particular, the planned subjects are:

1) Exact string matching. New developments in suffix trees and suffix arrays.

We will review the basics of suffix trees and arrays, and

then look at some recent work on suffix arrays and

their use in bioinformatics.

2) Sequence analysis. This includes space-efficient pairwise alignment,

multiple sequence alignment with provable properties, ideas behind the

popular bioinformatics tool BLAST, and the latest developments in improving

BLAST.

3) Phylogeny. Various classical phylogenetic methods: ultrametric trees,

additive trees, and perfect phylogeny. On perfect phylogeny, we will

cover both classic binary perfect phylogeny and multi-state perfect

phylogeny. We will also cover phylogenetic methods that are widely used

today, including parsimony, the neighbor-joining algorithm, and maximum

likelihood (if time permits).

4) Genome rearrangement. We will sample some of the seminal results

that have been obtained on this interesting subject.

5) Population genetics. This includes haplotype inference and reconstruction of

networks with recombination. These topics are what I am currently working on.

6) Other subjects, including biological networks and related

algorithmic problems, gene regulation, and structural bioinformatics.

Prerequisites. Essentially no biology background is assumed.

The most relevant background is a graduate course on algorithms,

but a serious student who has only had an undergraduate algorithms course,

or a smart, mathematically mature student who has had neither, might also

be able to follow the course.

Textbook. No textbook is required. The following books are useful for this course.

1. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology

by Dan Gusfield, 1997. An excellent introduction for people with a computer science background.

2. Inferring Phylogenies by Joseph Felsenstein, 2003. A nice general treatment of phylogenetics.

3. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids

by Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison, 1999.

This book is widely used as the textbook for teaching statistical aspects of bioinformatics.

We will use it from time to time.

The following book might also be of interest.

1. Phylogenetics (Oxford Lecture Series in Mathematics and Its Applications, 24)

by Charles Semple and Mike Steel, 2003.

Contains interesting material on phylogenetics. Very mathematical.

be accepted. Please acknowledge the source of any ideas. You may share

ideas with someone else as long as you acknowledge them.

If you work with one or more people on a writeup, then

you should turn in a single writeup.

biology and bioinformatics, understand it and then write a short (perhaps 2-4 pages)

document on it. Your goal is not to repeat what the author(s) said. Instead,

I would like to see some interesting or semi-interesting ideas (observations, extensions,

etc.). If you prefer, you can also do a small research project of your own on anything

you find interesting in computational biology. In either case, you have the freedom

to choose the subject, but make sure to email me to get approval for the paper or subject.

Exams. There are no in-class exams for this course. There will be a take-home final

exam, which is more like a comprehensive homework problem set. There will also be a

25-minute discussion with each of you in my office. The subject is likely to be the project you did.

Do not worry: this is not an exam, just a chance for me to see what students have learned

in the course and what interests them.

Grading. This is a non-required graduate course. I expect that you registered for it because

you are interested in it and want to learn something about bioinformatics.

I am required to assign a grade. The grade will be based on

homework, the project report and discussion, and the take-home final.