Yufeng Wu

ywu@engr.uconn.edu

Please read this carefully, since I will not cover much of it in class.

This course covers important techniques in bioinformatics.

The goal is to present an overall picture of bioinformatics and

inspire students to pursue research in this fast-developing field.

Outline. This course is lecture-based. Homework will be assigned on the major

subjects covered in class. There will be a (possibly take-home) final exam.

In addition, each student will work on individual programming projects.

In particular, the planned subjects are:

1) Exact string matching. New developments in suffix trees and suffix arrays.

We will review the basics of suffix trees and arrays, and

then look at some recent work that uses string matching in bioinformatics.

2) Sequence analysis. This includes space-efficient pairwise alignment and

multiple sequence alignment.

3) Hidden Markov models (HMM). Concepts, algorithms and applications of

HMM in bioinformatics.

4) Phylogeny. Various classical phylogenetic methods. We will also cover

the phylogenetic methods in wide use today, including parsimony,

the neighbor-joining algorithm and maximum

likelihood. We will also cover the perfect phylogeny problem.

5) Other topics, which may include genome rearrangement, biological networks and related

algorithmic problems, gene regulation, and structural bioinformatics.

Prerequisites. As for background, essentially no biology is assumed.

The most relevant background is some knowledge of probability and the ability to write programs.

Also, an undergraduate course on algorithms will help, but is not required.

Textbook. The required textbook is:

Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids

by Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison, 1999.

This book is widely used as a textbook for teaching the statistical aspects of bioinformatics.

We will use it from time to time. I also recommend the following book:

Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology

by Dan Gusfield, 1997. This is an excellent introduction for people with a computer science background.

We will cover topics related to both books.

be accepted. Please acknowledge the source of any ideas. You may share

ideas with someone else as long as you acknowledge them.

If you work with one or more people on a writeup, then

you should turn in a single writeup together.

You will work on your project alone, without a partner. The projects will be related to what

is presented in class and will help you get a feeling for what it is like to develop

a bioinformatics software tool. There is no restriction on the programming language you use.

But keep in mind that the performance of the programs matters.

Graduate Students. Graduate students are required to do extra

work for this course. Each graduate student will need to read a research paper in bioinformatics,

understand it and then write a short (perhaps 2-4 pages) document on it.

Remember that your goal is not to repeat what the author(s) said. Instead,

I would like to see some interesting or semi-interesting ideas (observations, extensions,

etc.). If you prefer, you can also do a small research project on your own about anything

you find interesting in bioinformatics. In either case, you have the freedom

to choose the subject, but make sure to email me to get approval for your paper or subject.

Exams. There is no midterm exam for this course. There will be a final

exam, which may be a take-home exam. For graduate students,

there will also be a 25-minute discussion with me in my office. The subject is likely to be the project you did.

Do not worry: this is not an exam, just a chance for me to see what students have learned

in the course and what their interests are.

Grading. This is a non-required course. I expect that you registered for it because

you are interested in it and want to learn something about bioinformatics.

I am required to assign a grade. The grade will be based on:

homework, project report and discussion (for graduate students), and the final exam.