Yufeng Wu

235 ITEB, ywu@engr.uconn.edu

Office hours: ITEB 235, Wednesday 10 am to 12 pm and 2 to 4:30 pm, or by appointment.

This course covers selected research topics in bioinformatics and computational
biology. I will focus mainly on recent developments in bioinformatics algorithms:
string algorithms and their application in next-generation sequencing data analysis,
coalescent theory and population genetics, and complex evolutionary models.
The goal is to present some state-of-the-art computational aspects of bioinformatics
and to inspire students to pursue research in this fast-developing field.

This course is lecture-based. Students are required to read and present a research
subject in algorithmic bioinformatics. Each student should also perform an empirical
study by implementing some bioinformatics algorithms.

In particular, the planned subjects are:

1) String matching algorithms and applications in next-generation sequencing
data analysis. New developments in suffix trees and suffix arrays. Burrows-Wheeler
transform. Genome assembly. Read mapping. Structural variation detection.

2) Coalescent theory. Basic models in coalescent theory. Probabilistic computation
on coalescent models. Applications of coalescent theory in genetics.

3) Recombination. Recombination models. Lower bounds. Ancestral recombination graphs
and related algorithms. Other recombination models.

4) Complex evolutionary models. Subtree prune and regraft and related algorithms.
Phylogenetic network models.
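
As a small taste of subject 1, the Burrows-Wheeler transform can be sketched in a
few lines. The following is a minimal Python illustration only: the function names
`bwt` and `inverse_bwt` and the `$` sentinel are assumptions chosen for this example,
and the sorted-rotations construction is the naive textbook approach, not the
efficient suffix-array-based methods used by real read mappers.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Return the Burrows-Wheeler transform of text using sorted rotations.

    The sentinel is assumed to be absent from the text and to sort before
    every other character (true for "$" against letters in ASCII).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Recover the original text by repeatedly prepending and sorting (naive)."""
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        # Prepend the transform as a new first column, then re-sort the rows.
        table = sorted(transformed[i] + table[i] for i in range(n))
    # The row ending in the sentinel is the original text plus the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The transform tends to group identical characters together, which is what makes it
useful both for compression and for the compact indexes behind read mapping.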

Prerequisites. Essentially no biology background is assumed. The most relevant
background is a graduate course on algorithms, but a serious student who has only had
an undergraduate algorithms course, or a smart, mathematically mature student who has
had neither, might also be able to follow the course.

Textbook: No textbook is required. The following books are useful for this course.

1. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology,
by Dan Gusfield, 1997. An excellent introduction for people with a computer science background.

2. Coalescent Theory: An Introduction, by John Wakeley, 2008. This is a good introduction
to coalescent theory. I have asked the library to place this book on reserve.

Homework. I have not yet decided whether to assign written homework. The current
plan is that each student will write up lecture notes for the papers and topics presented
in lectures. In addition, each student needs to work on one problem related to the lectures
(mostly a matter of reading papers related to what is taught in class).

Presentation. Each student needs to select a particular subject in algorithmic bioinformatics
to present to fellow students. The student should contact the instructor about the subject first.
I prefer that the presentation provide some general background and also cover some interesting
technical aspects.

Projects. Each student should do an empirical study. Often this means the student needs to
implement a bioinformatics algorithm and test its performance. I prefer that each student
work on his/her own project, but exceptions can be made. Alternatively, a student can
choose to conduct a more theoretical investigation (e.g., designing a faster algorithm for
some bioinformatics problem). Again, each project must first be approved by the instructor.

Exams. I do not plan to give exams in this course, although this may change depending on how the course progresses.

Grading. Grades will be based on: lecture notes (20%), lecture problems (15%), project (40%),
and presentation (25%).