CSE 5095:  Research Topics in Bioinformatics
Spring 2012

Instructor: Yufeng Wu

Lecture: Tuesday and Thursday 11:00 am-12:15 pm, ITE 119.

Office Hours: ITE 235, Wednesday 9:30-12:00 and 2:00-4:30, or by appointment.
Note: some materials will be posted on HuskyCT.


Course Description. See the Syllabus.

Problems related to lectures
Schedule. Planned schedule is here, but this is what is really happening:

4/25: Student presentation
X. Wang: Phylogenetic network
Yu Wu: Compressed suffix array

4/23: Student presentation
A. Thibodeau: RNA-seq
N. Tran: Genome assembly

The paper on phylogenetic network based on clusters.
The paper on compressed suffix array: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
RNA-seq notes.
4/19: Student presentation
T. Xu: Genome assembly
A.  Al-OKaily: phylogenetic networks

4/17: Student presentation
J. Lindsay: scaffolding for genome assembly
K. Marinelli: I/O issues in computing

Genome assembly using GPU.
A book on phylogenetic network

A scaffolding paper (from RECOMB 2011).
Notes by Kevin on I/O issues.
4/12: No class.

4/10: Student presentation
Sequencing applications in metagenomics (J. Zhang)
Topics in DNA forensics (P. Ghimire)

A paper on metagenomic assembly.

The paper presented in class about match probability in forensic DNA.
4/5: Phylogenetic networks.
Notes by A. Thibodeau

4/3: Subtree prune and regraft algorithms.

This is my own paper about phylogenetic networks.
Read this book for more background and related types of phylogenetic networks.

This is the paper for the fixed parameter tractable algorithm for SPR.
3/29: Subtree prune and regraft.
Notes by A. Thibodeau

3/27:  Algorithms for building ARGs. Introduction to subtree prune and regraft.

This is my own paper about subtree prune and regraft.

This is the paper about tree-based ARG construction.

3/22: Algorithms for building ancestral recombination graph.
Notes by K. Marinelli

3/20: Ancestral recombination graph. The history bound.
Notes by N. Tran.
Notes by Gusfield on ancestral recombination graph.
This is the paper that first developed the history bound.
3/8: Lower bound on recombination.

3/6: Recombination: coalescent view and lower bound.
The haplotype bound paper talked about in class.
Notes by Gusfield on recombination lower bounds and ancestral recombination graphs.

Here is a paper on coalescent ancestral configurations in recombination analysis.
3/1: Perfect phylogeny. Introduction to recombination.

2/28: A little more about DNA forensics. Perfect phylogeny problem.

Notes by X. Wang.
Here is a chapter written by Gusfield about recombination. It provides a more accessible introduction to recombination for non-biologists.

The paper on perfect phylogeny linear time construction by Gusfield.

2/23: More probability computation on coalescent theory. A little about DNA forensics.

2/21: Probability computation with coalescent theory

Much of what is discussed in class comes from Tavare's writing.
Paper by Griffiths and Lessard on Ewens' sampling formula.

Some info about DNA forensics.
An article on DNA forensics, which mentions a problem discussed in class.
2/16: Coalescent theory
Notes by J. Zhang

2/14: Genome assembly with paired reads. Introduction to probability theory

See Wakeley's book for a (very accessible) introduction to coalescent theory.
Another good reference on coalescent theory (although a little harder to read) is Tavare's writing.

Two papers (paper 1 and paper 2) on genome assembly with paired reads.
2/9: Genome assembly (cont.)

2/7: Genome assembly: introduction

Notes on genome assembly (by T. Xu).
Pevzner et al.'s paper using Eulerian paths.

A large part of the genome assembly introduction comes from Gusfield's book.

2/2: More BWT-based reads mapping. Calling genetic variants with sequence reads.

1/31: Short reads mapping (cont.)

Notes on reads mapping (by Yu Wu).
The SNP calling paper discussed in class.

The BWT-based reads mapping: the BWA paper
Compressed suffix array: Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching
Another paper on compressed suffix array:
1/26: Short sequence reads mapping

1/24: Pattern matching with BWT

Notes (by P. Ghimire) on BWT.
The short reads mapping paper is: the MAQ paper

The main paper covered on BWT pattern matching is: Opportunistic data structures with applications (FOCS 2000)
1/20: Burrows-Wheeler Transform

1/18: Introduction to string matching.

Notes (by Y. Wu) on string algorithm introduction
There are many online references on BWT. There are also books on BWT (e.g. link).

Some more notes on suffix trees and suffix arrays:
An introduction to suffix trees by Dan Gusfield (PDF). A simple introduction to suffix arrays (link).
If you want, you can also read the original paper on the linear-time suffix array algorithm (PDF).