CSE 5860:  Computational Problems in Evolutionary Genomics, Spring 2014

Instructor: Yufeng Wu

Lecture: Tuesday/Thursday 11:00 am-12:15 pm.

Office Hours: ITE 235, Wednesday 9:30-12:00 and 2:00-4:30, or by appointment.
Note: some materials may be posted on HuskyCT.


Course Description. See the Syllabus.


Th: Exam.

Tu: Student presentations.
Papers presented: inference of demographic history.
Th: Student presentations.

Tu: Review, plus more applications and extensions.
Papers presented: estimating heterozygosity from sequences; inference of population structure from haplotypes; analysis of incomplete lineage sorting and hybridization; MCMC for coalescent likelihood computation; haplotype matching.
Th: Inference of population demographic history.

Tu: Recombination: the SMC model.

The paper by Li and Durbin, 2011.

The paper on SMC by McVean and Cardin.
Th: Algorithms for building ARGs.

Tu: Recombination: galled tree. Heuristics.
My paper on self-derivability and counting of minARGs.
Two galled tree papers: paper 1 and paper 2.
The heuristics for constructing ARGs are from the paper by Song, Wu and Gusfield.
Th: Discussion of projects.

Tu: Recombination: ancestral recombination graph and history bound.

The history bound is from the paper by Myers and Griffiths, and was improved by Bafna and Bansal.
Th: PPH: perfect phylogeny haplotyping

Tu: Haplotype inference.

The original PPH paper by Gusfield.

The PHASE paper.
Th: Lower bounds on the minimum number of recombinations.

Tu: Coalescent with recombination.

The haplotype bound paper by Myers and Griffiths, discussed in class.
The paper on the improved haplotype bound by Song, Wu and Gusfield.

Here is a chapter written by Gusfield about recombination. It provides a more accessible introduction to recombination for non-biologists.
Here is a paper on coalescent ancestral configurations in recombination analysis.

Assignment: HW3 is out. Due: 3/27 in HuskyCT.
Th: Markov chain Monte Carlo. Gusfield's algorithm on perfect phylogeny.

Tu: Monte Carlo methods
Gusfield's paper on perfect phylogeny.

Wakeley's book: Ch. 8.4. Here is a paper on importance sampling.
Th: The Griffiths-Tavare method. Importance sampling.

Tu: Infinite sites model. The Griffiths-Tavare method.
Wakeley's book, Ch. 8.4.
Importance sampling is a large topic in Monte Carlo methods. See, for example, this book chapter.
Th: Ewens sampling formula (ESF)

Tu: Probability on the coalescent: expected site frequency spectrum (SFS); number of allele types under the infinite alleles model.
Wakeley's book, Ch. 4.1 and 4.2.
The paper on the proof of the ESF.

Assignment: HW2 is out. Due: 3/3 in HuskyCT.
Th: no class (snow day)

Tu: Coalescent and statistics for polymorphisms.
Wakeley's book, Ch. 4.1. Also, see Wakeley's book Chapter 3.
Th: Basic probability for coalescent.

Tu: Varying population size.
Variable population size is covered in Tavare's book: up to page 24.
Basic probability: Chapter 2 of Wakeley's book (on reserve in the library). You really should read this chapter if you are not confident about probability.
Th: Basic coalescent; properties.

Tu: Diffusion approximation of the Wright-Fisher model.

Read Tavare's book: up to page 24.
It does not cover much of the diffusion process, so refer to this book chapter on diffusion in the Wright-Fisher model. I find it more accessible than several of the alternatives. Not surprisingly, it is mathematical, but you will understand more if you can work through it.

Assignment: HW1 is out. Due: 2/13 in HuskyCT.
Th: Wright-Fisher model. Probability

Tu: Introduction to population genetics.

Basic concepts and models.
Gillespie's book is now on reserve in the library.
Prof. Holsinger at UConn has lecture notes on population genetics.