CSE 5860: Computational Problems in Evolutionary Genomics - Spring 2014

Yufeng Wu
235 ITEB, ywu@engr.uconn.edu
Office hours:
ITEB 235, Wednesday 9-12 pm or by appointment.

This course covers selected computational and mathematical topics in evolutionary genomics.
I will mainly focus on problems arising in population genomics: basic concepts of population genetics,
coalescent theory, and complex evolutionary processes in population
genetics. The emphasis is on the computational aspects of population genomics. The
goal of the course is to introduce students to the current state of computational population
genomics and to inspire students to pursue research in this fast-developing field.

This course is lecture-based. Students are required to read and present a research paper
(or papers). Each student should also carry out an empirical study by implementing
computational approaches for population genomics problems and/or analyzing
real population genomics data (e.g. the 1000 Genomes Project data).

In particular, the planned subjects are:

1) Introduction to Population Genetics. This includes basic concepts of population
genetics (mutation, genetic drift, gene flow and selection). 

2) Coalescent theory. Basic models in coalescent theory. Diffusion theory.
Probabilistic computation on coalescent models. Applications of coalescent theory.

3) Recombination. Recombination models. Lower bounds. Ancestral recombination graphs
and related algorithms. Other recombination models.

4) Complex evolutionary models. These may include incomplete lineage sorting and
reticulate evolution.

5) Other topics may include Monte Carlo methods, pedigree analysis, and applications of
population genetics in forensics.

Prerequisites. Essentially no biology background is assumed. The most relevant
background is upper-division or graduate courses in algorithms and in probability and statistics.
A smart, mathematically mature student who has had neither might also be able to
follow the course.

Textbook: No textbook is required. I will post links to papers and other documents.
The following books are useful for this course.

1. Coalescent Theory: An Introduction, by John Wakeley, 2008. This is a good introduction
to coalescent theory. I have asked the library to place this book on reserve.

2. Ancestral Inference in Population Genetics, by Simon Tavare. This document used to
be available online but no longer seems to be. If I find it, I will post a link.

Homework. I will occasionally assign written homework.

Presentation. Each student needs to select a particular subject in population genomics
to present to the class. The student should contact the instructor about the subject first.
I prefer that the presentation provide some general background and also cover some interesting
technical aspects. Student presentations will take place during the later part of the course.

Projects. Each student should carry out an empirical study. Often this means implementing
a computational approach for a selected population genomics problem
and then testing its performance. Alternatively, a student can analyze real population genetics
data to perform population genetics inference. I prefer that each student work on his/her
own project, but exceptions can be made. A student can also choose to conduct
a more theoretical investigation (e.g. designing a faster algorithm for some meaningful problem
in population genomics).
Note that each project must first be approved by the instructor.

Exams. I do not plan to give sit-down exams in this course, although this may change depending
on how the course progresses. The current plan is that I will give a take-home final exam.
Also, at the end of the course I plan to meet with each student in class to discuss what
they learned from the course.

Grading. The grade will be assigned as follows: homework (15%), project (30%), paper presentation (15%),
final exam (25%), and discussion (15%).