BME 4800 and CSE 3800/5800: Bioinformatics - Fall 2017

Yufeng Wu

Please read this syllabus carefully; I will not cover much of it in class.

This course covers important techniques in bioinformatics.
The goal is to present an overall picture of bioinformatics and
inspire students to pursue research in this fast-developing field.

Outline. This course is lecture-based. Homework will be assigned on the major
subjects covered in class.
There will be a final exam.
In addition, each student will complete programming assignments and a course project.
Students enrolled in CSE 5800 will need to complete additional coursework.

In particular, the planned subjects are:

1) Sequence analysis. This includes space-efficient pairwise alignment and
multiple sequence alignment.

2) Introduction to probabilistic and statistical inference. Hidden Markov models (HMMs):
concepts, algorithms, and applications in bioinformatics.

3) Phylogeny. We will cover classical phylogenetic methods as well as
the methods in wide use today, including parsimony,
neighbor joining, and maximum likelihood.

4) Other topics, which may include genome rearrangement and structural bioinformatics.

Prerequisites. Essentially no biology background is assumed.
The most relevant background is some knowledge of probability and the ability to write programs.
An undergraduate course on algorithms will also be useful. I expect students
to have basic skills in algorithm design and analysis. The most frequently used
algorithm design technique will be dynamic programming. If you have not learned
dynamic programming yet, you should make an effort to understand it.
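
For students who have not seen dynamic programming before, the following is a minimal Python sketch of the idea applied to global pairwise alignment scoring. It is an illustration only, not course material: the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for the example. Each table entry is computed from three previously computed neighbors, and only two rows are kept in memory at a time.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of strings a and b.

    The scoring values are illustrative assumptions, not values
    prescribed by the course.
    """
    # prev[j] = best score for aligning the first i-1 characters of a
    # with the first j characters of b (initially i-1 = 0: all gaps).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # curr[0]: the first i characters of a aligned against nothing.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)  # align a[i-1] with b[j-1]
            up = prev[j] + gap        # a[i-1] aligned with a gap
            left = curr[j - 1] + gap  # b[j-1] aligned with a gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[len(b)]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

The running time is proportional to len(a) * len(b), while the memory use is proportional to len(b) only, because each row depends just on the row before it. This version returns the score but not the alignment itself.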

Textbook. The required textbook is:
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
by Richard Durbin, Sean R. Eddy, Anders Krogh, and Graeme Mitchison, 1999.
This book is a bit old but still very useful for learning bioinformatics.
I also recommend the following book:
Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology
by Dan Gusfield, 1997.
This is an excellent introduction for people with a computer science background.
We will cover some topics taken from this book.

Homeworks. I would appreciate it if you typeset your homework. Late homework will not
be accepted. Please acknowledge the source of any ideas. You may share
ideas with someone else as long as you acknowledge them.
If you work with one or more other people on a writeup, then
you should turn in a single joint writeup.

Programming assignment. The programming assignments are meant to give you hands-on experience
in developing bioinformatics software tools. In my opinion, this is what a bioinformatics
course offered in a computer science department should provide. Each student will work on one or several
programming assignments individually; there are no partners for these assignments.
The assignments will be related to what is presented in class and will give you a feel for
what it is like to develop a bioinformatics software tool.
There is no restriction on the programming language you use,
but keep in mind that the performance of your programs matters.
The programming assignments
may be different for undergraduate and graduate students.

Project. Each student will need to do a course project in bioinformatics.
Undergraduate students can work in small groups, while each graduate student is expected to do an individual project.
Later in the course, I will suggest some possible projects; students can also come up with their own ideas.
Each project consists of the following: (1) proposal (to be approved by the instructor), (2) survey of the problem,
(3) status report, (4) presentation, and (5) written report.

Students enrolled in CSE 5800. Students in CSE 5800 are required to do extra
work for this course. There will likely be additional homework problems for 5800 students only.

Exams. There is no midterm exam for this course. There will be a final exam.

Grading. This is a non-required course. I expect that you are taking it because
you are interested in it and want to learn something about bioinformatics.
I am nevertheless required to assign a grade. The grade will be based on:
written homework (20%), programming assignments (25%), course project
(30%), and final exam (25%).
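
As a worked illustration of how these weights combine, here is a short Python sketch. It assumes each component is scored on a 0-100 scale, which the syllabus does not state; the component names and the function are hypothetical, and the mapping from the weighted total to a letter grade is not specified here.

```python
# Weights in percent, taken from the grading breakdown above.
WEIGHTS = {"homework": 20, "programming": 25, "project": 30, "final": 25}


def weighted_total(scores):
    """Combine per-component scores (assumed 0-100) into one 0-100 total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {"homework": 90, "programming": 80, "project": 70, "final": 60}
    print(weighted_total(example))
```

The weights sum to 100, so a student scoring 100 on every component would receive a total of 100.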