Course: Algorithms for Biological Sequence Analysis
Fall semester, 2006
Tuesday 10:20
– 13:10,
111 CSIE Building.
3 credits
Web site:
http://www.csie.ntu.edu.tw/~kmchao/seq06fall
Instructor:
Kun-Mao Chao
(趙坤茂)
Teaching assistant: Hung-Lung Wang (王弘倫) email: f92085@csie.ntu.edu.tw
Prerequisites: Some basic knowledge on algorithm development and program design is
required. Background in bioinformatics and computational biology is welcome but
not required for taking this course.
Coursework:
Homework assignments and Class
participation (15%)
Two midterm exams (30%
each): November 7, 2006 and December 19, 2006
(tentatively)
Midterm #1 (Nov. 7, 2006)
Oral presentation of selected papers (25%)
Our classmates: I
II III
IV V
Class PowerPoint slides:
First Class (Sept. 19, 2006)
Dynamic Programming -- a Quick Review (Sept. 26, 2006)
Sequence Alignment (I,
Smith-Waterman) (Sept.
26 and Oct. 3, 2006)
Sequence Alignment (II, Various scoring
scheme) (Oct. 17 and
Oct. 24,
2006)
Suboptimal Alignment (Delta points)
(Oct. 31, 2006)
Sequence_Alignment_(III, FASTA &
BLAST) (Nov. 14, 2006)
Heaviest Segments in a Number
Sequence (Dec. 5, 2006)
RMSQ (Dec. 5, 2006)
SNP (Dec. 12, 2006)
Supporting materials: (Read 1~7 for the first midterm exam)
-
Basic Concepts of DNA, Proteins, Genes and Genomes --
Kun-Mao Chao
-
Dynamic Programming -- a Quick Review --Kun-Mao Chao
-
Sequence Alignment --Kun-Mao Chao
-
A Note on the Scoring Schemes -- Kun-Mao Chao
-
Chao, K. -M., Pearson, W. R.
and Miller, W. , 1992,
Aligning Two Sequences within a Specified Diagonal Band, Computer
Applications in the Biosciences (CABIOS, now Bioinformatics), 8:
481-487
-
Chao, K. -M., 1994,
Computing All Suboptimal Alignments in Linear Space, Combinatorial
Pattern Matching '94, Lecture Notes in Computer Science 807, 31-42,
California, USA.
-
Chao, K. -M., Hardison R. C.
and Miller, W. , 1994,
Recent Developments in Linear-Space Alignment Methods: a Survey, Journal
of Computational Biology, 1: 271-291.
-
Heaviest Segments in a Number Sequence
--Kun-Mao Chao
-
RMSQ -- Kuan-Yu Chen and Kun-Mao Chao
-
Huang, Y.-T., Zhang, K., Chen, T. and
Chao, K.-M., 2005, “Selecting
Additional Tag SNPs for Tolerating Missing Data in Genotyping,” BMC
Bioinformatics, 6: 263.
-
Huang, Y.-T., Chao, K.-M.,
and Chen, T., 2005, “An
Approximation Algorithm for Haplotype Inference by Maximum Parsimony,”
Journal of Computational Biology, 12: 1261-1274.
-
Chang, C.-J., Huang, Y.-T., and Chao, K.-M., 2006,
“A
Greedier Approach for Finding Tag SNPs,” Bioinformatics,
22: 685-691.
-
The Human Project and Beyond
-
Basic Local Alignment Search Tool -- SF Altschul, W Gish, W Miller, EW
Myers, DJ Lipman
(J Mol Biol. 215(3):403-410, 1990)
-
Gapped BLAST
and PSI-BLAST -- SF Altschul, TL Madden, AA Schaffer, J Zhang, Z Zhang, W
Miller and DJ Lipman
(Nucleic Acids Research, 25(17): 3389-3402, 1997)
-
CLUSTAL W -- J D Thompson, D G Higgins, and T J Gibson
(Nucleic Acids Res. 22(22): 4673–4680, 1994)
Class presentations:
1.
Maximum number of team members:
6
2.
Each member is required to present in
turn;
3.
Revised slides should be sent to me one
week after the presentation; (Please compress your figures.)
4.
Questions in class are always welcome;
Selected papers for presentation:
- (November 21, 2006)
S.F. Altschul, T. L. Madden, A.A. Sch{\"{a}}ffer, J. Zhang,
Z. Zhang, W. Miller and D. Lipman, Gapped BLAST and PSI-BLAST:
a new generation of protein database search programs. Nucleic Acids
Research 25 (1997), 3389--3402.
Plus an introduction to NCBI.
李定達、曾文鴻、鄭為正、佘健生
Slides
- (November 28, 2006)
Bin Ma, John Tromp, Ming Li. PatternHunter: faster and more sensitive
homology search. Bioinformatics, 18(3):440-445. March 2002.
[
Abstract ] [
02ph.pdf,
310Kb ] [
BibTeX ] [
PubMed ]
Ming Li, Bin Ma, Derek Kisman, John Tromp.
PatternHunter II:
Highly Sensitive and Fast Homology Search. Journal of
Bioinformatics and Computational Biology, 2(3):417-439. 2004. Early
version in GIW 2003..
[
Abstract ] [
04ph2.ps,
341Kb ] [ Download from
publisher ] [
BibTeX ]
Derek Kisman, Ming Li, Bin Ma, Li Wang. tPatternHunter: gapped, fast and
sensitive translated homology search. Bioinformatics,
21(4):542-544. February 2005.
[
Abstract ] [
Download from publisher ] [
BibTeX ] [
PubMed ]
洪錫全、郭立翔、張智翔、鍾承宏、王凱平、莊謹譽
Slides
Suggested_problems
- (December 26, 2006)
W. James Kent
BLAT-The BLAST-Like Alignment Tool
Genome Res. 2002 12: 656-664. Published in Advance March 20,
2002, 10.1101/gr.229202. Article published online before March 2002
[Abstract]
[Full Text]
Scott Schwartz, W. James Kent, Arian Smit, Zheng Zhang, Robert Baertsch,
Ross C. Hardison, David Haussler, and Webb Miller
Human-Mouse Alignments with BLASTZ
Genome Res. 2003 13: 103-107.
[Abstract]
[Full Text]
楊伍隆、李孟韓、陳韋仰、王煜樟
Slides
- (January 2, 2007)
Alignment of Whole
Genomes (148K, PDF format) by A.L. Delcher, S. Kasif, R.D.
Fleischmann, J. Peterson, O. White, and S.L. Salzberg. Nucleic Acids
Research, 27:11 (1999), 2369-2376.
Fast Algorithms for Large-scale Genome Alignment and Comparison. A.L.
Delcher, A. Phillippy, J. Carlton, and S.L. Salzberg. Nucleic Acids
Research (2002), Vol. 30, No. 11 2478-2483
The MUMmer Homepage
游騰楷、曾俊雄、王慧芬、杜海倫
Slides
- (January 9, 2007)
Michael Brudno,
Chuong B. Do, Gregory M. Cooper, Michael F. Kim, Eugene Davydov, NISC
Comparative Sequencing Program, Eric D. Green, Arend Sidow, and Serafim
Batzoglou
LAGAN and
Multi-LAGAN: Efficient Tools for Large-Scale Multiple Alignment of Genomic DNA
Genome Res., Apr 2003; 13: 721 - 731 ; doi:10.1101/gr.926603
Michael Brudno, Alexander Poliakov, Asaf Salamov, Gregory M. Cooper, Arend
Sidow, Edward M. Rubin, Victor Solovyev, Serafim Batzoglou, and Inna Dubchak
Automated
Whole-Genome Multiple Alignment of Rat, Mouse, and Human
Genome Res., Apr 2004; 14: 685 - 692 ; doi:10.1101/gr.2067704
藍永倫、陳婕妤、文國煒、宗灝、戴邦炘
Slides
References (Recommended, but not required):
1. Algorithms on
Strings, Trees, and Sequences: Computer Science and Computational
Biology, by Dan Gusfield (1997)
2. Biological Sequence Analysis, by Richard Durbin et
al. (1998)
3. Computational Molecular Biology: An Algorithmic
Approach, by Pavel Pevzner (2000)
4. An Introduction to Bioinformatics Algorithms, by Neil C. Jones and
Pavel Pevzner (2004)
5. Related
journal and conference papers
Some smart guys in this field (biological
sequence analysis) you might wish to know:
Webb C. Miller
and Miller Lab.
Michael S. Waterman
Russell F. Doolittle
Pavel Pevzner
Eugene
W. Myers
David J. Lipman
Stephen F. Altschul
William R.
Pearson
Samuel
Karlin
Eric S. Lander
David
Sankoff
Gary
D. Stormo
Daniel M.
Gusfield
Warren
Gish
Vineet Bafna
Minoru Kanehisa
(金久 實)
Ming
Li (李明)
Tao
Jiang (姜濤)
Xiaoqiu
Huang (黃曉秋)
Louxin Zhang
(張洛欣)
Ting Chen
(陳梃)
… to be continued.
Kun-Mao Chao (趙坤茂; 不好意思
互相稱讚求進步~~~)
Useful links:
- NCBI
NCBI Education
-
Molecular Biology for Computer
Scientists by Lawrence Hunter
-
Developing
Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
- Sense
from Sequences: Stephen F. Altschul on Bettering BLAST , July/August 2000 (A story about BLAST)
-
Initial
sequencing and analysis of the human genome
(15 February 2001 Nature 409, 860 - 921
(2001)
-
The
Sequence of the Human Genome (Science 2001
February 16; 291: 1304-1351)
-
An
evaluation of the draft human genome sequence (Nature Genetics 29, 88 - 91 (01 Sep 2001) Letters)
-
Initial
sequencing and comparative analysis of the mouse genome (Nature 420, 520
- 562 (2002))
- The
UCSC Genome Browser
-
The
Chimpanzee Genome (Nature; Sept. 1, 2005)
The Chimpanzee Sequencing and Analysis
Consortium25,
"Initial
sequence of the chimpanzee genome and comparison with the human genome,"
Nature 437, 69-87 (1
September 2005) |
doi: 10.1038/nature04072
-
The HapMap Project (Nature; Oct. 27, 2005)
Please visit:
and in particular the following research articles:
Bibliography links:
-
PubMed
(PubMed, a
service of the National Library of Medicine, provides access to over 12 million MEDLINE citations back to the mid-1960's and additional life science journals. PubMed includes links to many sites providing full text articles and other
related resources.)
- Google Scholar
(Google Scholar provides a simple way to broadly search for scholarly
literature. From one place, you can search across many disciplines and
sources: peer-reviewed papers, theses, books, abstracts and articles, from
academic publishers, professional societies, preprint repositories,
universities and other scholarly organizations. Google Scholar helps you
identify the most relevant research across the world of scholarly research.)