Course: Algorithms for Biological Sequence Analysis

Fall semester, 2005

Wednesday 14:20 – 17:20, 111 CSIE Building.

3 credits

Web site: http://www.csie.ntu.edu.tw/~kmchao/seq05fall

Instructor: Kun-Mao Chao (趙坤茂)

Teaching assistant: Yao-Ting Huang (黃耀廷), email: d92023@csie.ntu.edu.tw

 

Prerequisites: Basic knowledge of algorithm development and program design is required. A background in bioinformatics or computational biology is welcome but not required.

 

Preliminary course outline

 

Coursework:

Programming assignments and class participation (30%)

Midterm exam (40%)

Final project (oral presentation of selected papers) (30%)

 

Our classmates I  II  III  IV V

 

Class PowerPoint slides:

    First Class (Sept. 21, 2005)

    Dynamic Programming (a quick review) (Sept. 28, 2005)

    Heaviest Segments in a Number Sequence (Sept. 28, 2005)

    RMSQ (Oct. 5, 2005)

    Sequence Alignment (Oct. 5/12/19, 2005)

    Linear-Space Alignment Methods (Oct. 19, 2005)

    Haplotypes and SNPs (Oct. 26, 2005)

    Delta Points (Nov. 2, 2005)

 

Supporting materials:

Midterm exam: (Nov. 9, 2005; in class)
    Problems

 

Programming assignment: (Given: Oct. 26, 2005; Due: Dec. 14, 2005)

    Maximum number of team members: 5

    Please refer to the Haplotypes and SNPs slides (Oct. 26, 2005).

    A sample LD result (the TA will explain this later)
    Online Demo (Dec. 21, 2005)

 

Class presentations:

1.      Maximum number of team members: ~5

2.      Each member is required to present in turn;

3.      Revised slides should be sent to me within one week after the presentation (please compress your figures);

4.      Every student is required to submit a brief note right after the presentation;

5.      Questions in class are always welcome.

The schedule of our class presentations:

  1. Bin Ma, John Tromp, Ming Li. PatternHunter: faster and more sensitive homology search. Bioinformatics, 18(3):440-445. March 2002.
    [ Abstract ] [ 02ph.pdf, 310Kb ] [ BibTeX ] [ PubMed ]

    Ming Li, Bin Ma, Derek Kisman, John Tromp. PatternHunter II: Highly Sensitive and Fast Homology Search. Journal of Bioinformatics and Computational Biology, 2(3):417-439. 2004. Early version in GIW 2003.
    [ Abstract ] [ 04ph2.ps, 341Kb ] [ Download from publisher ] [ BibTeX ]

    Derek Kisman, Ming Li, Bin Ma, Li Wang. tPatternHunter: gapped, fast and sensitive translated homology search. Bioinformatics, 21(4):542-544. February 2005.
    [ Abstract ] [ Download from publisher ] [ BibTeX ] [ PubMed ]
    Nov. 16, 2005 趙坤茂 陳奕先 (tPatternHunter) 林語君 (PatternHunter II)

    Fast Local Alignment Tools (Nov. 16, 2005)
    PatternHunter II (Nov. 16, 2005)
    tPatternHunter (Nov. 16, 2005)
     
  2. W. James Kent
    BLAT-The BLAST-Like Alignment Tool
    Genome Res. 2002 12: 656-664. Published in Advance March 20, 2002, 10.1101/gr.229202. Article published online before print in March 2002 [Abstract] [Full Text]

    Plus an introduction to The UCSC Genome Browser.
    Nov. 23, 2005 田知本、巨彥霖、陳任志、游岳齊

    BLAT (Nov. 23, 2005)
    UCSC Genome Browser (Nov. 23, 2005)
     
  3. Scott Schwartz, W. James Kent, Arian Smit, Zheng Zhang, Robert Baertsch, Ross C. Hardison, David Haussler, and Webb Miller
    Human-Mouse Alignments with BLASTZ
    Genome Res. 2003 13: 103-107. [Abstract] [Full Text]  

    Belinda Giardine, Cathy Riemer, Ross C. Hardison, Richard Burhans, Laura Elnitski, Prachi Shah, Yi Zhang, Daniel Blankenberg, Istvan Albert, Webb Miller, W. James Kent, and Anton Nekrutenko
    Galaxy: A platform for interactive large-scale genome analysis
    Genome Res. Published September 16, 2005, 10.1101/gr.4086505 [Abstract] [PDF] [Supplemental Research Data]
    Nov. 30, 2005 宋建均、陳怡靜、許秉慧、鄭智懷

    BLASTZ (Nov. 30, 2005)
     
  4. Nicolas Bray and Lior Pachter
    MAVID: Constrained Ancestral Alignment of Multiple Sequences
    Genome Res. 2004 14: 693-699. [Abstract] [Full Text]

    Mathieu Blanchette, W. James Kent, Cathy Riemer, Laura Elnitski, Arian F.A. Smit, Krishna M. Roskin, Robert Baertsch, Kate Rosenbloom, Hiram Clawson, Eric D. Green, David Haussler, and Webb Miller
    Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner
    Genome Res. 2004 14: 708-715. [Abstract] [Full Text] [Supplemental Research Data]
    Dec. 7, 2005 陳明江、羅正偉、張家榮、袁維均

    MAVID (Dec. 7, 2005)
    TBA (Dec. 7, 2005)
     
  5. 3DCoffee: Combining Protein Sequences and Structures within Multiple Sequence Alignments.
    O. O'Sullivan, K. Suhre, C. Abergel, D.G. Higgins, C. Notredame. Journal of Molecular Biology, Vol. 340, pp. 385-395, 2004 [pdf]

    T-Coffee: A novel method for multiple sequence alignments.
    C. Notredame, D. Higgins, J. Heringa. Journal of Molecular Biology, Vol. 302, pp. 205-217, 2000 [pdf]

    COFFEE: A New Objective Function For Multiple Sequence Alignment.
    C. Notredame, L. Holm and D.G. Higgins. Bioinformatics, Vol. 14 (5), pp. 407-422, 1998 [pdf]
    Dec. 14, 2005 戴志華、林與絜、施逸優、吳於芳、黃仁暐

    Coffee Shop (Dec. 14, 2005)

    Dec. 21, 2005 No class. (Please make an appointment with the TA for your programming assignment.)
     
  6. Eric J. Alm, Katherine H. Huang, Morgan N. Price, Richard P. Koche, Keith Keller, Inna L. Dubchak, and Adam P. Arkin
    The MicrobesOnline Web site for comparative genomics
    Genome Res. 2005 15: 1015-1022. [Abstract] [Full Text]

    Zhengchang Su, Victor Olman, Fenglou Mao, and Ying Xu
    Comparative genomics analysis of NtcA regulons in cyanobacteria: regulation of nitrogen assimilation and its coupling to photosynthesis
    (Published online 12 September 2005)
    Nucl. Acids Res. 2005 33: 5156-5171. [Abstract] [FREE Full Text] [Print PDF][Screen PDF] [Supplementary Material]

    Junkang Rong, John E. Bowers, Stefan R. Schulze, Vijay N. Waghmare, Carl J. Rogers, Gary J. Pierce, Hua Zhang, James C. Estill, and Andrew H. Paterson
    Comparative genomics of Gossypium and Arabidopsis: Unraveling the consequences of both ancient and recent polyploidy
    Genome Res. 2005 15: 1198-1210. Published in Advance August 18, 2005, 10.1101/gr.3907305 [Abstract] [Full Text] [PDF] [Supplemental Research Data]
    Dec. 28, 2005 黃雯婷、黃昭綺、修丕承、吳憲國

    NtcA (Dec. 28, 2005)
    FISH (Dec. 28, 2005)
    Comparative genomics of Gossypium and Arabidopsis (Dec. 28, 2005)
     
  7. Guillaume Bourque, E.M. Zdobnov, P. Bork, P.A. Pevzner, G. Tesler (2005) "Comparative architectures of mammalian and chicken genomes reveal highly variable rates of genomic rearrangements across different lineages" Genome Research 15: 98-110

    Guillaume Bourque, P.A. Pevzner, G. Tesler (2004) "Reconstructing the genomic architecture of ancestral mammals: Lessons from human, mouse, and rat genomes" Genome Research 14(4): 507-516

    Pavel Pevzner and Glenn Tesler. Genome Rearrangements in Mammalian Evolution: Lessons From Human and Mouse Genomes
    Genome Res. 2003 13: 37-45. Published in Advance December 30, 2002, 10.1101/gr.757503 [Abstract] [Full Text] [Supplemental Research Data]
    Jan. 4, 2006 黃子國、王維邦、張經略、林虹佑、鐘健元

 

 

References (Recommended, but not required):
1. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, by Dan Gusfield (1997)
2. Biological Sequence Analysis, by Richard Durbin et al. (1998)
3. Computational Molecular Biology: An Algorithmic Approach, by Pavel Pevzner (2000)
4. An Introduction to Bioinformatics Algorithms, by Neil C. Jones and Pavel Pevzner (2004)
5. Related journal and conference papers
 
Some smart guys in this field (biological sequence analysis) you might wish to know:
Webb C. Miller and Miller Lab.
Michael S. Waterman
Russell F. Doolittle
Pavel Pevzner
Eugene W. Myers
David J. Lipman
Stephen F. Altschul
William R. Pearson
Samuel Karlin
Eric S. Lander
David Sankoff
Gary D. Stormo
Daniel M. Gusfield
Warren Gish
Vineet Bafna
Minoru Kanehisa (金久實)
Ming Li (李明)
Tao Jiang (姜濤)
Xiaoqiu Huang (黃曉秋)
Louxin Zhang (張洛欣)
Ting Chen (陳梃)
… to be continued.
Kun-Mao Chao (趙坤茂; pardon the self-mention: praising one another helps us all improve~~~)
 
Useful links:
  1. NCBI    
  2. NCBI Education
  3. Molecular Biology for Computer Scientists by Lawrence Hunter
  4.  Developing Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
  5. Sense from Sequences: Stephen F. Altschul on Bettering BLAST , July/August 2000 (A story about BLAST)
  6. Initial sequencing and analysis of the human genome (Nature 409, 860-921 (15 February 2001))
  7.  The Sequence of the Human Genome (Science 2001 February 16; 291: 1304-1351)
  8. An evaluation of the draft human genome sequence (Nature Genetics 29, 88 - 91 (01 Sep 2001) Letters)
  9.  Initial sequencing and comparative analysis of the mouse genome (Nature 420, 520 - 562 (2002))
  10. The UCSC Genome Browser
  11. The Chimpanzee Genome (Nature; Sept. 1, 2005)
  12. The Chimpanzee Sequencing and Analysis Consortium, "Initial sequence of the chimpanzee genome and comparison with the human genome," Nature 437, 69-87 (1 September 2005) | doi: 10.1038/nature04072
  13. The HapMap Project (Nature; Oct. 27, 2005)
    Please visit:
    http://www.nature.com/nature/journal/v437/n7063/index.html
     
    and in particular the following research articles:
    http://www.nature.com/nature/journal/v437/n7063/index.html#Article
Bibliography links:
1. PubMed (PubMed, a service of the National Library of Medicine, provides access to over 12 million MEDLINE citations dating back to the mid-1960s, as well as additional life science journals. PubMed includes links to many sites providing full-text articles and other related resources.)