Course: Algorithms for Biological Sequence Analysis

Fall semester, 2007

Tuesday 9:10 – 12:10, 107 CSIE Building.

3 credits

Web site: http://www.csie.ntu.edu.tw/~kmchao/seq07fall

Instructor: Kun-Mao Chao (趙坤茂)

Teaching assistant: Yi-Ching Chen (陳怡靜)  email: d94010@csie...

 

Prerequisites: Some basic knowledge on algorithm development and program design is required. Background in bioinformatics and computational biology is welcome but not required for taking this course.

 

Coursework:

    Homework assignments and Class participation (15%)

    Two midterm exams (60%; 30% each):

        1. November 6, 2007  Problems

        2. December 18, 2007 Problems

    Oral presentation of selected papers (25%)

 

Our classmates: I  II  III  IV  V  VI  VII  VIII  IX

 

Class PowerPoint slides:

    Introduction (Oct. 2, 2007)

    Interesting Sequences (Oct. 2, 2007)

    Algorithmic Strategies (Oct. 9, 2007)

    Sequence Alignment (Oct. 16, 2007)

    Space-Saving Strategies (Oct. 23, 2007)
    Suboptimal Alignment (Oct. 30, 2007)

    K Best Local Alignments (Nov. 13, 2007)

    RMSQ (Nov. 27, 2007)

    Heaviest Segments (Dec. 4, 2007)

    SNP (Dec. 11, 2007)

 

Supporting materials:

  1. Basic Concepts of DNA, Proteins, Genes and Genomes (Oct. 30, 2007) -- Kun-Mao Chao

  2. Basic Algorithmic Strategies (Oct. 30, 2007) -- Kun-Mao Chao

  3. Sequence Alignment (Oct. 30, 2007) -- Kun-Mao Chao

  4. A Note on the Scoring Schemes (Oct. 30, 2007) -- Kun-Mao Chao

  5. Chao, K. -M., Pearson, W. R. and Miller, W., 1992, Aligning Two Sequences within a Specified Diagonal Band, Computer Applications in the Biosciences (CABIOS, now Bioinformatics), 8: 481-487

  6. Chao, K. -M., 1994, Computing All Suboptimal Alignments in Linear Space, Combinatorial Pattern Matching '94, Lecture Notes in Computer Science 807, 31-42, California, USA.

  7. Chao, K. -M., Hardison R. C. and Miller, W., 1994, Recent Developments in Linear-Space Alignment Methods: a Survey, Journal of Computational Biology, 1: 271-291.

  8. Chen, K.-Y. and Chao, K.-M.,  2007, “On the Range Maximum-Sum Segment Query Problem,” Discrete Applied Mathematics, 155: 2043-2052.

  9. A Note on Computing Heaviest Segments (Nov. 27, 2007) -- Kun-Mao Chao

  10. Huang, Y.-T., Zhang, K., Chen, T. and Chao, K.-M., 2005, “Selecting Additional Tag SNPs for Tolerating Missing Data in Genotyping,” BMC Bioinformatics, 6: 263.

  11. Huang, Y.-T., Chao, K.-M., and Chen, T., 2005, “An Approximation Algorithm for Haplotype Inference by Maximum Parsimony,” Journal of Computational Biology, 12: 1261-1274.

  12. Chang, C.-J., Huang, Y.-T., and Chao, K.-M., 2006, “A Greedier Approach for Finding Tag SNPs,” Bioinformatics, 22: 685-691.

 

Homework assignments:

#1: Handout: Oct. 2, 2007; Due: Oct. 12, 2007

    Send TA a PowerPoint page including:

         Your Name

         A sequence which, in your opinion, is sort of interesting or inspiring, noting if it’s your own creation or collection.

         A short reason explaining why it’s interesting or inspiring.

   A SAMPLE SOLUTION (Oct. 2, 2007)

   YOUR SOLUTIONS (Oct. 16, 2007)
 

 

Class presentations:

1.      Maximum number of team members: 6

2.      Each member is required to present in turn;

3.      Revised slides should be sent to me one week after the presentation; (Please compress your figures.)

4.      Questions in class are always welcome;

Selected papers for presentation:

1. November 13, 2007 (高孟駿、蕭安辰、廖凱傑、王鴻愷)

S.F. Altschul, T. L. Madden, A.A. Sch{\"{a}}ffer, J. Zhang, Z. Zhang, W. Miller and D. Lipman, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25 (1997), 3389--3402.

W. James Kent
BLAT-The BLAST-Like Alignment Tool
Genome Res. 2002 12: 656-664. Published in Advance March 20, 2002, 10.1101/gr.229202. Article published online before March 2002 [Abstract] [Full Text]

2. November 20, 2007 (王人晟、陳冠伶、邱冠凱、賴怡玲、張孝澤、包鈺綸)
 
Bin Ma, John Tromp, Ming Li,  PatternHunter: faster and more sensitive homology search. Bioinformatics, 18(3):440-445. March 2002.
[ Abstract ] [ 02ph.pdf, 310Kb ] [ BibTeX ] [ PubMed ]

Ming Li, Bin Ma, Derek Kisman, John Tromp, PatternHunter II: Highly Sensitive and Fast Homology Search. Journal of Bioinformatics and Computational Biology, 2(3):417-439. 2004.
[ Abstract ] [ 04ph2.ps, 341Kb ] [ Download from publisher ] [ BibTeX ]
 
3. December 25, 2007 (王三源、吳韋霖、林佳慶、郭慶徵、謝東宇、宋彥朋)

Webb Miller et al., 28-way vertebrate alignment and conservation track in the UCSC Genome Browser: publication, supplement, sample source code (Genome Research, published online on Nov 5, 2007.)

4. January 8, 2008 (陳琨、朱安強、林晏禕、陳縕儂、呂哲安、楊孟翰、翁翊鐘)
 
Network alignment (References will be given in class.)
 
References (Recommended, but not required):
1. Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, by Dan Gusfield (1997)
2. Biological Sequence Analysis, by Richard Durbin et al. (1998)
3. Computational Molecular Biology: An Algorithmic Approach, by Pavel Pevzner (2000)
4. An Introduction to Bioinformatics Algorithms, by Neil C. Jones and Pavel Pevzner (2004)
5. Related journal and conference papers
 
Some smart guys in this field (biological sequence analysis) you might wish to know:
Webb C. Miller and Miller Lab.
Michael S. Waterman
Russell F. Doolittle
Pavel Pevzner
Eugene W. Myers
David J. Lipman
Stephen F. Altschul
William R. Pearson
Samuel Karlin
Eric S. Lander
David Sankoff
Gary D. Stormo
Daniel M. Gusfield
Warren Gish
Martin Vingron
Vineet Bafna
Minoru Kanehisa (金久  )
Ming Li (李明)
Tao Jiang (姜濤)
Xiaoqiu Huang (黃曉秋)
Louxin Zhang (張洛欣)
Ting Chen (陳梃)
… to be continued.
Kun-Mao Chao (趙坤茂; 不好意思 互相稱讚求進步~~~)
 
Useful links:
  1. NCBI    
  2. NCBI Education
  3. Molecular Biology for Computer Scientists by Lawrence Hunter
  4.  Developing Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
  5. Sense from Sequences: Stephen F. Altschul on Bettering BLAST , July/August 2000 (A story about BLAST)
  6. Initial sequencing and analysis of the human genome (15 February 2001 Nature 409, 860 - 921 (2001)
  7.  The Sequence of the Human Genome (Science 2001 February 16; 291: 1304-1351)
  8. An evaluation of the draft human genome sequence (Nature Genetics 29, 88 - 91 (01 Sep 2001) Letters)
  9.  Initial sequencing and comparative analysis of the mouse genome (Nature 420, 520 - 562 (2002))
  10. The UCSC Genome Browser
  11. The Chimpanzee Genome (Nature; Sept. 1, 2005)
  12. The Chimpanzee Sequencing and Analysis Consortium25, "Initial sequence of the chimpanzee genome and comparison with the human genome," Nature 437, 69-87 (1 September 2005) | doi: 10.1038/nature04072
  13. The HapMap Project (Nature; Oct. 27, 2005)
    Please visit:
    http://www.nature.com/nature/journal/v437/n7063/index.html
     
    and in particular the following research articles:
    http://www.nature.com/nature/journal/v437/n7063/index.html#Article
Bibliography links: