Course: Algorithms for Biological Sequence Analysis
Fall semester, 2005
Wednesday 14:20–17:20, 111 CSIE Building.
3 credits
Web site: http://www.csie.ntu.edu.tw/~kmchao/seq05fall
Instructor: Kun-Mao Chao (趙坤茂)
Teaching assistant: Yao-Ting Huang (黃耀廷) email: d92023@csie.ntu.edu.tw
Prerequisites: Some basic knowledge of algorithm development and program design is
required. A background in bioinformatics or computational biology is welcome but
not required for taking this course.
Preliminary course outlines
Coursework:
Programming assignments and class participation (30%)
Midterm exam (40%)
Final project (oral presentation of selected papers) (30%)
Our classmates: I, II, III, IV, V
Class PowerPoint slides:
First Class (Sept. 21, 2005)
Dynamic Programming (a quick review) (Sept. 28, 2005)
Heaviest Segments in a Number Sequence (Sept. 28, 2005)
RMSQ (Oct. 5, 2005)
Sequence Alignment (Oct. 5/12/19, 2005)
Linear-Space Alignment Methods (Oct. 19, 2005)
Haplotypes and SNPs (Oct. 26, 2005)
Delta Points (Nov. 2, 2005)
Supporting materials:
- Dynamic Programming -- a Quick Review
- Heaviest Segments in a Number Sequence (Working draft)
- RMSQ
- Sequence Alignment
- Note on the Piecewise Constant Gap Penalties
- Haplotype Inference
- Tag SNPs
- LD Bins (Science; Feb., 2005)
- Chao, K.-M., Pearson, W. R., and Miller, W., 1992,
Aligning Two Sequences within a Specified Diagonal Band, Computer
Applications in the Biosciences (CABIOS, now Bioinformatics), 8:
481-487.
- Chao, K.-M., Hardison, R. C., and Miller, W., 1994,
Recent Developments in Linear-Space Alignment Methods: a Survey, Journal
of Computational Biology, 1: 271-291.
- Chao, K.-M., 1994,
Computing All Suboptimal Alignments in Linear Space, Combinatorial
Pattern Matching '94, Lecture Notes in Computer Science 807, 31-42,
California, USA.
Midterm exam: (Nov. 9, 2005; in class)
Problems
Programming assignment: (Given: Oct. 26, 2005; Due: Dec. 14, 2005)
Maximum number of team members: <6
Please refer to the Haplotypes and SNPs slides (Oct. 26, 2005).
A sample LD result (the TA will explain this later)
Online Demo (Dec. 21, 2005)
Class presentations:
1. Maximum number of team members: ~5.
2. Each member is required to present in turn.
3. Revised slides should be sent to me one week after the presentation.
(Please compress your figures.)
4. Every student is required to submit a brief note right after the
presentation.
5. Questions in class are always welcome.
The schedule of our class presentation:
- Bin Ma, John Tromp, Ming Li. PatternHunter: faster and more sensitive
homology search. Bioinformatics, 18(3):440-445. March 2002.
[Abstract] [02ph.pdf, 310Kb] [BibTeX] [PubMed]
Ming Li, Bin Ma, Derek Kisman, John Tromp. PatternHunter II: Highly
Sensitive and Fast Homology Search. Journal of Bioinformatics and
Computational Biology, 2(3):417-439. 2004. Early version in GIW 2003.
[Abstract] [04ph2.ps, 341Kb] [Download from publisher] [BibTeX]
Derek Kisman, Ming Li, Bin Ma, Li Wang. tPatternHunter: gapped, fast and
sensitive translated homology search. Bioinformatics,
21(4):542-544. February 2005.
[Abstract] [Download from publisher] [BibTeX] [PubMed]
Nov. 16, 2005 趙坤茂、陳奕先 (tPatternHunter)、林語君 (PatternHunter II)
Fast Local Alignment Tools (Nov. 16, 2005)
PatternHunter II (Nov. 16, 2005)
tPatternHunter (Nov. 16, 2005)
- W. James Kent
BLAT-The BLAST-Like Alignment Tool
Genome Res. 2002 12: 656-664. Published in Advance March 20,
2002, 10.1101/gr.229202. Article published online before March 2002
[Abstract]
[Full Text]
Plus an introduction to
The UCSC Genome Browser.
Nov. 23, 2005 田知本、巨彥霖、陳任志、游岳齊
BLAT (Nov. 23, 2005)
UCSC Genome Browser (Nov. 23, 2005)
- Scott Schwartz, W. James Kent, Arian Smit, Zheng Zhang, Robert Baertsch,
Ross C. Hardison, David Haussler, and Webb Miller
Human-Mouse Alignments with BLASTZ
Genome Res. 2003 13: 103-107.
[Abstract]
[Full Text]
Belinda Giardine, Cathy Riemer, Ross C. Hardison, Richard Burhans,
Laura Elnitski, Prachi Shah, Yi Zhang, Daniel Blankenberg, Istvan Albert, Webb
Miller, W. James Kent, and Anton Nekrutenko
Galaxy: A platform for interactive large-scale genome analysis
Genome Res. Published September 16, 2005, 10.1101/gr.4086505
[Abstract] [PDF] [Supplemental Research Data]
Nov. 30, 2005 宋建均、陳怡靜、許秉慧、鄭智懷
BLASTZ (Nov. 30, 2005)
- Nicolas Bray and Lior Pachter
MAVID: Constrained Ancestral Alignment of Multiple Sequences
Genome Res. 2004 14: 693-699.
[Abstract]
[Full Text]
Mathieu Blanchette, W. James Kent, Cathy Riemer, Laura Elnitski, Arian
F.A. Smit, Krishna M. Roskin, Robert Baertsch, Kate Rosenbloom, Hiram
Clawson, Eric D. Green, David Haussler, and Webb Miller
Aligning Multiple Genomic Sequences With the Threaded Blockset Aligner
Genome Res. 2004 14: 708-715.
[Abstract] [Full Text] [Supplemental Research Data]
Dec. 7, 2005 陳明江、羅正偉、張家榮、袁維均
MAVID (Dec. 7, 2005)
TBA (Dec. 7, 2005)
- 3DCoffee: Combining Protein Sequences and Structures within Multiple
Sequence Alignments.
O. O'Sullivan, K. Suhre, C. Abergel, D.G. Higgins, C. Notredame. Journal of
Molecular Biology, Vol 340, pp 385-395, 2004 [pdf]
T-Coffee: A novel method for multiple sequence alignments.
C. Notredame, D. Higgins, J. Heringa, Journal of Molecular Biology, Vol
302, pp 205-217, 2000 [pdf]
COFFEE: A New Objective Function For Multiple Sequence Alignment.
C. Notredame, L. Holm and D.G. Higgins, Bioinformatics, Vol 14 (5),
pp 407-422, 1998 [pdf]
Dec. 14, 2005 戴志華、林與絜、施逸優、吳於芳、黃仁暐
Coffee Shop (Dec. 14, 2005)
Dec. 21, 2005 No class. (Please make an
appointment with the TA for your programming assignment.)
- Eric J. Alm, Katherine H. Huang, Morgan N. Price, Richard P. Koche, Keith
Keller, Inna L. Dubchak, and Adam P. Arkin
The MicrobesOnline Web site for comparative genomics
Genome Res. 2005 15: 1015-1022.
[Abstract]
[Full Text]
Zhengchang Su, Victor Olman, Fenglou Mao, and Ying Xu
Comparative genomics analysis of NtcA regulons in cyanobacteria:
regulation of nitrogen assimilation and its coupling to photosynthesis
(Published online 12 September 2005)
Nucl. Acids Res. 2005 33: 5156-5171.
[Abstract] [FREE Full Text] [Print PDF] [Screen PDF] [Supplementary Material]
Junkang Rong, John E. Bowers, Stefan R. Schulze, Vijay N. Waghmare, Carl J.
Rogers, Gary J. Pierce, Hua Zhang, James C. Estill, and Andrew H. Paterson:
Comparative genomics of Gossypium and Arabidopsis:
Unraveling the consequences of both ancient and recent polyploidy
Genome Res. 2005 15: 1198-1210. Published in Advance August 18,
2005, 10.1101/gr.3907305
[Abstract] [Full Text] [PDF] [Supplemental Research Data]
Dec. 28, 2005 黃雯婷、黃昭綺、修丕承、吳憲國
NtcA (Dec. 28, 2005)
FISH (Dec. 28, 2005)
Comparative genomics of Gossypium and Arabidopsis (Dec. 28, 2005)
- Guillaume Bourque, Zdobnov, E.M., Bork, P., Pevzner, P.A., Tesler, G.
(2005) "Comparative architectures of mammalian and chicken genomes reveal
highly variable rates of genomic rearrangements across different lineages"
Genome Research 15, 98-110
Guillaume Bourque, Pevzner, P.A., Tesler, G. (2004) "Reconstructing the
genomic architecture of ancestral mammals: Lessons from human, mouse, and rat
genomes" Genome Research 14(4), 507-516
Pavel Pevzner and Glenn Tesler Genome Rearrangements in Mammalian
Evolution: Lessons From Human and Mouse Genomes
Genome Res. 2003 13: 37-45. Published in Advance December 30,
2002, 10.1101/gr.757503
[Abstract] [Full Text] [Supplemental Research Data]
Jan. 4, 2006 黃子國、王維邦、張經略、林虹佑、鐘健元
References (Recommended, but not required):
1. Algorithms on Strings, Trees, and Sequences: Computer Science and
Computational Biology, by Dan Gusfield (1997)
2. Biological Sequence Analysis, by Richard Durbin et al. (1998)
3. Computational Molecular Biology: An Algorithmic Approach, by Pavel
Pevzner (2000)
4. An Introduction to Bioinformatics Algorithms, by Neil C. Jones and
Pavel Pevzner (2004)
5. Related journal and conference papers
Some smart guys in this field (biological
sequence analysis) you might wish to know:
Webb C. Miller and Miller Lab.
Michael S. Waterman
Russell F. Doolittle
Pavel Pevzner
Eugene W. Myers
David J. Lipman
Stephen F. Altschul
William R. Pearson
Samuel Karlin
Eric S. Lander
David Sankoff
Gary D. Stormo
Daniel M. Gusfield
Warren Gish
Vineet Bafna
Minoru Kanehisa (金久 實)
Ming Li (李明)
Tao Jiang (姜濤)
Xiaoqiu Huang (黃曉秋)
Louxin Zhang (張洛欣)
Ting Chen (陳梃)
… to be continued.
Kun-Mao Chao (趙坤茂; pardon the immodesty, we improve by praising one
another~~~)
Useful links:
- NCBI, NCBI Education
- Molecular Biology for Computer Scientists by Lawrence Hunter
- Developing Bioinformatics Computer Skills by Cynthia Gibas & Per Jambeck
- Sense from Sequences: Stephen F. Altschul on Bettering BLAST, July/August 2000
(a story about BLAST)
- Initial sequencing and analysis of the human genome
(Nature 409, 860-921; 15 February 2001)
- The Sequence of the Human Genome
(Science 2001 February 16; 291: 1304-1351)
- An evaluation of the draft human genome sequence
(Nature Genetics 29, 88-91 (01 Sep 2001) Letters)
- Initial sequencing and comparative analysis of the mouse genome
(Nature 420, 520-562 (2002))
- The UCSC Genome Browser
- The Chimpanzee Genome (Nature; Sept. 1, 2005)
The Chimpanzee Sequencing and Analysis Consortium, "Initial sequence of the
chimpanzee genome and comparison with the human genome," Nature 437, 69-87
(1 September 2005) | doi: 10.1038/nature04072
- The HapMap Project (Nature; Oct. 27, 2005)
Please visit:
and in particular the following research articles:
Bibliography links:
1. PubMed
(PubMed, a service of the National Library of Medicine, provides access to
over 12 million MEDLINE citations back to the mid-1960s and additional life
science journals. PubMed includes links to many sites providing full text
articles and other related resources.)