Course: Algorithms for Biological Sequence Analysis
Fall semester, 2020
9:10 – 12:10 Tuesday, R107@CSIE Building
3 credits
Website: http://www.csie.ntu.edu.tw/~kmchao/seq20fall
Instructor: Kun-Mao Chao (趙坤茂)
Teaching assistant: 陳百榆
r07922177
followed by @ntu.edu.tw
[TA's office hours: by appointment; Venue: R432]
Prerequisites: Some basic knowledge on algorithms is required. Background in bioinformatics and computational biology is welcome but not required for taking this course.
Coursework:
Homework assignments and Class participation (10%)
#1: Handout: September 15, 2020; Due: October 2, 2020 (See below)
Two midterm exams (70%; 35% each):
Midterm #1: Oct. 27, 2020
Midterm #2: Dec. 8, 2020
Oral presentation of selected papers/projects (20%)
[Dec. 15 for preparation],
Dec. 22, Dec. 29, Jan. 5.
Supporting materials:
About This Class Survey [2020/9/15]
Interesting Sequences [2020/9/15]
A Whirlwind Tour of Bioinformatics [2020/9/15]
Basic Algorithmic Strategies
[2020/9/22]
-- Basic Algorithmic Strategies [2020/9/22]
-- Vertex_Cover_Greedy_Fail [2020/9/22]
-- LIS example [2020/9/22]
-- LCS_example [2020/9/22]
Sequence Alignment [2020/9/22, 2020/9/29, 2020/10/6]
-- Global Alignment [2020/9/22, 2020/9/29]
-- A Note on Computing Heaviest Segments [2020/9/29]
-- Local Alignment [2020/9/29]
-- Various Scoring Schemes [2020/9/29, 2020/10/6]
-- An affine-gap-penalty example [2020/9/29]
-- Example 2; Solution [2020/9/29]
-- Scoring scheme examples [2020/9/29]
-- Needleman, Saul B.; and Wunsch, Christian D. (1970). "A general method applicable to the search for similarities in the amino acid sequence of two proteins". Journal of Molecular Biology 48 (3): 443–53.
-- Smith, Temple F.; and Waterman, Michael S. (1981). "Identification of Common Molecular Subsequences". Journal of Molecular Biology 147: 195–197.
-- Gotoh, Osamu: An improved algorithm for matching biological sequences. In: Journal of Molecular Biology. 162, 1982, S. 705-708 (PDF, 206 KB).
Space-Saving Strategies [2020/10/6, 2020/10/13]
Suboptimal Alignments [2020/10/13]
-- Space-Saving Strategies [2020/10/6, 2020/10/13]
-- A Note for Method 7@Suboptimal Alignments [2020/10/13]
-- More on Method 7@Suboptimal Alignments [2020/11/3]
-- Eugene Myers and Webb Miller, "Optimal Alignments in Linear Space," CABIOS (Bioinformatics) 9: 169-176, 1988. [pdf]
-- Chao, K. -M., Pearson, W. R. and Miller, W., 1992, Aligning Two Sequences within a Specified Diagonal Band, Computer Applications in the Biosciences (CABIOS, now Bioinformatics), 8: 481-487.
-- Chao, K.-M., Hardison, R. C. and Miller, W., 1993, Constrained Sequence Alignment, Bulletin of Mathematical Biology, 55: 503-524.
-- Chao, K.-M., Hardison, R. C. and Miller, W., 1993, Locating Well-Conserved Regions within a Pairwise Alignment, Computer Applications in the Biosciences (CABIOS, now Bioinformatics), 9: 387-396.
-- Chao, K. -M., 1994, Computing All Suboptimal Alignments in Linear Space, Combinatorial Pattern Matching '94, Lecture Notes in Computer Science 807, 31-42, California, USA.
-- Chao, K. -M., Hardison R. C. and Miller, W., 1994, Recent Developments in Linear-Space Alignment Methods: a Survey, Journal of Computational Biology, 1: 271-291.
-- Chao, K.-M. and Miller, W., 1995, Linear-Space Algorithms that Build Local Alignments from Fragments, Algorithmica, 13: 106-134.
Multiple Sequence Alignment [2020/10/20]
-- Multiple Sequence Alignment [2020/10/20]
Homology Search [2020/11/3, 2020/11/10]
-- Homology Search Tools (pdf) [2020/11/3, 2020/11/10]
Burrows-Wheeler Transform and FM-Index [2020/11/10]
-- BWT_TENETENET [2020/12/22]
Genome Reconstruction by Phillip E. C. Compeau and Pavel A. Pevzner [2020/11/17]
-- ppt (You may skip pages 125-175 & 301-340.) [2020/11/17]
-- Lectures by Phillip E. C. Compeau
-- B(3,1)&B(3,2) [2020/12/22]
-- Eulerian Tours by Christos H. Papadimitriou & Umesh Vazirani
(Exchange 10 and 01 of the left figure on Page 4)-- An example for fragment assembly as an Eulerian cycle problem [2020/11/17]
Pattern Identification in
a Haplotype Block (ppt,
pdf) [2020/11/24]
-- An example [2020/11/24]
Haplotype Inference (ppt,
pdf) [2020/11/24]
(Read the problem formulation on pp. 1262-1263 of the pdf
file)
-- Some other ways of formulation [2020/11/24]
The 3SUM Problem [2020/12/1]
Heaviest Segments [2020/12/1]
-- A Note on Computing Heaviest Segments [2020/12/1]
RMSQ [2020/12/1]
-- Chen, K.-Y. and Chao, K.-M., 2007, “On the Range Maximum-Sum Segment Query Problem,” Discrete Applied Mathematics, 155: 2043-2052. [2020/12/1]
Homework assignments:
#1: Handout: September 15, 2020; Due: October 2, 2020
Send TA a PowerPoint page including: (Subject: [HW1] Your student ID number)
Your Name
A sequence which, in your opinion, is sort of interesting or inspiring.
A short reason explaining why it is interesting or inspiring.
[All in one page, please.]
More sample solutions: 2007 2008 2009 2011 2015 2016 2017 2018 2019
Class presentations:
1. The expected number of team members: 1~6;
2. Each member is required to present in turn [about 10-15 minutes each];
3. Revised slides should be sent to me within one week after the presentation.
4. Questions in class are always welcome.
[Dec. 15 for preparation, no class]
Dec. 22
Mahdi Boroujeni, Masoud Seddighin, Saeed Seddighin:
Improved
Algorithms for Edit Distance and LCS: Beyond Worst Case. SODA 2020:
1601-1620.
朱彥如 賴開元 周良冠
An Efficient Algorithm for Finding Special Palindromes - Yin-Chu Chen's Master Thesis (2019)
穆大有
Timo Lassmann:
Kalign 3: multiple sequence alignment of large datasets. Bioinform.
36(6): 1928-1929 (2020)
鄭士驤 馮才昇 陳傳諭 林祥瑞
Dec. 29
Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial
immunity.
Science. 2012 Aug 17;337(6096):816-21.
(The CRISPR-Cas9 genome editing technique was a significant contributor to
the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier
and Jennifer Doudna.)
Doudna JA.
The
promise and challenge of therapeutic genome editing.
Nature. 2020 Feb;578(7794):229-236.
張晏瑋 陳炫均 柯叡澤
Abbott TR, Dhamdhere G, Liu Y, Lin X, Goudy L, Zeng L, Chemparathy A, Chmura
S, Heaton NS, Debs R, Pande T, Endy D, La Russa MF, Lewis DB, Qi LS.
Development of CRISPR as an Antiviral Strategy to Combat SARS-CoV-2 and
Influenza.
Cell. 2020 May 14;181(4):865-876.e12.
王又可 蘇郁文
Severe Covid-19 GWAS Group, Ellinghaus D, Degenhardt F, ..., Karlsen TH.
Genomewide Association Study of Severe Covid-19 with Respiratory Failure.
N Engl J Med. 2020 Oct 15;383(16):1522-1534.
黃冠瑋 闕中一 趙冠豪
Jan. 5
Pesho Ivanov, Benjamin Bichsel, Harun Mustafa, André Kahles, Gunnar Rätsch,
Martin T. Vechev:
AStarix: Fast and Optimal Sequence-to-Graph Alignment. RECOMB 2020:
104-119
王彥仁 林庭風 林喬文 簡仲明
Tavor Z. Baharav, Govinda M. Kamath, David N. C. Tse, Ilan Shomorony:
Spectral Jaccard Similarity: A New Approach to Estimating Pairwise Sequence
Alignments. RECOMB 2020: 223-225
(Also, Pattern Volume 1, Issue 6, 11 September 2020, 100081)
李智揚 張高登
Selected papers for presentation:
(You may access these selected articles using computers with NTU IP addresses.)
Amir Abboud, Arturs Backurs, Virginia Vassilevska Williams:
Tight Hardness Results
for LCS and Other Sequence Similarity Measures. FOCS 2015: 59-78.
Mahdi Boroujeni, Masoud Seddighin, Saeed Seddighin:
Improved
Algorithms for Edit Distance and LCS: Beyond Worst Case. SODA 2020:
1601-1620.
Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial
immunity.
Science. 2012 Aug 17;337(6096):816-21.
(The CRISPR-Cas9 genome editing technique was a significant contributor to
the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier
and Jennifer Doudna.)
Doudna JA.
The
promise and challenge of therapeutic genome editing.
Nature. 2020 Feb;578(7794):229-236.
Abbott TR, Dhamdhere G, Liu Y, Lin X, Goudy L, Zeng L, Chemparathy A, Chmura
S, Heaton NS, Debs R, Pande T, Endy D, La Russa MF, Lewis DB, Qi LS.
Development of CRISPR as an Antiviral Strategy to Combat SARS-CoV-2 and
Influenza.
Cell. 2020 May 14;181(4):865-876.e12.
Severe Covid-19 GWAS Group, Ellinghaus D, Degenhardt F, ..., Karlsen TH.
Genomewide Association Study of Severe Covid-19 with Respiratory Failure.
N Engl J Med. 2020 Oct 15;383(16):1522-1534.
Patrick Kunzmann, Benjamin E. Mayer, Kay Hamacher:
Substitution matrix based color schemes for sequence alignment visualization.
BMC Bioinform. 21(1): 209 (2020)
Timo Lassmann:
Kalign 3: multiple sequence alignment of large datasets. Bioinform.
36(6): 1928-1929 (2020)
Pesho Ivanov, Benjamin Bichsel, Harun Mustafa, André Kahles, Gunnar Rätsch,
Martin T. Vechev:
AStarix: Fast and Optimal Sequence-to-Graph Alignment. RECOMB 2020:
104-119
Tavor Z. Baharav, Govinda M. Kamath, David N. C. Tse, Ilan Shomorony:
Spectral Jaccard Similarity: A New Approach to Estimating Pairwise Sequence
Alignments. RECOMB 2020: 223-225
(Also, Pattern Volume 1, Issue 6, 11 September 2020, 100081)
[http://rosalind.info/problems/list-view/?location=bioinformatics-textbook-track]