A Few Words to the Students of the Algorithms and Computational Biology Lab

(First edition: 5/26/2003; second edition: 6/9/2004, with more well-worn advice)

Kun-Mao Chao (趙坤茂)

Department of Computer Science and Information Engineering, National Taiwan University

 

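Our lab's research directions include:

1. Any computationally challenging problem, including algorithm design of all kinds;
2. In bioinformatics applications, software tools built by combining algorithm design with database theory;
3. In bioinformatics applications, the development of practical software analysis tools.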

 

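For students who have just joined our lab, I suggest first skimming widely through papers published in the journals and conferences listed at the end of this note. Most of the papers may leave you completely baffled, but do not lose heart: as long as a few topics strike a chord with you, the effort has been worthwhile. On the other hand, if none of the papers in these journals and conferences interest you, what our lab wants to do may differ from what you expected, and it would be best to make a change early.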

 

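After reading widely, pick a few papers that especially interest you and read them closely. If you can thoroughly understand a few good papers, your skills will improve considerably and your horizons will broaden. Always combine study with reflection as you read, so that you can more quickly grasp a paper's main thread and the directions in which it can be extended. Once you have accumulated some experience reading papers, I suggest trying the following method; it may be harder, but it can be surprisingly effective. Put the paper on your desk and look at the title first, then think about the possible directions, research methods, and results for this topic. Next read the abstract, and think about how you would proceed if the problem were yours. While reading, do not simply follow the author's lead; stop now and then to think about what you would do. When there is a theorem, do not read the proof at first; try to prove it yourself. When you have finished the whole paper, review its main contributions. Is there anything that could be improved right away? Are there any meaningful directions for extension? Do not underestimate the value of your own imagination. Combine study with reflection, and reading papers becomes a pleasure.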

 

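Once you have some experience, you should try to settle on a research problem of your own, ideally one you would be willing to hold in your arms as you fall asleep. Only with a problem you are eager to attack on your own initiative do you have a better chance of producing good results. A research problem may not have a good solution, but once you have immersed yourself in it long enough, good ideas come naturally.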

 

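Time management is also key to whether your research succeeds. I suggest that every student draw up a battle plan and set a timeline as a guideline. You should also have a research map: a rough diagram, drawn on paper, of the topics related to the problem you are studying. Let me put it this way: solving problems is like storming cities and seizing territory, and we cannot charge around with no direction. Your research map is like the map of the empire spread out before Genghis Khan, letting the territory of our ideas expand to its fullest.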

 

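"Keep making progress" is something I say often, and something I suggest you remind yourselves of constantly. If we try to solve a problem and, after much time and many attempts, still have nothing to show, we have nonetheless made progress, because we now know which approaches do not work. Only doing nothing at all is truly making no progress. But we should not keep "making progress" inside a dead end either. There is a tradeoff here, and those who reflect on it carefully will surely fight their way out.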

 

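I often feel that working on a research topic is like climbing a mountain. Some people like steep terrain, and others like beautiful scenery. Whatever kind of mountain you climb, aspire to reach the summit, and do not give up halfway.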

 

Learning begins with reclaiming a wandering mind; this is the one and only way to learn.

 

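Once you have a good idea, practice writing it down in a professional manner; only then can we sell our good ideas to others. Be sure to write with care: if your writing is sloppy, even you may not bother to read it, and a good idea is wasted. We usually write papers in LaTeX; if you do not know it yet, ask the senior members of the lab and you will pick it up quickly. Set aside the papers you have read and found especially well written, take them as models when you write, and try to produce papers of the same quality. When writing, keep the following in mind: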

 

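1.      The title should reflect the subject being presented.

2.      Never ramble or leave the reader unsure of what you are saying. However complicated an idea, there is always a good angle from which it can be laid out step by step in accessible terms; this improves readability. Add explanatory figures where needed.

3.      The division into sections and paragraphs should be logical.

4.      Be meticulous about the format and citation of references.

5.      Take care with English typography (for example, mathematical symbols should be italic), vocabulary, sentence structure, and grammar; do not be careless! (Although English is not our native language, the grammar of scientific papers is not difficult, and reading a few papers will give you a feel for it. If a sentence looks odd to you after you write it, find a way to improve it.)

6.      Your publications will follow you for life. Read them over and over, and pursue perfection to the point of being exacting!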

 

Whenever you need to discuss something with me, please do not hesitate to let me know.

 

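Below we introduce some journals and conferences whose papers interest our lab. In computational biology, Bioinformatics and Journal of Computational Biology are both worth a try. In applied algorithms, Information Processing Letters and Algorithmica offer plenty of material, and the latest proceedings of conferences such as FOCS, STOC, SODA, and ESA are also excellent sources of inspiration. In databases, VLDB, PODS, TKDE, and SIGMOD deserve careful reading.

My recommended sources are listed below. (Hyperlinks to the conferences are omitted here because they change every year; interested readers can find them through the Google search engine ( http://www.google.com ) or through the commonly used literature databases listed below. In fact, I have noted where most of the proceedings can be found after each conference.)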

 

Applied algorithms:

Journals

        Information Processing Letters

        Algorithmica

        Journal of Algorithms

        Journal of the ACM

        SIAM Journal on Computing

        SIAM Journal on Discrete Mathematics

        Discrete Mathematics

        Discrete Applied Mathematics

        Theoretical Computer Science

        Networks

        Journal of Computer and System Sciences

        IEEE Transactions on Computers

        Operations Research Letters

        European Journal of Operational Research 

 

Conferences

        FOCS (IEL)

        STOC (ACM Portal)

        SODA (ACM Portal)

        ESA (Lecture Notes in Computer Science)

        ICALP (Lecture Notes in Computer Science)

        ISAAC (Lecture Notes in Computer Science)

        COCOON (Lecture Notes in Computer Science)

        STACS (Lecture Notes in Computer Science)

        WADS & SWAT (Lecture Notes in Computer Science)

 

Computational biology:

Journals

        Bioinformatics

        Journal of Computational Biology

        Genome Research

        Journal of Bioinformatics and Computational Biology

        Nucleic Acids Research

        Science

        Nature

 

Conferences

        RECOMB (ACM Portal; Lecture Notes in Bioinformatics)

        ISMB (Bioinformatics)
        ECCB (Bioinformatics)

        PSB

        WABI (Lecture Notes in Computer Science)

 

Databases:

Journals

        The VLDB Journal

        IEEE Transactions on Knowledge and Data Engineering

        SIGMOD Record

        IEEE Data(base) Engineering Bulletin

 

Conferences

        SIGMOD/PODS

        VLDB

        ICDE

        ICDT

 

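Besides looking things up online, do not overlook the print sources in the library. Which libraries in Taiwan hold these journals? You can find out from the union catalog of Western-language periodicals. Commonly used online literature databases: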

 

1.          Google Scholar
(Google Scholar provides a simple way to broadly search for scholarly literature. From one place, you can search across many disciplines and sources: peer-reviewed papers, theses, books, abstracts and articles, from academic publishers, professional societies, preprint repositories, universities and other scholarly organizations. Google Scholar helps you identify the most relevant research across the world of scholarly research.)
 

2.          DBLP
(The DBLP server provides bibliographic information on major computer science journals and proceedings. Initially the server was focused on DataBase systems and Logic Programming (DBLP), now it is gradually being expanded toward other fields of computer science. You may now read "DBLP" as "Digital Bibliography & Library Project".)
 

3.          PubMed
(PubMed, a service of the National Library of Medicine, provides access to over 12 million MEDLINE citations back to the mid-1960's and additional life science journals. PubMed includes links to many sites providing full text articles and other related resources.)
 

4.          CiteSeer
(ResearchIndex is a scientific literature digital library that aims to improve the dissemination and feedback of scientific literature, and to provide improvements in functionality, usability, availability, cost, comprehensiveness, efficiency, and timeliness.)
 

5.          IEL
(IEEE Xplore provides full-text access to IEEE transactions, journals, magazines and conference proceedings published since 1988 and all current IEEE Standards.)
 

6.          ACM Portal
(Full text of every article ever published by ACM. Go to The ACM Digital Library; A bibliography from major publishers in computing with 600,000 entries. Go to The Guide)
 

7.          Lecture Notes in Computer Science
(The series Lecture Notes in Computer Science (LNCS), including its subseries Lecture Notes in Artificial Intelligence (LNAI), has established itself as a medium for the publication of new developments in computer science and information technology research and teaching - quickly, informally, and at a high level.)
 

8.          The Collection of Computer Science Bibliographies
(This is a collection of bibliographies of scientific literature in computer science from various sources, covering most aspects of computer science. The bibliographies are updated monthly from their original locations such that you'll always find the most recent versions here.)

 

9.          A compendium of NP optimization problems
(This is a continuously updated catalog of approximability results for NP optimization problems. The compendium is also a part of the book Complexity and Approximation. The compendium has not been updated for a while, so there might exist recent results that are not mentioned in the compendium. If you happen to notice such a missing result, please report it to us using the web forms.)