Machine Discovery (3 credits)


Instructor: Prof. Shou-de Lin

Classroom: CSIE 107

Meeting Time: Friday 9:10am-12:10pm

Office Hours: By Appointment

TA: Wei-Chi Lai

Course Description:

This course discusses how machines can automatically perform (or assist humans in performing) discovery tasks. It covers three main themes: promising modeling techniques for machine discovery (MD), useful computational methods for MD, and several exemplary or classic MD systems. Students are expected not only to comprehend the theoretical issues behind machine discovery but also to gain hands-on experience in designing a discovery system.

Please also refer to "Machine-Discovery - the Popular Science View".


Homework and Presentation: (30%)
Programming Assignments: (35%)
Final Project: (35%)

Recommended Readings:

B1: "Machine Discovery", Jan Zytkow, 1997

B2: "Knowledge Discovery and Measures of Interest", Robert J. Hilderman, Howard J. Hamilton, 2001

Papers to be presented

Machine Discovery Principle Group:

    MD1: Herbert A. Simon, "Machine Discovery" and comments (B1, p171-224)

    MD2: Wei-Min Shen "The Process of Discovery" (B1, p233-251)

    MD3: S. Borrett et al. "A method for representing and developing process models", Ecological Complexity, 2007

    MD4: P. Langley, "Constructing explanatory process models from biological data and knowledge", AI in Medicine, 2006

Link Discovery and Anomaly Detection Group:

    LA1: William Eberle and Lawrence Holder, "Discovering Structural Anomalies in Graph-Based Data." ICDM07

    LA2: L. Backstrom, et al. "Group Formation in Large Social Networks: Membership, Growth, and Evolution" KDD2006

    LA3: Bo Long, Xiaoyun Wu, Zhongfei Zhang, Philip Yu, "Unsupervised Learning on K-partite Graphs", KDD2006

    LA4: Neville, J. and D. Jensen "Relational Dependency Networks" Journal of Machine Learning Research, 2007

Language Model Group:

     LM1: R. Iyer, M. Ostendorf, "Modeling Long Distance Dependence in Language: Topic Mixtures vs. Dynamic Cache Models", ICSLP '96

     LM2: Ronald Rosenfeld "Two Decades Of Statistical Language Modeling: Where Do We Go From Here?" 2000 

     LM3: P. Brown, et al. "Class-Based N-Gram Models of Natural Language" Computational Linguistics, vol. 18, no. 4, pp. 467-479, 1992.

     LM4: Kai-Fu Lee; Mingjing Li; Zheng Chen, "Discriminative training on language model", 2000, MSRA.

     LM5: Jianfeng Gao; Kai-Fu Lee; Mingjing Li, "N-gram distribution based language model adaptation", 2000, MSRA.

Unsupervised Learning Group:

    UL1: Dmitry Davidov, Ari Rappoport and Moshe Koppel "Fully Unsupervised Discovery of Concept-Specific Relationships by Web Mining", ACL 2007

    UL2: Sharon Goldwater and Tom Griffiths "A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging", ACL 2007

    UL3: D. Downey, S. Schoenmackers and O. Etzioni, "Sparse Information Extraction: Unsupervised Language Models to the Rescue", ACL 2007

    UL4: W. Bridewell, et al. "Learning process models with missing data", ECML, 2007


Introduction:

    Week 1: Course introduction (what, why, how, grades, SOP for MD)

Modeling techniques, unsupervised methods, and exemplary discovery systems:

    Week 2: Probabilistic Graphical Model (FSA, LM, BN)

    Week 3: Hidden Markov Model, Viterbi Algorithm [Assignment 1 out]

    Week 4: Expectation-Maximization Algorithm (1)

    Oct 19: Expectation-Maximization Algorithm (2) [Assignment 1 due, Assignment 2 out]

    Oct 26: Unsupervised Labeling (Semantic Role Labeling, Word Sense Disambiguation, POS Tagging) + Decipherment

    Nov 2: Discovery in Multi-relational Networks + Explanation-based Discovery

    Nov 16: Clustering (by Prof. Chien-Yu Chen) [Assignment 2 due, homework (short essay) out]

    Nov 23: Social Network Analysis + Interestingness Measures [Project proposal due]

Paper and Project Presentations:

    Nov 30: Project Proposal Presentation

    Dec 7: Language Model Group

    Dec 14: Unsupervised Learning Group

    Dec 21: Link Discovery and Anomaly Detection Group

    Dec 28: Machine Discovery Principle Group

    Jan 4: Final Project Presentation

    Jan 11: Final Project Presentation [Final project report due, homework (short essay) due]