Web Retrieval and Mining

Fall 2008

Schedule

  Date   Topic   Reading
 2008/09/19  Course Policies (pdf),
 Overview of Web Retrieval
 and Mining (pdf)
Bush 45, Mining the Web (ch1)
 2008/09/26  Introduction to IR (pdf) Performance Evaluation
TREC Measures
 2008/10/03  IR System (pdf) Components of IR Systems (Ch1.2, 1.4)
Text Processing (Ch7.2)
Assignment 1 (CMU SLM)
 2008/10/10  Break  
 2008/10/17  Vector Space Model (pdf)
 Probabilistic Model (pdf)
Vector Space Model
Probabilistic Model
Relevance Feedback
 2008/10/24  Language Model (pdf) Language Model
Smoothing (Sections 2, 3)
LM with KL-divergence (Sections 1, 2, 3)
 2008/10/31  Learning to Rank (pdf) RankNet (Sections 1,3,4; skip proofs)
Benchmark LETOR
Assignment 2
(Model Overview)    
 2008/11/07  Link Analysis (pdf) Link Analysis
Math Review (A.2.2, A.4, A.5.1)
Google Technology
 2008/11/14  Midterm Exam  
 2008/11/21  Mixture Language Model &
 EM Algorithm (pdf)
 
EM Algorithm for Language Model
Assignment 3 (Austin's note on PageRank)
 2008/11/28  Intro. to Multimedia IR (pdf) Paper lists
 2008/12/05  Intro. to Multimedia IR (pdf) Text Classification
One-page proposal for term project
(graduate students, team work)
 2008/12/12  Text Classification (pdf) Assignment 4
(undergraduate students, team work)
 2008/12/19  Text Classification (pdf)  
 2008/12/26  Paper Presentation (Schedule) Only graduate students
 2009/01/02  Break
 2009/01/09  Project Presentation Four-page project report draft
(graduate students)
 2009/01/16  Clustering for IR (pdf) Project Report/Assignment 4 Due
