Machine learning allows computational systems to adaptively improve their performance with experience accumulated from observed data. This course introduces the basics of learning theory, the design and analysis of learning algorithms, and some applications of machine learning.
date | syllabus | todo/done | materials |
09/06 (W1) |
course introduction;
topic 1: when can machines learn?
the learning problem
|
|
|
09/13 (W2) |
learning to answer yes/no;
types of learning
|
homework 1 announced |
required watching (before class):
required watching (before class):
|
09/20 (W3) |
feasibility of learning;
topic 2: why can machines learn?
training versus testing
|
|
required watching (before class):
required watching (before class):
suggested extended reading:
|
09/27 (W4) |
the VC dimension
|
homework 1 due; homework 2 announced |
suggested watching (anytime):
required watching (before class):
|
10/04 (W5) |
noise and error;
topic 3: how can machines learn?
linear regression;
logistic regression
|
|
required watching (before class):
required watching (before class):
required watching (before class):
|
10/11 (W6) |
linear models for classification;
nonlinear transformation
|
homework 3 announced |
required watching (before class):
required watching (before class):
required watching (before class):
|
10/18 (W7) |
nonlinear transformation;
topic 4: how can machines learn better?
hazard of overfitting;
regularization
|
homework 2 due |
required watching (before class):
required watching (before class):
required watching (before class):
|
10/25 (W8) |
validation;
three learning principles
|
final project announced |
required watching (before class):
required watching (before class):
|
11/01 (W9) |
topic 5: how can machines learn by embedding numerous features?
linear support vector machine;
dual support vector machine;
kernel support vector machine
|
homework 3 due; homework 4 announced |
required watching (before class):
required watching (before class):
required watching (before class):
|
11/08 (W10) |
guest lecture from Professor Edward Y. Chang;
kernel support vector machine
|
|
talk in class:
required watching (before class):
|
11/15 (W11) |
no class as instructor needs to attend ACML 2023
|
homework 4 due; homework 5 announced |
suggested watching:
suggested watching:
suggested extended reading:
|
11/22 (W12) |
soft-margin support vector machine;
topic 6: how can machines learn by combining predictive features?
blending and bagging;
adaptive boosting;
decision tree
|
|
required watching (before class):
required watching (before class):
required watching (before class):
suggested extended reading:
|
11/29 (W13) |
decision tree;
random forest;
gradient boosted decision tree
|
homework 6 announced
|
required watching (before class):
required watching (before class):
required watching (before class):
suggested extended reading:
extended reading:
|
12/06 (W14) |
topic 7: how can machines learn by distilling hidden features?
neural network;
(preliminary) deep learning
|
homework 5 due
|
required watching (before class):
required watching (before class):
|
12/13 (W15) |
radial basis function network;
matrix factorization;
no class as instructor needs to attend NeurIPS 2023 |
|
suggested watching:
suggested watching:
|
12/20 (W16) |
modern deep learning;
machine learning for modern artificial intelligence
|
homework 6 due |
- ReLU: Deep sparse rectifier neural networks (Glorot, Bordes and Bengio)
- leaky ReLU: Rectifier Nonlinearities Improve Neural Network Acoustic Models (Maas, Hannun and Ng)
- parametric ReLU: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (He, Zhang, Ren and Sun)
- Dive into Deep Learning Section 4.1.2
- Glorot initialization: Understanding the difficulty of training deep feedforward neural networks (Glorot and Bengio)
- He initialization: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (He et al.)
- backprop and momentum:
Learning representations by back-propagating errors (Rumelhart, Hinton, and Williams);
On the Momentum Term in Gradient Descent Learning Algorithms (Qian)
- adam: Adam: A Method for Stochastic Optimization (Kingma and Ba)
- Dive into Deep Learning Sections 4.8, 11.6, 11.8, 11.10
- Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Srivastava, Hinton, Krizhevsky, Sutskever and Salakhutdinov)
- Dive into Deep Learning Section 4.6
- Machine Learning Soundings
- Machine Learning for Modern Artificial Intelligence
|
12/27 (W17) |
no class and winter vacation started (really?) |
final project due |
|