
Hsuan-Tien Lin


Machine Learning, Fall 2023

Course Description

Machine learning allows computational systems to adaptively improve their performance with experience accumulated from the data observed. This course introduces the basics of learning theories, the design and analysis of learning algorithms, and some applications of machine learning.

People

instructor: Hsuan-Tien LIN (htlin AT csie . ntu . edu . tw) [office hour: after classes, or by appointment]
TAs and TA hour: html_ta AT csie . ntu . edu . tw
- Chia-Wei CHANG (Undergraduate in CSIE Department)
- Yu-Cheng CHENG (M.S. Student in CSIE Department)
- Shuo-Chen HO (Undergraduate in CSIE Department)
- Cai-Yi HU (M.S. Student in CSIE Department)
- Yu-Shiang HUANG (Ph.D. Student in Data Science Program)
- Ren-Wei (Willy) LIANG (Undergraduate in CSIE Department)
- Jeng-Yue (Buffett) LIU (Undergraduate in Geography Department)
- Poy LU (Ph.D. Student in Graduate Institute of Networking and Multimedia)
- Odo To (Undergraduate in CSIE Department)
- Cheng-Chi (Casper) WANG (Undergraduate in CSIE Department)
- Hsuan-Fu WANG (M.S. Student in Graduate Institute of Networking and Multimedia)

Course Information

Time: Wednesdays 9:10 to 12:10
Room: CSIE R104 (with broadcast to R102)
NTU COOL: https://cool.ntu.edu.tw/courses/32477
Slido: #HTML2023FALL
Textbook: Learning from Data, by Yaser Abu-Mostafa, Malik Magdon-Ismail and Hsuan-Tien Lin
Language: taught in Mandarin
Grading: 70% homework, 30% project (tentative)

Announcements

2023/11/29: homework 6 announced here, due on 2023/12/20
2023/11/15: homework 5 announced here, due on 2023/12/06
2023/11/03: bonus track of final project announced here, report due on 2023/12/27
2023/11/01: homework 4 announced here, due on 2023/11/15
2023/10/25: regular track of final project announced here, report due on 2023/12/27
2023/10/12: homework 3 announced here, due on 2023/11/01
2023/09/27: homework 2 announced here, due on 2023/10/18
2023/09/13: homework 1 announced here, due on 2023/09/27
2023/09/09: homework 0 announced here, due on 2023/09/27
2023/08/31: signup form announced here
2023/08/31: course policy announced here

Class Policy

policy

Course Plan (tentative)

date	syllabus	todo/done	materials
09/06 (W1)	course introduction; topic 1: when can machines learn? the learning problem		course introduction course slides; the learning problem course slides and extended slides; LFD 1.0, 1.1.1, 1.2.4
09/13 (W2)	learning to answer yes/no; types of learning	homework 1 announced	Lecture 2 extended slides Lecture 3 extended slides required watching (before class): course slides; LFD 1.1.2, 3.1 Learning to Answer Yes/No :: Perceptron Hypothesis Set Learning to Answer Yes/No :: Perceptron Learning Algorithm Learning to Answer Yes/No :: Guarantee of PLA Learning to Answer Yes/No :: Non-Separable Data required watching (before class): course slides; LFD 1.2; LFD 1.3 Types of Learning :: Learning with Different Output Space Types of Learning :: Learning with Different Data Label Types of Learning :: Learning with Different Protocol Types of Learning :: Learning with Different Input Space
09/20 (W3)	feasibility of learning; topic 2: why can machines learn? training versus testing		Lecture 4 extended slides no Lecture 5 extended slides---it's heavy enough ;-) required watching (before class): course slides; LFD 1.3 Feasibility of Learning :: Learning is Impossible? Feasibility of Learning :: Probability to the Rescue Feasibility of Learning :: Connection to Learning Feasibility of Learning :: Connection to Real Learning required watching (before class): course slides; LFD 2.0, 2.1.1 Training versus Testing :: Recap and Preview Training versus Testing :: Effective Number of Lines Training versus Testing :: Effective Number of Hypotheses Training versus Testing :: Break Point suggested extended reading: The Lack of A Priori Distinctions Between Learning Algorithms (Wolpert)
09/27 (W4)	the VC dimension	homework 1 due; homework 2 announced	suggested watching (anytime): course slides; LFD 2.0, 2.1.1 Theory of Generalization :: Restriction of Break Point Theory of Generalization :: Bounding Function: Basic Cases Theory of Generalization :: Bounding Function: Inductive Cases Theory of Generalization :: A Pictorial Proof required watching (before class): course slides; LFD 2.2 The VC Dimension :: Definition of VC Dimension The VC Dimension :: VC Dimension of Perceptrons The VC Dimension :: Physical Intuition of VC Dimension The VC Dimension :: Interpreting VC Dimension
10/04 (W5)	noise and error; topic 3: how can machines learn? linear regression; logistic regression		Lecture 9 extended slides required watching (before class): course slides; LFD 1.4 Noise and Error :: Noise and Probabilistic Target Noise and Error :: Error Measure Noise and Error :: Algorithmic Error Measure Noise and Error :: Weighted Classification required watching (before class): course slides; LFD 3.2 Linear Regression :: Linear Regression Problem Linear Regression :: Linear Regression Algorithm Linear Regression :: Generalization Issue Linear Regression :: for Binary Classification required watching (before class): course slides; LFD 3.3 Logistic Regression :: Logistic Regression Problem
10/11 (W6)	linear models for classification; nonlinear transformation	homework 3 announced	required watching (before class): course slides; LFD 3.3 Logistic Regression :: Logistic Regression Error Logistic Regression :: Gradient of Logistic Regression Error Logistic Regression :: Gradient Descent required watching (before class): course slides; LFD 3.3 (for SGD part only) Linear Models for Classification :: Binary Classification Linear Models for Classification :: Stochastic Gradient Descent Linear Models for Classification :: Multiclass via Logistic Linear Models for Classification :: Multiclass via Binary required watching (before class): course slides; LFD 3.4 Nonlinear Transformation :: Quadratic Hypotheses Nonlinear Transformation :: Nonlinear Transform
10/18 (W7)	nonlinear transformation; topic 4: how can machines learn better? hazard of overfitting; regularization	homework 2 due	required watching (before class): course slides; LFD 3.4 Nonlinear Transformation :: Price of Nonlinear Transform Nonlinear Transformation :: Structured Hypothesis Sets required watching (before class): course slides; LFD 4.0, 4.1 Hazard of Overfitting :: What is Overfitting? Hazard of Overfitting :: The Role of Noise and Data Size Hazard of Overfitting :: Deterministic Noise Hazard of Overfitting :: Dealing with Overfitting required watching (before class): course slides; LFD 4.2 Regularization :: Regularized Hypothesis Set Regularization :: Weight Decay Regularization Regularization :: Regularization and VC Theory Regularization :: General Regularizers
10/25 (W8)	validation; three learning principles	final project announced	required watching (before class): course slides; LFD 4.3 Validation :: Model Selection Problem Validation :: Validation Validation :: Leave-One-Out Cross Validation Validation :: V-Fold Cross Validation required watching (before class): course slides; LFD 5 Three Learning Principles :: Occam's Razor Three Learning Principles :: Sampling Bias Three Learning Principles :: Data Snooping Three Learning Principles :: Power of Three
11/01 (W9)	topic 5: how can machines learn by embedding numerous features? linear support vector machine; dual support vector machine; kernel support vector machine	homework 3 due; homework 4 announced	required watching (before class): course slides; LFD e-8.1 Linear SVM :: Large-Margin Separating Hyperplane Linear SVM :: Standard Large-Margin Problem Linear SVM :: Support Vector Machine Linear SVM :: Reasons behind Large-Margin Hyperplane required watching (before class): course slides; LFD e-8.2 Dual Support Vector Machine :: Motivation of Dual SVM Dual Support Vector Machine :: Lagrange Dual SVM Dual Support Vector Machine :: Solving Dual SVM Dual Support Vector Machine :: Messages behind Dual SVM required watching (before class): course slides; LFD e-8.3 Kernel Support Vector Machine :: Kernel Trick Kernel Support Vector Machine :: Polynomial Kernel
11/08 (W10)	guest lecture from Professor Edward Y. Chang; kernel support vector machine		talk in class: 挖掘大型語言模型中的知識寶庫 (Mining the Treasure Trove of Knowledge in Large Language Models) by Professor Edward Y. Chang required watching (before class): course slides; LFD e-8.3 Kernel Support Vector Machine :: Gaussian Kernel Kernel Support Vector Machine :: Comparison of Kernels
11/15 (W11)	no class as instructor needs to attend ACML 2023	homework 4 due; homework 5 announced	suggested watching: course slides Kernel Logistic Regression :: Soft-Margin SVM as Regularized Model Kernel Logistic Regression :: SVM versus Logistic Regression Kernel Logistic Regression :: SVM for Soft Binary Kernel Logistic Regression :: Kernel Logistic Regression suggested watching: course slides Support Vector Regression :: Kernel Ridge Regression Support Vector Regression :: Support Vector Regression Primal Support Vector Regression :: Support Vector Regression Dual Support Vector Regression :: Summary of Kernel Models suggested extended reading: Kernel Logistic Regression and the Import Vector Machine (Zhu and Hastie) A Note on Platt's Probabilistic Outputs for Support Vector Machines (Lin, Weng and Lin)
11/22 (W12)	soft-margin support vector machine; topic 6: how can machines learn by combining predictive features? blending and bagging; adaptive boosting; decision tree		required watching (before class): course slides; LFD e-8.4 Soft-Margin Support Vector Machine :: Motivation and Primal Soft-Margin Support Vector Machine :: Dual Problem Soft-Margin Support Vector Machine :: Messages Soft-Margin Support Vector Machine :: Model Selection required watching (before class): course slides Blending and Bagging :: Motivation of Aggregation Blending and Bagging :: Uniform Blending Blending and Bagging :: Linear and Any Blending Blending and Bagging :: Bagging (Bootstrap Aggregation) required watching (before class): course slides Adaptive Boosting :: Motivation of Boosting Adaptive Boosting :: Diversity by Re-weighting Adaptive Boosting :: Adaptive Boosting Algorithm Adaptive Boosting :: Adaptive Boosting in Action suggested extended reading: A linear ensemble of individual and blended models for music rating prediction (Chen et al.) Bagging predictors (Breiman) A short introduction to boosting (Freund and Schapire) Classification and regression trees (overview of decision tree by Loh) Classification and regression trees (book of CART by Breiman et al.)
11/29 (W13)	decision tree; random forest; gradient boosted decision tree	homework 6 announced	required watching (before class): course slides Decision Tree :: Decision Tree Hypothesis Decision Tree :: Decision Tree Algorithm Decision Tree :: Decision Tree Heuristics in CART Decision Tree :: Decision Tree in Action required watching (before class): course slides Random Forest :: Random Forest Algorithm Random Forest :: Out-of-bag Estimate Random Forest :: Feature Selection Random Forest :: Random Forest in Action required watching (before class): course slides Gradient Boosted Decision Tree :: Adaptive Boosted Decision Tree Gradient Boosted Decision Tree :: Optimization of AdaBoost Gradient Boosted Decision Tree :: Gradient Boosting Gradient Boosted Decision Tree :: Summary of Aggregation suggested extended reading: Random forest (Breiman) Greedy Function Approximation: A Gradient Boosting Machine (Friedman) extended reading: Matrix Factorization Techniques for Recommender Systems (Koren, Bell and Volinsky)
12/06 (W14)	topic 7: how can machines learn by distilling hidden features? neural network; (preliminary) deep learning	homework 5 due	required watching (before class): course slides; LFD e-7.1, e-7.2, e-7.3, e-7.4 (selected parts) Neural Network :: Motivation Neural Network :: Neural Network Hypothesis Neural Network :: Neural Network Learning Neural Network :: Optimization and Regularization required watching (before class): course slides; LFD e-7.6 Deep Learning :: Deep Neural Network Deep Learning :: Autoencoder Deep Learning :: Denoising Autoencoder Deep Learning :: Principal Component Analysis
12/13 (W15)	radial basis function network; matrix factorization; no class as instructor needs to attend NeurIPS 2023		suggested watching: course slides Radial Basis Function Network :: RBF Network Hypothesis Radial Basis Function Network :: RBF Network Learning Radial Basis Function Network :: k-Means Algorithm Radial Basis Function Network :: k-Means and RBFNet in Action suggested watching: course slides Matrix Factorization :: Linear Network Hypothesis Matrix Factorization :: Basic Matrix Factorization Matrix Factorization :: Stochastic Gradient Descent Matrix Factorization :: Summary of Extraction Models
12/20 (W16)	modern deep learning; machine learning for modern artificial intelligence	homework 6 due	ReLU: Deep sparse rectifier neural networks (Glorot, Bordes and Bengio) leaky ReLU: Rectifier Nonlinearities Improve Neural Network Acoustic Models (Maas, Hannun and Ng) parametric ReLU: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (He, Zhang, Ren and Sun) Dive into Deep Learning Section 4.1.2 Glorot initialization: Understanding the difficulty of training deep feedforward neural networks (Glorot and Bengio) He initialization: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (He et al.) backprop and momentum: Learning representations by back-propagating errors (Rumelhart, Hinton, and Williams); On the Momentum Term in Gradient Descent Learning Algorithms (Qian) adam: Adam: A Method for Stochastic Optimization (Kingma and Ba) Dive into Deep Learning Sections 4.8, 11.6, 11.8, 11.10 Dropout: A Simple Way to Prevent Neural Networks from Overfitting (Srivastava, Hinton, Krizhevsky, Sutskever and Salakhutdinov) Dive into Deep Learning Section 4.6 Machine Learning Soundings Machine Learning for Modern Artificial Intelligence
12/27 (W17)	no class and winter vacation started (really?)	final project due

Last updated at CST 03:31, January 06, 2024
Please feel free to contact me.