Homework 12

We have mentioned that data scaling is quite important, so we would like to check whether this is really true. Consider the small data sets (<= 1500 points) here. If a data set has both original and scaled versions, their names look like "xxx" and "xxx_scale". For each data set, split it evenly into training and testing halves, use kNN with parameter selection on k, and then report the testing accuracy. Discuss the difference between the original and scaled versions.
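As a rough illustration of "kNN with parameter selection on k", here is a minimal sketch in plain Python. It is not a required implementation. The function names (knn_predict, accuracy, select_k), the candidate values of k, the use of Euclidean distance, and the choice of 5-fold cross-validation on the training half are all assumptions made for this example; any reasonable selection procedure that uses only the training half would do.

```python
from collections import Counter


def knn_predict(train_x, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Squared Euclidean distance is enough for ranking neighbors.
    dists = sorted(
        ((sum((a - b) ** 2 for a, b in zip(p, x)), label)
         for p, label in zip(train_x, train_y)),
        key=lambda t: t[0],
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(
        knn_predict(train_x, train_y, x, k) == y
        for x, y in zip(test_x, test_y)
    )
    return correct / len(test_y)


def select_k(train_x, train_y, candidates=(1, 3, 5, 7, 9), folds=5):
    """Pick k by cross-validation on the training half only (assumed procedure)."""
    n = len(train_y)
    best_k, best_acc = None, -1.0
    for k in candidates:
        correct = 0
        for f in range(folds):
            # Interleaved folds: every folds-th point is held out for validation.
            val_idx = set(range(f, n, folds))
            tr_x = [train_x[i] for i in range(n) if i not in val_idx]
            tr_y = [train_y[i] for i in range(n) if i not in val_idx]
            for i in val_idx:
                correct += knn_predict(tr_x, tr_y, train_x[i], k) == train_y[i]
        acc = correct / n
        # Strict comparison keeps the smallest k among ties.
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc
```

After select_k returns the chosen k, call accuracy once on the testing half with that k and report the result. The testing half should not be used while choosing k.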

When splitting the data into training and testing sets, make sure that the same instances are assigned to training and to testing in both the original and the scaled versions.
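One way to keep the two versions aligned is to draw the split once, as a list of row indices, and apply it to both files. The sketch below assumes that "xxx" and "xxx_scale" list the same instances in the same order, which you should verify (for example by comparing the label columns). The helper names split_indices and take are made up for this example.

```python
import random


def split_indices(n, seed=0):
    """Shuffle 0..n-1 with a fixed seed and cut the result into two halves."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2  # if n is odd, the testing half gets the extra point
    return idx[:half], idx[half:]


def take(rows, idx):
    """Select the rows at the given positions, in the given order."""
    return [rows[i] for i in idx]
```

With rows_orig and rows_scale loaded from the two files, compute train_idx, test_idx = split_indices(len(rows_orig)) once, then use take(rows_orig, train_idx) and take(rows_scale, train_idx) for training, and likewise with test_idx for testing. Fixing the seed also makes the reported accuracies reproducible.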


Last modified: Fri Feb 13 15:35:28 CST 2004