Homework 3

In homework 2 you may have seen that the performance after selecting 200 features is still not good. Recall that we ran libsvm with the following parameters:
 svm-train -c 32 -g 0.0001220703125 thrombin

We suspect that we did not select good parameters. We would like to try the following two things:

For consistency, please use the following 200-feature training and testing files prepared by Yien (yien@csie): training and testing.

Running so many combinations may take a few hours, so you should start this homework as early as possible. Write a short report (<= 2 pages) in English about what you find.
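As a rough sketch of how such a batch of runs could be prepared, the Python snippet below generates one svm-train command line per parameter pair. It assumes that the "combinations" are pairs of -c and -g values taken on a power-of-two grid, which this page does not state. The function name grid_commands, the 5-fold setting, and the grid ranges are hypothetical choices for illustration. The -v option is libsvm's n-fold cross-validation mode.

```python
from itertools import product


def grid_commands(log2c_values, log2g_values, train_file="thrombin", folds=5):
    """Build one svm-train command line per (C, gamma) pair.

    Assumption: parameters are searched on a power-of-two grid. The grid
    itself and the number of folds are illustrative, not from the assignment.
    """
    commands = []
    for log2c, log2g in product(log2c_values, log2g_values):
        c = 2.0 ** log2c
        g = 2.0 ** log2g
        # repr() keeps full float precision; svm-train parses it as a number.
        commands.append(f"svm-train -v {folds} -c {c!r} -g {g!r} {train_file}")
    return commands


if __name__ == "__main__":
    # Hypothetical grid; it includes the homework 2 setting (2^5, 2^-13).
    for cmd in grid_commands(range(-1, 8, 2), range(-15, -8, 2)):
        print(cmd)
```

The parameters used in homework 2 correspond to the grid point C = 2^5 = 32 and gamma = 2^-13 = 0.0001220703125.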

Note that the error rate is counted in a different way. According to the KDD cup homepage, "if there are 10 actives and 100 inactives in the test set, then each active will effectively count 10 times as much as each inactive."
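One possible reading of this weighting, sketched in Python: each active gets weight (number of inactives) / (number of actives), each inactive gets weight 1, and the error rate is the misclassified weight divided by the total weight. The normalization by total weight, the function name weighted_error_rate, and the label convention (+1 for active, -1 for inactive) are assumptions made for illustration. They are not taken from the KDD cup page.

```python
def weighted_error_rate(y_true, y_pred, active_label=1):
    """Class-weighted error rate.

    Assumption: each active example is weighted by n_inactive / n_active,
    each inactive example by 1, and the result is normalized by total weight.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    n_active = sum(1 for y in y_true if y == active_label)
    n_inactive = len(y_true) - n_active
    if n_active == 0 or n_inactive == 0:
        raise ValueError("need at least one example of each class")

    w_active = n_inactive / n_active
    wrong = 0.0
    total = 0.0
    for t, p in zip(y_true, y_pred):
        w = w_active if t == active_label else 1.0
        total += w
        if t != p:
            wrong += w
    return wrong / total


if __name__ == "__main__":
    # 10 actives and 100 inactives, as in the quoted example.
    y_true = [1] * 10 + [-1] * 100
    y_pred = [-1] * 110  # predict "inactive" for everything
    print(weighted_error_rate(y_true, y_pred))
```

Under this weighting, a classifier that labels every example inactive has a 50% error rate on the quoted test set, although its unweighted error rate is only 10/110.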

To calculate cross-validation accuracy using different criteria, you can modify the program svm-train.c, in particular lines 144 to 157.


Last modified: Mon Oct 29 19:11:36 CST 2001