Experimental Results on Statlog Data Sets

This section reports the experiments conducted to evaluate the QuickRBF package. In particular, we are interested in how the mechanism proposed in this paper performs in comparison with the SVM and other well-known classifiers on benchmark data sets for data classification.

The experiments compare the QuickRBF classifier with four other well-known classifiers: the RVKDE classifier [Oyang et al., 2005], the APC-III based classifier [Hwang and Bang, 1997], LIBSVM [Chang and Lin, 2001], and KNN. The discussion focuses on two issues: classification accuracy and execution efficiency. For the classifiers used in the comparison, we adopted the parameter settings suggested by the authors in their original papers.

Table 1 lists the main characteristics of the three data sets used in the experiments. All three data sets, satimage, letter, and shuttle, are from the Statlog collection [Michie et al., 1994]. The hardware platform used in the experiments is a workstation with dual 1 GHz Pentium III CPUs and 2 GB of RAM, running FreeBSD release 4.10.

Table 2 lists the number of support vectors that the SVM selects for each of the three data sets. To allow a direct comparison with the SVM, we use these support vectors as the centers of our classifier.
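The idea of reusing a fixed set of centers and then computing only the output weights can be illustrated with a small sketch. This is not the QuickRBF implementation: the Gaussian kernel with a single bandwidth `sigma`, the ridge term `lam`, and the Cholesky-based least-squares solve are all assumptions made for illustration, and the two hand-picked centers merely stand in for support vectors that would come from a trained SVM.

```python
import math

def gaussian(x, c, sigma):
    """Gaussian basis function centered at c (assumed kernel form)."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def cholesky(a):
    """Lower-triangular L with L L^T = a, for symmetric positive definite a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def cholesky_solve(l, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def fit_weights(xs, labels, centers, sigma, lam=1e-6):
    """Regularized least-squares weights, one weight vector per class."""
    phi = [[gaussian(x, c, sigma) for c in centers] for x in xs]
    m = len(centers)
    # Normal equations: (Phi^T Phi + lam I) w = Phi^T t, factored once.
    gram = [[sum(row[i] * row[j] for row in phi) + (lam if i == j else 0.0)
             for j in range(m)] for i in range(m)]
    factor = cholesky(gram)
    weights = {}
    for cls in sorted(set(labels)):
        target = [1.0 if y == cls else 0.0 for y in labels]
        rhs = [sum(row[i] * t for row, t in zip(phi, target)) for i in range(m)]
        weights[cls] = cholesky_solve(factor, rhs)
    return weights

def predict(x, centers, weights, sigma):
    """Return the class whose weighted sum of basis functions is largest."""
    phi = [gaussian(x, c, sigma) for c in centers]
    return max(weights, key=lambda cls: sum(w * p for w, p in zip(weights[cls], phi)))

# Toy data; the centers are placeholders for SVM-selected support vectors.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]
centers = [(0.0, 0.0), (5.0, 5.0)]
weights = fit_weights(train_x, train_y, centers, sigma=1.0)
predictions = [predict(x, centers, weights, 1.0) for x in [(0.5, 0.5), (5.5, 5.5)]]
print(predictions)
```

With the centers fixed in advance, the only training cost in this sketch is building and factoring an m-by-m system, where m is the number of centers, so the cost grows quickly with the number of support vectors listed in Table 2.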

Table 3 compares the accuracy delivered by the alternative classification algorithms on the three benchmark data sets. As Table 3 shows, the proposed method delivers essentially the same level of accuracy as the SVM and RVKDE classifiers, while the KNN and APC-III based classifiers do not produce comparable generalization results.
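As a quick consistency check, the Average row of Table 3 can be recomputed from the per-data-set accuracies. The sketch below assumes the Average row is the unweighted arithmetic mean over the three data sets; the dictionary layout and the helper name `mean` are illustrative choices.

```python
# Accuracy (%) per classifier, in the order satimage, letter, shuttle (Table 3).
accuracy = {
    "RVKDE":    [92.30, 97.12, 99.94],
    "LIBSVM":   [91.30, 97.98, 99.92],
    "1NN":      [88.80, 95.68, 99.94],
    "3NN":      [90.65, 95.16, 99.91],
    "APC-III":  [90.25, 91.16, 97.34],
    "QuickRBF": [92.35, 97.68, 99.43],
}

def mean(values):
    """Unweighted arithmetic mean over the data sets."""
    return sum(values) / len(values)

averages = {name: mean(vals) for name, vals in accuracy.items()}

# Print the classifiers from highest to lowest average accuracy.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} {avg:.2f}")
```

For the values as printed in Table 3, this recomputation agrees with the Average row to two decimals for every classifier except 1NN, where the three entries average to 94.81 rather than 94.84.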

Table 4 compares the execution times of the RVKDE classifier, the SVM, the APC-III based classifier, and the proposed method on the Statlog data sets. The total time taken to construct each classifier from the given training data set is listed in the rows marked "Make Classifier". For the RVKDE classifier, this is the time of cross validation; for the SVM, the time of model selection; for the APC-III based classifier, the time of the clustering process plus the calculation of bandwidths and weights; and for the proposed classifier, the time of calculating the weights. The time taken by each classifier to predict the classes of the testing instances is listed in the rows marked "Prediction Time".
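The two measured phases can be illustrated with a small timing harness. This is only a sketch of how construction time and prediction time might be recorded separately: the use of wall-clock time via `time.perf_counter`, the helper `timed`, and the toy 1NN classifier are assumptions for illustration, not the measurement procedure used for Table 4.

```python
import time

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def make_1nn(train_x, train_y):
    """Construction phase for 1NN: simply store the labeled training data."""
    return list(zip(train_x, train_y))

def predict_1nn(model, test_x):
    """Prediction phase: label each test point by its nearest stored point."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return [min(model, key=lambda xy: sq_dist(xy[0], x))[1] for x in test_x]

# Toy data standing in for a training set and a testing set.
train_x = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_y = ["a", "a", "b", "b"]
test_x = [(0.2, 0.1), (5.5, 4.9)]

model, make_seconds = timed(make_1nn, train_x, train_y)
labels, predict_seconds = timed(predict_1nn, model, test_x)
print(labels)
print(f"make classifier: {make_seconds:.6f} s, prediction: {predict_seconds:.6f} s")
```

Keeping the two timers separate matters because the classifiers differ in where they spend their time: a method with costly model selection can still predict quickly, and the reverse can also hold.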

The mechanism used in the QuickRBF package is more efficient than the SVM classifier for constructing a data classifier. It is also generally at the same level as, or even more efficient than, the other classifiers. In the prediction phase, the QuickRBF classifier delivers execution efficiency comparable to that of LIBSVM and, when the prediction times in Table 4 are summed over the three data sets, enjoys a speedup of about 10 times over the RVKDE classifier.
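The efficiency comparison can be recomputed directly from the timings in Table 4. The sketch below derives per-data-set ratios and a ratio of summed times; the helper names `speedups` and `total_speedup` and the choice of summing over data sets are illustrative assumptions.

```python
# Times in seconds from Table 4, in the order satimage, letter, shuttle.
make_classifier = {
    "RVKDE":    [676, 2842, 98540],
    "LIBSVM":   [64644, 387096, 467955],
    "APC-III":  [136, 712, 2595],
    "QuickRBF": [314, 59978, 114],
}
prediction = {
    "RVKDE":    [21.30, 128.60, 996.10],
    "LIBSVM":   [11.53, 94.91, 2.13],
    "APC-III":  [0.63, 2.15, 0.48],
    "QuickRBF": [7.8, 104.6, 2.95],
}
DATA_SETS = ["satimage", "letter", "shuttle"]

def speedups(times, baseline, target="QuickRBF"):
    """Per-data-set ratio baseline time / target time (above 1 favors target)."""
    return [b / t for b, t in zip(times[baseline], times[target])]

def total_speedup(times, baseline, target="QuickRBF"):
    """Ratio of the times summed over all data sets."""
    return sum(times[baseline]) / sum(times[target])

for name, ratio in zip(DATA_SETS, speedups(make_classifier, "LIBSVM")):
    print(f"make classifier, LIBSVM / QuickRBF on {name}: {ratio:.1f}x")
print(f"prediction, RVKDE / QuickRBF (summed): {total_speedup(prediction, 'RVKDE'):.2f}x")
```

On the summed prediction times, the ratio of RVKDE to QuickRBF comes out at about 9.9, while the per-data-set ratios vary widely, so the summed and per-data-set views are worth reading together.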

Table 1: Statlog data sets used in the experiments

Data set   Number of training data   Number of testing data
satimage                      4435                     2000
letter                       15000                     5000
shuttle                      43500                    14500

Table 2: The number of support vectors in Statlog data sets

Data set   Number of training data   Number of support vectors
satimage                      4435                        1610
letter                       15000                        8945
shuttle                      43500                         286

Table 3: Comparison of classification accuracy with Statlog data sets

Data set     RVKDE    LIBSVM       1NN       3NN   APC-III  QuickRBF
satimage     92.30     91.30     88.80     90.65     90.25     92.35
letter       97.12     97.98     95.68     95.16     91.16     97.68
shuttle      99.94     99.92     99.94     99.91     97.34     99.43
Average      96.45     96.40     94.84     95.24     92.92     96.49

Table 4: Comparison of execution time in seconds

Phase            Data set     RVKDE    LIBSVM   APC-III  QuickRBF
Make Classifier  satimage       676     64644       136       314
                 letter        2842    387096       712     59978
                 shuttle      98540    467955      2595       114
Prediction Time  satimage     21.30     11.53      0.63       7.8
                 letter      128.60     94.91      2.15     104.6
                 shuttle     996.10      2.13      0.48      2.95