Experimental Results on Statlog Data Sets

This section reports the experiments conducted to evaluate the QuickRBF package. In particular, we are interested in how the mechanism proposed in this paper performs in comparison with the SVM and other well-known classifiers on benchmark data sets for data classification.

The experiments compare the QuickRBF classifier with the RVKDE classifier [Oyang et al., 2005], the APC-III based classifier [Hwang and Bang, 1997], LIBSVM [Chang and Lin, 2001], and KNN. The discussion focuses on two issues: classification accuracy and execution efficiency. For the classifiers used in the comparison, we adopted the parameter settings suggested by the authors in their original papers.

Table 1 lists the main characteristics of the three data sets used in the experiments. All three data sets, satimage, letter, and shuttle, are from the Statlog collection [Michie et al., 1994]. The hardware platform used in the experiments is a workstation with dual 1 GHz Pentium III CPUs and 2 GB of RAM, running FreeBSD UNIX release 4.10.

Table 2 lists the number of support vectors that the SVM selects for each of the three data sets. To compare directly with the SVM, we use these support vectors as the centers of the proposed classifier.
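The idea of reusing a fixed set of centers can be illustrated with a small sketch. It is not the QuickRBF implementation: it assumes Gaussian basis functions with one shared bandwidth `sigma`, binary labels of -1 and +1, and a ridge-regularized least-squares solve for the weights. All function names are hypothetical, and the two centers are hand-picked stand-ins for support vectors selected by an SVM.

```python
import math

def gaussian(x, c, sigma):
    """Gaussian basis function centred at c with bandwidth sigma (assumed form)."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def fit_weights(X, y, centers, sigma, ridge=1e-6):
    """Least-squares weights from (Phi^T Phi + ridge * I) w = Phi^T y."""
    Phi = [[gaussian(x, c, sigma) for c in centers] for x in X]
    m = len(centers)
    A = [[sum(row[i] * row[j] for row in Phi) + (ridge if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    b = [sum(row[i] * yi for row, yi in zip(Phi, y)) for i in range(m)]
    return solve(A, b)

def predict(x, centers, weights, sigma):
    """Classify x by the sign of the weighted sum of basis functions."""
    score = sum(w * gaussian(x, c, sigma) for w, c in zip(weights, centers))
    return 1 if score >= 0 else -1

# Toy data: two well-separated classes in the plane.
X = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
y = [-1, -1, -1, 1, 1, 1]
centers = [(0, 0), (3, 3)]  # stand-ins for SVM-selected support vectors
sigma = 1.0

weights = fit_weights(X, y, centers, sigma)
train_predictions = [predict(x, centers, weights, sigma) for x in X]
print(train_predictions)
```

With the centers fixed in advance, the only quantities left to estimate are the weights, which is why the construction cost reduces to one linear solve in this sketch.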

Table 3 compares the accuracy delivered by the alternative classification algorithms on the three benchmark data sets. As Table 3 shows, the proposed method delivers essentially the same level of accuracy as the SVM and RVKDE classifiers, while the KNN and APC-III based classifiers do not produce comparable generalization results.
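For readers unfamiliar with the KNN baseline, the following is a minimal sketch of a k-nearest-neighbor classifier and of an accuracy figure reported as a percentage, as in the 1NN and 3NN columns. It assumes Euclidean distance and a plain majority vote; the function names and the toy data are illustrative only and are not taken from the experiments.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=1):
    """Label x by majority vote among its k nearest training points."""
    neighbours = sorted(zip(train_X, train_y),
                        key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

def accuracy(train_X, train_y, test_X, test_y, k=1):
    """Percentage of test instances whose predicted label is correct."""
    hits = sum(knn_predict(train_X, train_y, x, k) == t
               for x, t in zip(test_X, test_y))
    return 100.0 * hits / len(test_y)

# Toy data: two clusters, plus one test point labelled against its neighbours.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]
test_X = [(0.2, 0.1), (5.5, 5.5), (1, 1), (4, 4), (2, 2)]
test_y = ["a", "b", "a", "b", "b"]

acc_1nn = accuracy(train_X, train_y, test_X, test_y, k=1)
acc_3nn = accuracy(train_X, train_y, test_X, test_y, k=3)
print(acc_1nn, acc_3nn)
```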

Table 4 compares the execution times of the RVKDE classifier, the SVM, the APC-III based classifier, and the proposed method on the Statlog data sets. The total time taken to construct a classifier from the given training data set is listed in the rows marked "Make Classifier". This is the cross-validation time for the RVKDE classifier, the model-selection time for the SVM, and the time for clustering plus calculating the bandwidths and weights for the APC-III based classifier. For the proposed classifier, the reported time is the time taken to calculate the weights. The time taken by each classifier to predict the classes of the testing instances is listed in the rows marked "Prediction Time".
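The separation of construction time from prediction time can be sketched as follows. This is an illustrative harness rather than the one used in the experiments: it measures wall-clock time with `time.perf_counter`, which is an assumption, and `make_classifier` and `predict_all` are hypothetical stand-ins (a nearest-mean model on a single feature) for the two phases of a real classifier.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Hypothetical stand-in for the construction phase: a nearest-mean model
# over a single numeric feature.
def make_classifier(train_x, train_y):
    sums, counts = {}, {}
    for x, label in zip(train_x, train_y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Hypothetical stand-in for the prediction phase.
def predict_all(model, test_x):
    return [min(model, key=lambda label: abs(model[label] - x))
            for x in test_x]

train_x = [0.1, 0.3, 0.2, 2.9, 3.1, 3.0]
train_y = ["low", "low", "low", "high", "high", "high"]
test_x = [0.0, 3.2, 1.0]

# Time each phase separately so the two figures can be reported side by side.
model, build_seconds = timed(make_classifier, train_x, train_y)
predictions, predict_seconds = timed(predict_all, model, test_x)

print(f"Make Classifier: {build_seconds:.6f} s")
print(f"Prediction Time: {predict_seconds:.6f} s")
```

Timing the phases separately matters because the classifiers differ in where they spend their effort: some are costly to construct but fast to apply, and others the reverse.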

The mechanism used in the QuickRBF package constructs a data classifier more efficiently than the SVM. The QuickRBF classifier is also at about the same level as, or more efficient than, the other classifiers. In the prediction phase, the QuickRBF classifier delivers execution efficiency comparable to that of LIBSVM and enjoys a 10-fold speedup over the RVKDE classifier.

Table 1: Statlog data sets used in the experiments

Data set    Number of training data    Number of testing data
satimage    4435                       2000
letter      15000                      5000
shuttle     43500                      14500

Table 2: The number of support vectors in Statlog data sets

Data set    Number of training data    Number of support vectors
satimage    4435                       1610
letter      15000                      8945
shuttle     43500                      286

Table 3: Comparison of classification accuracy with Statlog data sets

Data set    RVKDE    LIBSVM    1NN      3NN      APC-III    QuickRBF
satimage    92.30    91.30     88.80    90.65    90.25      92.35
letter      97.12    97.98     95.68    95.16    91.16      97.68
shuttle     99.94    99.92     99.94    99.91    97.34      99.43
Average     96.45    96.40     94.84    95.24    92.92      96.49

Table 4: Comparison of execution time in seconds

Phase              Data set    RVKDE     LIBSVM    APC-III    QuickRBF
Make Classifier    satimage    676       64644     136        314
Make Classifier    letter      2842      387096    712        59978
Make Classifier    shuttle     98540     467955    2595       114
Prediction Time    satimage    21.30     11.53     0.63       7.8
Prediction Time    letter      128.60    94.91     2.15       104.6
Prediction Time    shuttle     996.10    2.13      0.48       2.95
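
As a reading aid for Table 4, the construction times of LIBSVM and QuickRBF can be turned into ratios. The sketch below uses only the "Make Classifier" figures copied from the table; the `speedup` helper and the variable names are illustrative.

```python
# "Make Classifier" times in seconds, copied from Table 4.
build_time = {
    "satimage": {"LIBSVM": 64644, "QuickRBF": 314},
    "letter": {"LIBSVM": 387096, "QuickRBF": 59978},
    "shuttle": {"LIBSVM": 467955, "QuickRBF": 114},
}

def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds

ratios = {
    name: speedup(times["LIBSVM"], times["QuickRBF"])
    for name, times in build_time.items()
}

for name, ratio in ratios.items():
    print(f"{name}: LIBSVM / QuickRBF = {ratio:.1f}x")
```

The ratios come out at roughly 206 for satimage, 6.5 for letter, and 4105 for shuttle. The ratio is smallest for letter, which has by far the largest number of centers in Table 2 (8945 support vectors, against 286 for shuttle).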