Binary-class Cross Validation with Different Criteria

Introduction

For some unbalanced data sets, accuracy may not be a good criterion for evaluating a model. This tool enables LIBSVM to conduct cross-validation and prediction with respect to different criteria (e.g., F-score, AUC).

For LIBLINEAR, please see this page.

What can this tool do?

  1. Cross-validation with different criteria (F-score, AUC, or BAC)
  2. Using different evaluations in prediction (precision, recall, F-score, AUC, or BAC)
    Please note that precision or recall alone may not be a good criterion for cross validation: for example, predicting every instance as positive gives 100% recall, and predicting only the single most confident instance as positive can give 100% precision.
  3. Parameter search with grid.py
  4. A simple framework for designing new evaluation functions
  5. MATLAB support
The evaluation functions included in this tool are:

precision
	Precision = true_positive / (true_positive + false_positive)
recall
	Recall = true_positive / (true_positive + false_negative)
fscore
	F-score = 2 * Precision * Recall / (Precision + Recall)
bac
	BAC (Balanced ACcuracy) = (Sensitivity + Specificity) / 2,
	where Sensitivity = true_positive / (true_positive + false_negative)
	and   Specificity = true_negative / (true_negative + false_positive)
auc
	AUC (Area Under Curve) is the area under the ROC curve.
ap
	AP (Average Precision) approximates the area under the Precision-Recall curve.
	AP is commonly used for highly unbalanced data, such as in information retrieval.
	The implementation assumes that the majority of the data is labeled as negative.
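
All of the threshold-based measures above (precision, recall, F-score, BAC) can be computed from the four confusion-matrix counts, and AUC can be defined through pairwise comparisons of decision values. The following self-contained C++ sketch is illustrative only (it is not part of eval.cpp); it uses the same convention as this tool, namely that a decision value >= 0 is predicted as +1:

	#include <cstdio>
	#include <cstddef>

	// Illustrative only: compute the confusion-matrix counts and the
	// metrics defined above (division-by-zero guards omitted).
	void print_confusion_metrics(size_t size, const double *dec_values, const int *ty)
	{
		int tp = 0, fp = 0, tn = 0, fn = 0;
		for(size_t i = 0; i < size; ++i)
		{
			int pred = (dec_values[i] >= 0) ? 1 : -1;
			if(ty[i] == 1) { if(pred == 1) ++tp; else ++fn; }
			else           { if(pred == 1) ++fp; else ++tn; }
		}
		double precision   = tp / (double)(tp + fp);
		double recall      = tp / (double)(tp + fn);   // = sensitivity
		double specificity = tn / (double)(tn + fp);
		double fscore      = 2 * precision * recall / (precision + recall);
		double bac         = (recall + specificity) / 2;
		printf("Precision = %g%%, Recall = %g%%, F-score = %g%%, BAC = %g%%\n",
		       100.0*precision, 100.0*recall, 100.0*fscore, 100.0*bac);
	}

	// AUC equals the fraction of (positive, negative) pairs in which the
	// positive instance gets the larger decision value (ties count 1/2).
	// O(n^2) for clarity; a practical implementation sorts instead.
	double auc_by_pairs(size_t size, const double *dec_values, const int *ty)
	{
		double good = 0;
		size_t npos = 0, nneg = 0;
		for(size_t i = 0; i < size; ++i)
			if(ty[i] == 1) ++npos; else ++nneg;
		for(size_t i = 0; i < size; ++i)
			if(ty[i] == 1)
				for(size_t j = 0; j < size; ++j)
					if(ty[j] == -1)
					{
						if(dec_values[i] > dec_values[j])       good += 1;
						else if(dec_values[i] == dec_values[j]) good += 0.5;
					}
		return good / ((double)npos * (double)nneg);
	}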

Note: This tool is designed only for binary-class C-SVM with labels {1,-1}. Multi-class, regression and probability estimation are not supported.
Note: When using accuracy as the evaluation criterion, the cross-validation accuracy may differ from that of standard LIBSVM. The reason is that LIBSVM internally groups data of the same class when forming folds, while this tool does not.


How to Run this Tool

  1. Download eval.cpp, eval.h, and Makefile, and put them in the LIBSVM directory (overwriting the old Makefile).
  2. Add
    	#include "eval.h"
    to svm-train.c and svm-predict.c.
  3. Replace
            if(cross_validation)
            {
                    do_cross_validation();
            }
    in the main() of svm-train.c with
            if(cross_validation)
            {
                    double cv =  binary_class_cross_validation(&prob, &param, nr_fold);
                    printf("Cross Validation = %g%%\n",100.0*cv);
            }
    Note that the percentage mark is necessary so that grid.py can parse the output. There is no need to change other places where do_cross_validation() appears. (A sketch of what binary_class_cross_validation does internally is given after these steps.)
  4. Replace
    	predict(input,output);
    in main() of svm-predict.c with
    	binary_class_predict(input, output, model);
  5. Assign the global variable
    	double (*evaluation_function)(const size_t, const double *, const int *) = auc;
    in eval.cpp to the evaluation function you prefer. You can also assign precision, recall, fscore, or bac here.
  6. Recompile LIBSVM with the new Makefile.
    	make clean; make
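
For reference, here is a rough, self-contained sketch of what a routine like binary_class_cross_validation has to do: train on the other nr_fold-1 folds, collect decision values on the held-out fold, and apply the chosen evaluation function once to the pooled values. It is illustrative only; the actual code in eval.cpp differs in details (for example, how the folds are formed, and making the sign of each decision value correspond to label +1):

	#include "svm.h"

	// The global defined in eval.cpp (see step 5).
	extern double (*evaluation_function)(const size_t, const double *, const int *);

	// Illustrative sketch, not the actual implementation.
	double binary_cv_sketch(const svm_problem *prob, const svm_parameter *param, int nr_fold)
	{
		int l = prob->l;
		double *dec = new double[l];
		int *ty = new int[l];

		for(int fold = 0; fold < nr_fold; ++fold)
		{
			int begin = fold*l/nr_fold, end = (fold+1)*l/nr_fold;

			// train on everything outside [begin, end)
			svm_problem sub;
			sub.l = l - (end - begin);
			sub.x = new svm_node*[sub.l];
			sub.y = new double[sub.l];
			int k = 0;
			for(int i = 0; i < l; ++i)
				if(i < begin || i >= end)
				{
					sub.x[k] = prob->x[i];
					sub.y[k] = prob->y[i];
					++k;
				}
			svm_model *model = svm_train(&sub, param);

			// collect decision values on the held-out fold
			for(int i = begin; i < end; ++i)
			{
				svm_predict_values(model, prob->x[i], &dec[i]); // one value for binary C-SVC
				ty[i] = (int)prob->y[i];
			}
			svm_free_and_destroy_model(&model);
			delete[] sub.x;
			delete[] sub.y;
		}

		// one evaluation over the pooled decision values of all folds
		double v = evaluation_function((size_t)l, dec, ty);
		delete[] dec;
		delete[] ty;
		return v;
	}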

How to Display Multiple Evaluation Values

To display various evaluation results in prediction, you can replace

	evaluation_function(total, dec_values, true_labels);

in binary_class_predict() of eval.cpp with the calls you need. For example, to see accuracy, precision, and recall, you can write
	accuracy(total, dec_values, true_labels);
	precision(total, dec_values, true_labels);
	recall(total, dec_values, true_labels);
The output will be like
	Accuracy = 86.6667% (234/270)
	Precision = 88.1818% (97/110)
	Recall = 80.8333% (97/120)
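
Each of these functions follows the prototype described in the section on adding new evaluation functions below. As an illustration of the output format above, an accuracy function could be written as follows; this is only a sketch, since eval.cpp provides its own implementation:

	// A sketch only; eval.cpp ships its own accuracy function.
	double accuracy(const size_t size, const double *dec_values, const int *ty)
	{
		size_t correct = 0;
		for(size_t i = 0; i < size; ++i)
		{
			int pred = (dec_values[i] >= 0) ? 1 : -1; // threshold at 0
			if(pred == ty[i]) ++correct;
		}
		double acc = correct / (double)size;
		printf("Accuracy = %g%% (%lu/%lu)\n", 100.0*acc,
		       (unsigned long)correct, (unsigned long)size);
		return acc;
	}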


Use grid.py to Find Parameters

The best parameters vary among different performance evaluations. Using grid.py (in tools/ of LIBSVM), you can choose the best parameters with respect to any specified evaluation function; grid.py searches for the best C and g by cross-validation.
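
After recompiling LIBSVM with this tool, grid.py is run as usual, for example (run from the tools/ directory; the exact path to the data set may differ on your system):

	python grid.py heart_scale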

Here is an example output of grid.py for the data set heart_scale when the evaluation function is AUC:

	512.0 0.00048828125 90.7111
The best cross-validation AUC is 90.7111% when (C, g) = (512.0, 0.00048828125).

Because grid.py maximizes the evaluation value, the evaluation function should satisfy the property that a better model gives a higher value.
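
For instance, if your measure is an error rate (smaller is better), a hypothetical wrapper can return its complement so that maximizing still selects the best model:

	// Illustrative only: report 1 - error_rate, so that a higher value
	// still means a better model when grid.py maximizes.
	double one_minus_error(const size_t size, const double *dec_values, const int *ty)
	{
		size_t wrong = 0;
		for(size_t i = 0; i < size; ++i)
			if((dec_values[i] >= 0 ? 1 : -1) != ty[i])
				++wrong;
		return 1.0 - wrong / (double)size;
	}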


How to Add New Evaluation Functions

New evaluation functions should be added in eval.cpp. The prototype of an evaluation function is

	double eval_func(const size_t, const double *dec_values, const int *ty);
where the first argument is the number of instances, dec_values is a vector of decision values, and ty is a vector of true labels (+1 or -1). This function returns the evaluation value.

Here is an example showing how recall is implemented; a second example, specificity, is sketched after these steps.

  1. Add a function prototype in eval.cpp.
    	double recall(const size_t size, const double *dec_values, const int *ty);	      
    
  2. Implement the recall function.
    	double recall(const size_t size, const double *dec_values, const int *ty)
    	{
    		size_t i;
    		int    tp, fn; // true positives and false negatives
    		double recall;
    		
    		tp = fn = 0;
    
    		for(i = 0; i < size; ++i) if(ty[i] == 1){ // true label is 1
    			if(dec_values[i] >= 0) ++tp; // predicted label is 1
    			else                   ++fn; // predicted label is -1
    		}
    
    		recall = tp / (double) (tp + fn);
    		
    		// print the result when invoked in prediction
    		printf("Recall = %g%%\n", 100.0 * recall);
    
    		return recall; // return the evaluation value
    	}
  3. Assign the global variable
    	double (*evaluation_function)(const size_t, const double *, const int *) = recall;	      
    
  4. Modify grid.py if a smaller value of your criterion means a better model, because grid.py selects the parameters with the highest evaluation value.
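
As a second illustration (not shipped with the tool), a specificity function, using the definition from the bac entry above, follows exactly the same pattern; add its prototype and assign evaluation_function as in steps 1 and 3:

	// Illustrative new criterion:
	// Specificity = true_negative / (true_negative + false_positive)
	double specificity(const size_t size, const double *dec_values, const int *ty)
	{
		int tn = 0, fp = 0;
		for(size_t i = 0; i < size; ++i)
			if(ty[i] == -1)                     // true label is -1
			{
				if(dec_values[i] < 0) ++tn; // predicted label is -1
				else                  ++fp; // predicted label is +1
			}
		double spec = tn / (double)(tn + fp);
		printf("Specificity = %g%%\n", 100.0*spec);
		return spec;
	}

As with precision and recall, note that specificity alone is easy to game: predicting all data as negative gives 100%.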


MATLAB Support

Please download the files do_binary_cross_validation.m, do_binary_predict.m, and validation_function.m, and put them in the matlab directory of LIBSVM.

Assign the variable

	valid_function = @(dec, labels) auc(dec, labels);
in validation_function.m to the evaluation function you prefer. You can assign auc, precision, recall, fscore, bac, or ap here.

You can use the following two functions.
  • do_binary_cross_validation() is for cross validation with different criteria.
  • do_binary_predict() is for prediction with different criteria.

    Usage:

    > do_binary_cross_validation(training_label_vector, training_instance_matrix, 'libsvm_options', n_fold);
    > [predicted_label, evaluation_result, decision_values] = do_binary_predict(testing_label_vector, testing_instance_matrix, model);
    

    Examples:

    [trainY trainX] = libsvmread('./data.scale');
    [testY testX] = libsvmread('./data.scale.t');
    do_binary_cross_validation(trainY, trainX, '-c 8 -g 4', 5);
    model = svmtrain(trainY, trainX);
    [pred eval_ret dec] = do_binary_predict(testY, testX, model);
    
    These files can be used for LIBLINEAR, though you need to replace svmtrain and svmpredict with train and predict, respectively.


Please contact Chih-Jen Lin for any questions.