This directory includes sources used in the following paper:

Chih-Yang Hsia, Wei-Lin Chiang, and Chih-Jen Lin
Preconditioned Conjugate Gradient Methods in Truncated Newton Frameworks
for Large-scale Linear Classification, ACML 2018

This code is used to reproduce the experimental results in our paper.
However, results may differ slightly due to randomness, CPU speed,
and the load on your machine(s).


Data sets
=========

Nine data sets from the paper and its supplement are considered:
- news20.binary
- url_combined
- kddb
- kdd12.svm
- criteo.trva
- w8a
- covtype
- rcv1_test
- kdda

Notes:
1. "yahoojp" and "yahookr" are not publicly available.
2. In our paper, the data set "criteo.trva" is considered.
   However, the publicly available version of the criteo data set [1] (criteo.kaggle2014.svm)
   is a normalized version of "criteo.trva", so the results may differ.

[1] https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#criteo


Explanation of each experiment script
======================================

Experiment 1:
The script 'exp1.sh' is used to reproduce the results in
Section 5.1 "Diagonal Preconditioner and the Proposed Method in Section 4".
Four methods are considered:
- CG
- Diag
- CG or Diag
- Mixed

Experiment 2:
The script 'exp2.sh' is used to reproduce the results in
Section 5.2 "Subsampled Hessian as Preconditioner".
Four methods are considered:
- CG
- SH-100
- SH-1000
- SH-3000

Experiment 3:
The script 'exp3.sh' is used to reproduce the results in
Section 5.3 "Running-time Comparison of Different Preconditioners".
Four methods are considered:
- CG
- Diagonal
- Mixed
- SH-3000


System Requirements
===================
These experiments are intended to be run on UNIX machines. The following
commands/libraries are required:
- bash
- wget
- gcc
- make
- bzip2
- Python 3.x
- matplotlib (a Python plotting library; version "1.5.x" was used)

Please note that there might be some plotting issues if matplotlib 2.2.x is used.


How to run the experiments
==========================
We take Experiment 1 as an example.
First, we have to download the data sets into the directory "datasets/".

$ cd datasets
$ ./download_data.sh

If you wish to download larger data sets, please type

$ ./download_large_data.sh

Second, we type

$ ./exp1.sh

to start running the experiments.

After "exp1.sh" finishes, the experimental logs will be put under "log_lr_best/".
The filename indicates which data set, regularization parameter, and method are used.
For example, "news20.binary.origin.1.1e-10.log" means that
    Data set: "news20.binary"
    Regularization parameter C: "1" x BestC
    Method: "origin"
are considered.
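As a sketch, such a log filename can be split from the right to recover these fields. (The trailing field, "1e-10" in this example, is not described above; we assume it is a solver stopping tolerance.) Splitting from the right keeps dots inside data set names like "news20.binary" intact:

```python
def parse_log_name(name):
    """Split a log filename such as 'news20.binary.origin.1.1e-10.log'
    into (data set, method, C multiplier, trailing field)."""
    stem = name[: -len(".log")]          # drop the ".log" suffix
    rest, c_mult, tail = stem.rsplit(".", 2)   # peel the last two fields
    dataset, method = rest.rsplit(".", 1)      # then the method name
    return dataset, method, c_mult, tail

print(parse_log_name("news20.binary.origin.1.1e-10.log"))
```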
The experimental figures will also be generated under "paper/exp1_lr_best/".

By default, "exp1.sh" only runs the data "news20.binary".
You may try more data sets by uncommenting the lines starting from
Line 40 of "run.py" and Line 35 of "plot.py".


More explanation of each directory/file
=======================================
'solvers/':
Implementation of different preconditioners considered in this paper.
Below is the mapping from the method names used in the paper to the
solver names used in the code:

CG:          origin
Diag:        diag_no_eet
CG or Diag:  diag_parallel
Mixed:       diag_mul_no_eet
SH-100:      woodbury_100
SH-1000:     woodbury_1000
SH-3000:     woodbury_3000
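For scripting around the logs, the table above can be written as a small dictionary (a sketch; the names on the right are the solver identifiers that also appear in the log filenames):

```python
# Mapping from method names in the paper to solver names in the code.
PAPER_TO_SOLVER = {
    "CG": "origin",
    "Diag": "diag_no_eet",
    "CG or Diag": "diag_parallel",
    "Mixed": "diag_mul_no_eet",
    "SH-100": "woodbury_100",
    "SH-1000": "woodbury_1000",
    "SH-3000": "woodbury_3000",
}

print(PAPER_TO_SOLVER["Mixed"])
```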

'run.py':
Script for running experiments of this paper.

'plot.py':
Script for generating experimental figures of this paper.

'best_c1.json' and 'best_c2.json':
Best regularization parameters C for each data set.
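The exact JSON layout of these files is not documented here; assuming each file simply maps a data set name to its best C value, it could be read as follows (a sketch with a made-up entry, not the real contents of "best_c1.json"):

```python
import json

# Hypothetical content; the real 'best_c1.json' layout and values may differ.
example = '{"news20.binary": 64.0}'
best_c = json.loads(example)
print(best_c["news20.binary"])
```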

'ref_log_lr_best/':
Reference logs used to plot figures.


How we get the best C (regularization parameter) values
=======================================================
As mentioned in the paper, for each data set we use the regularization
parameter C that gives the best cross-validation accuracy.
We used the option '-C' provided by LIBLINEAR to find the best C value.
