References

AU99a
U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine.
Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.
Proceedings of the National Academy of Sciences, 96:6745–6750, 1999.
AA12a
Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, and John Langford.
A reliable effective terascale linear learning system.
Journal of Machine Learning Research, 15:1111–1133, 2014.
MB04a
Matthew R. Boutell, Jiebo Luo, Xipeng Shen, and Christopher M. Brown.
Learning multi-label scene classification.
Pattern Recognition, 37(9):1757–1771, 2004.
RB08a
Ron Bekkerman and Martin Scholz.
Data weaving: Scaling up the state-of-the-art in data clustering.
In Proceedings of CIKM, pages 1083–1092, 2008.
PB14a
Pierre Baldi, Peter Sadowski, and Daniel Whiteson.
Searching for exotic particles in high-energy physics with deep learning.
Nature Communications, 5, 2014.
RC02a
R. Collobert, S. Bengio, and Y. Bengio.
A parallel mixture of SVMs for very large scale problems.
Neural Computation, 14(5):1105–1114, 2002.
BJC02a
Bo-Juen Chen, Ming-Wei Chang, and Chih-Jen Lin.
Load forecasting using support vector machines: A study on EUNITE competition 2001.
IEEE Transactions on Power Systems, 19(4):1821–1830, November 2004.
IC19a
Ilias Chalkidis, Emmanouil Fergadiotis, Prodromos Malakasiotis, and Ion Androutsopoulos.
Large-scale multi-label text classification on EU legislation.
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 6314–6322, 2019.
IC22b
Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Katz, and Nikolaos Aletras.
LexGLUE: A benchmark dataset for legal language understanding in English.
In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, pages 4310–4330, 2022.
Chang01d
Chih-Chung Chang and Chih-Jen Lin.
IJCNN 2001 challenge: Generalization ability and text decoding.
In Proceedings of IJCNN. IEEE, 2001.
MD04a
M. Duarte and Y. H. Hu.
Vehicle classification in distributed sensor networks.
Journal of Parallel and Distributed Computing, 64(7):826–838, July 2004.
KD02a
K. Duan, S. S. Keerthi, and A. N. Poo.
Evaluation of simple performance measures for tuning SVM hyperparameters.
Neurocomputing, 51:41–59, 2003.
AE02a
André Elisseeff and Jason Weston.
A kernel method for multi-labelled classification.
In Thomas G. Dietterich, Suzanna Becker, and Zoubin Ghahramani, editors, Advances in Neural Information Processing Systems 14, 2002.
GWF01a
Gary William Flake and Steve Lawrence.
Efficient SVM regression training with SMO.
Machine Learning, 46:271–290, 2002.
IG05a
Isabelle Guyon, Steve Gunn, Asa Ben-Hur, and Gideon Dror.
Result analysis of the NIPS 2003 feature selection challenge.
In Advances in Neural Information Processing Systems, volume 17, 2005.
TG99a
T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander.
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Science, 286(5439):531–537, 1999.
JLG03a
Jennifer L. Gardy, Cory Spencer, Ke Wang, Martin Ester, Gabor E. Tusnady, Istvan Simon, Sujun Hua, Katalin deFays, Christophe Lambert, Kenta Nakai, and Fiona S.L. Brinkman.
PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria.
Nucleic Acids Research, 31(13):3613–3617, 2003.
CWH03a
Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin.
A practical guide to support vector classification.
Technical report, Department of Computer Science, National Taiwan University, 2003.
TKH96a
Tin Kam Ho and Eugene M. Kleinberg.
Building projectable classifiers of arbitrary complexity.
In Proceedings of the 13th International Conference on Pattern Recognition, pages 880–885, Vienna, Austria, August 1996.
CWH01a
Chih-Wei Hsu and Chih-Jen Lin.
A comparison of methods for multi-class support vector machines.
IEEE Transactions on Neural Networks, 13(2):415–425, 2002.
JJH94a
J. J. Hull.
A database for handwritten text recognition research.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):550–554, May 1994.
WLH17a
Will Hamilton, Zhitao Ying, and Jure Leskovec.
Inductive representation learning on large graphs.
In Advances in Neural Information Processing Systems, 2017.
YJ16a
Yuchin Juan, Yong Zhuang, Wei-Sheng Chin, and Chih-Jen Lin.
Field-aware factorization machines for CTR prediction.
In Proceedings of the ACM Recommender Systems Conference (RecSys), 2016.
SSK05a
S. Sathiya Keerthi and Dennis DeCoste.
A modified finite Newton method for fast solution of large scale linear SVMs.
Journal of Machine Learning Research, 6:341–361, 2005.
SK09a
Shimon Kogan, Dimitry Levin, Bryan R. Routledge, Jacob S. Sagi, and Noah A. Smith.
Predicting risk from financial reports with regression.
In Proceedings of the North American Association for Computational Linguistics Human Language Technologies Conference, pages 272–280, 2009.
AK09a
Alex Krizhevsky.
Learning multiple layers of features from tiny images.
Technical report, University of Toronto, 2009.
SSK08a
S. Sathiya Keerthi, Sellamanickam Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, and Chih-Jen Lin.
A sequential dual method for large scale multi-class linear SVMs.
In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 408–416, 2008.
SL96a
Simon Lucas and A. Amiri.
Statistical syntactic methods for high-performance OCR.
IEE Proceedings-Vision, Image and Signal Processing, 143(1):23–30, 1996.
KL95a
Ken Lang.
NewsWeeder: Learning to filter netnews.
In Proceedings of the Twelfth International Conference on Machine Learning, pages 331–339, 1995.
YL98a
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner.
Gradient-based learning applied to document recognition.
Proceedings of the IEEE, 86(11):2278–2324, November 1998.
MNIST database available at http://yann.lecun.com/exdb/mnist/.
GL07b
Gaëlle Loosli, Stéphane Canu, and Léon Bottou.
Training invariant support vector machines using selective sampling.
In Léon Bottou, Olivier Chapelle, Dennis DeCoste, and Jason Weston, editors, Large Scale Kernel Machines, pages 301–320. MIT Press, Cambridge, MA, 2007.
YL04b
Yann LeCun, Fu Jie Huang, and Léon Bottou.
Learning methods for generic object recognition with invariance to pose and lighting.
In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 97–104, 2004.
KML02a
Kuan-Min Lin and Chih-Jen Lin.
A study on reduced support vector machines.
IEEE Transactions on Neural Networks, 14(6):1449–1459, 2003.
LCL22a
Li-Chung Lin, Cheng-Hung Liu, Chih-Ming Chen, Kai-Chin Hsu, I-Feng Wu, Ming-Feng Tsai, and Chih-Jen Lin.
On the use of unrealistic predictions in hundreds of papers evaluating graph representations.
In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI), 2022.
LM10a
Eneldo Loza Mencía and Johannes Fürnkranz.
Efficient multilabel classification algorithms for large-scale problems in the legal domain.
In Enrico Francesconi, Simonetta Montemagni, Wim Peters, and Daniela Tiscornia, editors, Semantic Processing of Legal Texts: Where the Language of Law Meets the Law of Language, pages 192–215. Springer Berlin Heidelberg, 2010.
DL04b
David D. Lewis, Yiming Yang, Tony G. Rose, and Fan Li.
RCV1: A new benchmark collection for text categorization research.
Journal of Machine Learning Research, 5:361–397, 2004.
JM16a
James McDermott and Richard S. Forsyth.
Diagnosing a disorder in a classification benchmark.
Pattern Recognition Letters, 73:41–43, 2016.
JM13a
Julian McAuley and Jure Leskovec.
Hidden factors and hidden topics: understanding rating dimensions with review text.
In Proceedings of the 7th ACM conference on Recommender Systems, pages 165–172, 2013.
AM98a
Andrew McCallum and Kamal Nigam.
A comparison of event models for naive Bayes text classification.
In Proceedings of the AAAI'98 Workshop on Learning for Text Categorization, 1998.
JM09a
Justin Ma, Lawrence K. Saul, Stefan Savage, and Geoffrey M. Voelker.
Identifying suspicious URLs: An application of large-scale online learning.
In Proceedings of the Twenty Sixth International Conference on Machine Learning (ICML), pages 681–688, 2009.
YN11a
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng.
Reading digits in natural images with unsupervised feature learning.
In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
JP98a
John C. Platt.
Fast training of support vector machines using sequential minimal optimization.
In Bernhard Schölkopf, Christopher J. C. Burges, and Alexander J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, Cambridge, MA, 1998. MIT Press.
DP01a
Danil Prokhorov.
IJCNN 2001 neural network competition.
Slide presentation in IJCNN'01, Ford Research Laboratory, 2001.
JR01a
Jason D. M. Rennie.
Improving multi-class text classification with naive Bayes.
Master's thesis, Massachusetts Institute of Technology, 2001.
AR14a
Anderson Rocha and Siome Goldenstein.
Multiclass from binary: Expanding one-vs-all, one-vs-one and ECOC-based approaches.
IEEE Transactions on Neural Networks and Learning Systems, 25(2):289–302, 2014.
JR01b
Jason D. M. Rennie and Ryan Rifkin.
Improving multiclass text classification with the Support Vector Machine.
Technical Report AIM-2001-026, Massachusetts Institute of Technology, 2001.
SS10a
Soeren Sonnenburg and Vojtech Franc.
COFFIN: A computational framework for linear SVMs.
In Proceedings of the Twenty Seventh International Conference on Machine Learning (ICML), pages 999–1006, 2010.
SKS03a
Shirish Krishnaj Shevade and S. Sathiya Keerthi.
A simple and efficient algorithm for gene selection using sparse logistic regression.
Bioinformatics, 19(17):2246–2253, 2003.
GT08a
Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas.
Effective and efficient multilabel classification in domains with large number of labels.
In Proceedings of ECML/PKDD 2008 Workshop on Mining Multidimensional Data, 2008.
LT09a
Lei Tang and Huan Liu.
Relational learning via latent social dimensions.
In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), pages 817–826, 2009.
AVU06a
Andrew V Uzilov, Joshua M Keegan, and David H Mathews.
Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change.
BMC Bioinformatics, 7(173), 2006.
JYW02a
Jung-Ying Wang.
Application of support vector machines in bioinformatics.
Master's thesis, Department of Computer Science and Information Engineering, National Taiwan University, 2002.
MW01a
M. West, C. Blanchette, H. Dressman, E. Huang, S. Ishida, R. Spang, H. Zuzan, J. A. Olson, Jr., J. R. Marks, and J. R. Nevins.
Predicting the clinical status of human breast cancer by using gene expression profiles.
Proceedings of the National Academy of Sciences, 98:11462–11467, 2001.
ST06a
Steve Webb, James Caverlee, and Calton Pu.
Introducing the Webb Spam Corpus: Using email spam to identify web spam automatically.
In Proceedings of the Third Conference on Email and Anti-Spam (CEAS), 2006.
CCW16a
Chien-Chih Wang, Kent-Loong Tan, Chun-Ting Chen, Yu-Hsiang Lin, S. Sathiya Keerthi, Dhruv Mahajan, Sellamanickam Sundararajan, and Chih-Jen Lin.
Distributed Newton methods for deep learning.
Neural Computation, 30(6):1673–1724, 2018.
CCW18a
Chien-Chih Wang, Kent Loong Tan, and Chih-Jen Lin.
Newton methods for convolutional neural networks.
ACM Transactions on Intelligent Systems and Technology, 11(2):19:1–19:30, 2020.
HFY11a
Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang, and Chih-Jen Lin.
Large linear classification when data cannot fit in memory.
ACM Transactions on Knowledge Discovery from Data, 5(4):23:1–23:23, February 2012.
GXY11a
Guo-Xun Yuan, Chia-Hua Ho, and Chih-Jen Lin.
An improved GLMNET for l1-regularized logistic regression.
Journal of Machine Learning Research, 13:1999–2030, 2012.
HFY10c
Hsiang-Fu Yu, Hung-Yi Lo, Hsun-Ping Hsieh, Jing-Kai Lou, Todd G. McKenzie, Jung-Wei Chou, Po-Han Chung, Chia-Hua Ho, Chun-Fu Chang, Yin-Hsuan Wei, Jui-Yu Weng, En-Syu Yan, Che-Wei Chang, Tsung-Ting Kuo, Yi-Chen Lo, Po Tzu Chang, Chieh Po, Chien-Yuan Wang, Yi-Hung Huang, Chen-Wei Hung, Yu-Xun Ruan, Yu-Shi Lin, Shou-De Lin, Hsuan-Tien Lin, and Chih-Jen Lin.
Feature engineering and classifier ensemble for KDD Cup 2010.
In JMLR Workshop and Conference Proceedings, 2011.
To appear.
AZ09a
Arkaitz Zubiaga.
Enhancing navigation on Wikipedia with social tags.
In Proceedings of Wikimania, 2009.