U. Alon, N. Barkai, D. A. Notterman, K. Gish, S. Ybarra, D. Mack, and A. J. Levine.
Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays.
Proceedings of the National Academy of Sciences, 96:6745–6750, 1999.
Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, and John Langford.
A reliable effective terascale linear learning system.
Journal of Machine Learning Research, 15:1111–1133, 2014.
Matthew R. Boutell, Jiebo Luo, Xipeng Shen, and Christopher M. Brown.
Learning multi-label scene classification.
Pattern Recognition, 37(9):1757–1771, 2004.
Ron Bekkerman and Martin Scholz.
Data weaving: Scaling up the state-of-the-art in data clustering.
In Proceedings of CIKM, pages 1083–1092, 2008.
Pierre Baldi, Peter Sadowski, and Daniel Whiteson.
Searching for exotic particles in high-energy physics with deep learning.
Nature Communications, 5, 2014.
R. Collobert, S. Bengio, and Y. Bengio.
A parallel mixture of SVMs for very large scale problems.
Neural Computation, 14(5):1105–1114, 2002.
Bo-Juen Chen, Ming-Wei Chang, and Chih-Jen Lin.
Load forecasting using support vector machines: A study on EUNITE competition 2001.
IEEE Transactions on Power Systems, 19(4):1821–1830, November 2004.
Ilias Chalkidis, Emmanouil Fergadiotis, Prodromos Malakasiotis, and Ion Androutsopoulos.
Large-scale multi-label text classification on EU legislation.
In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL), pages 6314–6322, 2019.
Chih-Chung Chang and Chih-Jen Lin.
IJCNN 2001 challenge: Generalization ability and text decoding.
In Proceedings of IJCNN. IEEE, 2001.
M. Duarte and Y. H. Hu.
Vehicle classification in distributed sensor networks.
Journal of Parallel and Distributed Computing, 64(7):826–838, July 2004.
K. Duan, S. S. Keerthi, and A. N. Poo.
Evaluation of simple performance measures for tuning SVM hyperparameters.
Neurocomputing, 51:41–59, 2003.
André Elisseeff and Jason Weston.
A kernel method for multi-labelled classification.
In Thomas G. Dietterich, Susan Becker, and Zoubin Ghahramani, editors, Advances in Neural Information Processing Systems 14, 2002.
Gary William Flake and Steve Lawrence.
Efficient SVM regression training with SMO.
Machine Learning, 46:271–290, 2002.
Isabelle Guyon, Steve Gunn, Asa Ben Hur, and Gideon Dror.
Result analysis of the NIPS 2003 feature selection challenge.
In Advances in Neural Information Processing Systems, volume 17. 2005.
T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander.
Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.
Science, 286(5439):531, 1999.
Jennifer L. Gardy, Cory Spencer, Ke Wang, Martin Ester, Gabor E. Tusnady, Istvan Simon, Sujun Hua, Katalin deFays, Christophe Lambert, Kenta Nakai, and Fiona S. L. Brinkman.
PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria.
Nucleic Acids Research, 31(13):3613–3617, 2003.
Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin.
A practical guide to support vector classification.
Technical report, Department of Computer Science, National Taiwan University, 2003.
Tin Kam Ho and Eugene M. Kleinberg.
Building projectable classifiers of arbitrary complexity.
In Proceedings of the 13th International Conference on Pattern Recognition, pages 880–885, Vienna, Austria, August 1996.
Chih-Wei Hsu and Chih-Jen Lin.
A comparison of methods for multi-class support vector machines.
IEEE Transactions on Neural Networks, 13(2):415–425, 2002.
J. J. Hull.
A database for handwritten text recognition research.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(5):550–554, May 1994.
Will Hamilton, Zhitao Ying, and Jure Leskovec.
Inductive representation learning on large graphs.
In Advances in Neural Information Processing Systems, 2017.
Yuchin Juan, Yong Zhuang, Wei-Sheng Chin, and Chih-Jen Lin.
Field-aware factorization machines for CTR prediction.
In Proceedings of the ACM Recommender Systems Conference (RecSys), 2016.
S. Sathiya Keerthi and Dennis DeCoste.
A modified finite Newton method for fast solution of large scale linear SVMs.
Journal of Machine Learning Research, 6:341–361, 2005.
Shimon Kogan, Dimitry Levin, Bryan R. Routledge, Jacob S. Sagi, and Noah A. Smith.
Predicting risk from financial reports with regression.
In Proceedings of the North American Association for Computational Linguistics Human Language Technologies Conference, pages 272–280, 2009.
Alex Krizhevsky.
Learning multiple layers of features from tiny images.
Technical report, University of Toronto, 2009.
S. Sathiya Keerthi, Sellamanickam Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, and Chih-Jen Lin.
A sequential dual method for large scale multi-class linear SVMs.
In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 408–416, 2008.
Simon Lucas and A. Amiri.
Statistical syntactic methods for high-performance OCR.
IEE Proceedings-Vision, Image and Signal Processing, 143(1):23–30, 1996.
Ken Lang.
NewsWeeder: Learning to filter netnews.
In Proceedings of the Twelfth International Conference on Machine Learning, pages 331–339, 1995.
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner.
Gradient-based learning applied to document recognition.
Proceedings of the IEEE, 86(11):2278–2324, November 1998.
MNIST database available at
Gaëlle Loosli, Stéphane Canu, and Léon Bottou.
Training invariant support vector machines using selective sampling.
In Léon Bottou, Olivier Chapelle, Dennis DeCoste, and Jason Weston, editors, Large Scale Kernel Machines, pages 301–320. MIT Press, Cambridge, MA, 2007.
Yann LeCun, Fu Jie Huang, and Léon Bottou.
Learning methods for generic object recognition with invariance to pose and lighting.
In Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 97–104, 2004.
Kuan-Min Lin and Chih-Jen Lin.
A study on reduced support vector machines.
IEEE Transactions on Neural Networks, 14(6):1449–1459, 2003.
Li-Chung Lin, Cheng-Hung Liu, Chih-Ming Chen, Kai-Chin Hsu, I-Feng Wu, Ming-Feng Tsai, and Chih-Jen Lin.
On the use of unrealistic predictions in hundreds of papers evaluating graph representations.
In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI), 2022.
Eneldo Loza Mencía and Johannes Fürnkranz.
Efficient Multilabel Classification Algorithms for Large-Scale Problems in the Legal Domain, pages 192–215.
Springer Berlin Heidelberg, 2010.
David D. Lewis, Yiming Yang, Tony G. Rose, and Fan Li.
RCV1: A new benchmark collection for text categorization research.
Journal of Machine Learning Research, 5:361–397, 2004.
James McDermott and Richard S. Forsyth.
Diagnosing a disorder in a classification benchmark.
Pattern Recognition Letters, 73:41–43, 2016.
Andrew McCallum and Kamal Nigam.
A comparison of event models for naive Bayes text classification.
In Proceedings of the AAAI'98 Workshop on Learning for Text Categorization, 1998.
Justin Ma, Lawrence K. Saul, Stefan Savage, and Geoffrey M. Voelker.
Identifying suspicious URLs: An application of large-scale online learning.
In Proceedings of the Twenty Sixth International Conference on Machine Learning (ICML), pages 681–688, 2009.
Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, and Andrew Y. Ng.
Reading digits in natural images with unsupervised feature learning.
In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
John C. Platt.
Fast training of support vector machines using sequential minimal optimization.
In Bernhard Schölkopf, Christopher J. C. Burges, and Alexander J. Smola, editors, Advances in Kernel Methods - Support Vector Learning, Cambridge, MA, 1998. MIT Press.
Danil Prokhorov.
IJCNN 2001 neural network competition.
Slide presentation in IJCNN'01, Ford Research Laboratory, 2001.
Jason D. M. Rennie.
Improving multi-class text classification with naive Bayes.
Master's thesis, Massachusetts Institute of Technology, 2001.
Anderson Rocha and Siome Goldenstein.
Multiclass from binary: Expanding one-vs-all, one-vs-one and ECOC-based approaches.
IEEE Transactions on Neural Networks and Learning Systems, 25(2):289–302, 2014.
Jason D. M. Rennie and Ryan Rifkin.
Improving multiclass text classification with the Support Vector Machine.
Technical Report AIM-2001-026, Massachusetts Institute of Technology, 2001.
Soeren Sonnenburg and Vojtech Franc.
COFFIN: A computational framework for linear SVMs.
In Proceedings of the Twenty Seventh International Conference on Machine Learning (ICML), pages 999–1006, 2010.
Shirish Krishnaj Shevade and S. Sathiya Keerthi.
A simple and efficient algorithm for gene selection using sparse logistic regression.
Bioinformatics, 19(17):2246–2253, 2003.
Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas.
Effective and efficient multilabel classification in domains with large number of labels.
In Proceedings of ECML/PKDD 2008 Workshop on Mining Multidimensional Data, 2008.
Lei Tang and Huan Liu.
Relational learning via latent social dimensions.
In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 817–826, 2009.
Andrew V. Uzilov, Joshua M. Keegan, and David H. Mathews.
Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change.
BMC Bioinformatics, 7(173), 2006.
Jung-Ying Wang.
Application of support vector machines in bioinformatics.
Master's thesis, Department of Computer Science and Information Engineering, National Taiwan University, 2002.
M. West, C. Blanchette, H. Dressman, E. Huang, S. Ishida, R. Spang, H. Zuzan, J. A. Olson, Jr., J. R. Marks, and J. R. Nevins.
Predicting the clinical status of human breast cancer by using gene expression profiles.
Proceedings of the National Academy of Sciences, 98:11462–11467, 2001.
Steve Webb, James Caverlee, and Calton Pu.
Introducing the Webb Spam Corpus: Using email spam to identify web spam automatically.
In Proceedings of the Third Conference on Email and Anti-Spam (CEAS), 2006.
Chien-Chih Wang, Kent-Loong Tan, Chun-Ting Chen, Yu-Hsiang Lin, S. Sathiya Keerthi, Dhruv Mahajan, Sellamanickam Sundararajan, and Chih-Jen Lin.
Distributed Newton methods for deep learning.
Neural Computation, 30(6):1673–1724, 2018.
Chien-Chih Wang, Kent Loong Tan, and Chih-Jen Lin.
Newton methods for convolutional neural networks.
ACM Transactions on Intelligent Systems and Technology, 11(2):19:1–19:30, 2020.
Hsiang-Fu Yu, Cho-Jui Hsieh, Kai-Wei Chang, and Chih-Jen Lin.
Large linear classification when data cannot fit in memory.
ACM Transactions on Knowledge Discovery from Data, 5(4):23:1–23:23, February 2012.
Guo-Xun Yuan, Chia-Hua Ho, and Chih-Jen Lin.
An improved GLMNET for l1-regularized logistic regression.
Journal of Machine Learning Research, 13:1999–2030, 2012.
Hsiang-Fu Yu, Hung-Yi Lo, Hsun-Ping Hsieh, Jing-Kai Lou, Todd G. McKenzie, Jung-Wei Chou, Po-Han Chung, Chia-Hua Ho, Chun-Fu Chang, Yin-Hsuan Wei, Jui-Yu Weng, En-Syu Yan, Che-Wei Chang, Tsung-Ting Kuo, Yi-Chen Lo, Po Tzu Chang, Chieh Po, Chien-Yuan Wang, Yi-Hung Huang, Chen-Wei Hung, Yu-Xun Ruan, Yu-Shi Lin, Shou-De Lin, Hsuan-Tien Lin, and Chih-Jen Lin.
Feature engineering and classifier ensemble for KDD Cup 2010.
In JMLR Workshop and Conference Proceedings, 2011.
To appear.
Arkaitz Zubiaga.
Enhancing navigation on Wikipedia with social tags.
In Proceedings of Wikimania, 2009.