Homework 1
Go to the UCI machine learning repository and download the data file soybean-large.data.
Prepare two files in ARFF format: one with all attributes declared as
numeric and another with all attributes declared as nominal.
Train on soybean-large.data and test on the file
soybean-large.test.
Use the C4.5 implementation in Weka to train on and classify these two files.
The C4.5 implementation in Weka is called weka.classifiers.j48.J48.
Then write a short report (<= 2 pages) in English
describing what you find.
How to run Weka on our Linux system?
If you use IBM Java:
- Download Weka
- Uncompress the archive
/opt/IBMJava2-13/bin/jar xvf weka-3-2-1.jar
- Test with the bundled sample data
/opt/IBMJava2-13/bin/java -cp weka.jar weka.classifiers.j48.J48 -t ./data/iris.arff
If you use Sun Java (i.e. /usr/bin/java and jar):
- Download Weka
- Uncompress the archive
jar xvf weka-3-2-1.jar
- Test with the bundled sample data
java -classpath /usr/lib/jdk1.1/lib/classes.zip:weka.jar weka.classifiers.j48.J48 -t ./data/iris.arff
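The sample commands above pass only a training file with -t. For the homework, where a separate test file is given, Weka classifiers also accept -T to name the test file. The wrapper below is a minimal sketch of that invocation; the function name run_j48 and the JAVA and WEKA_CP variables are hypothetical conveniences, not part of Weka, and the ARFF file names in the comments are placeholders.

```shell
#!/bin/sh
# Sketch: train J48 on one ARFF file and evaluate it on another.
# usage: run_j48 TRAIN.arff TEST.arff
# -t names the training file (as in the sample commands), -T the test file.
# JAVA and WEKA_CP are hypothetical overrides so the same function works
# with either Java installation described above.
run_j48() {
    "${JAVA:-java}" -classpath "${WEKA_CP:-weka.jar}" \
        weka.classifiers.j48.J48 -t "$1" -T "$2"
}

# Example calls (ARFF file names are placeholders):
#   JAVA=/opt/IBMJava2-13/bin/java run_j48 soybean-numeric.arff soybean-numeric-test.arff
#   WEKA_CP=/usr/lib/jdk1.1/lib/classes.zip:weka.jar run_j48 soybean-nominal.arff soybean-nominal-test.arff
```

Running it once per ARFF variant gives the two sets of results to compare in the report.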
Last modified: Sat Oct 6 22:49:13 CST 2001