KDD Cup  

KDD Cup 1999: General Information

For the results of the other KDD'99 contest, the knowledge discovery contest, see here.

The task for the classifier learning contest organized in conjunction with the KDD'99 conference was to learn a predictive model (i.e., a classifier) capable of distinguishing between legitimate and illegitimate connections in a computer network. A detailed description of the task is available separately. The training and test data were generously made available by Prof. Sal Stolfo of Columbia University and Prof. Wenke Lee of North Carolina State University.

In total, 24 entries were submitted for the contest. There was a data quality issue with the labels of the test data, which fortunately was discovered by Ramesh Agarwal (IBM Fellow) and Mahesh Joshi (University of Minnesota Ph.D. candidate) before results were announced publicly. Ramesh Agarwal and Mahesh Joshi analyzed the data quality issue in great detail, so we are confident that the corrected test labels are now accurate. Other participants also detected and analyzed the data quality issue, including Itzhak Levin of LLSoft, Inc.

It is important to note that the data quality issue affected only the labels of the test examples. The training data was unaffected, as was the unlabeled test data. Therefore, it was not necessary to ask participants to submit recomputed entries.

Each entry was scored against the corrected test data by a scoring awk script using the published cost matrix (see below) and the true labels of the test examples.
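
The scoring step described above can be illustrated with a minimal Python sketch. It is not the awk script used in the contest. It assumes that an entry's score is the average misclassification cost per test example, looked up in a cost matrix indexed by true and predicted label. The function name `average_cost`, the two-class labels, and the values in `TOY_COST` are hypothetical and chosen only for illustration; they are not the published cost matrix.

```python
def average_cost(true_labels, predicted_labels, cost):
    """Return the mean misclassification cost over all examples.

    cost[t][p] is the cost of predicting label p when the true label is t.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one example is required")
    total = sum(cost[t][p] for t, p in zip(true_labels, predicted_labels))
    return total / len(true_labels)


# Hypothetical two-class cost matrix (NOT the published contest matrix):
# missing an attack costs more than raising a false alarm.
TOY_COST = {
    "normal": {"normal": 0, "attack": 1},
    "attack": {"normal": 2, "attack": 0},
}

truth = ["normal", "attack", "attack", "normal"]
preds = ["normal", "normal", "attack", "attack"]

score = average_cost(truth, preds, TOY_COST)
print(score)  # 0.75: one missed attack (cost 2) plus one false alarm (cost 1), over 4 examples
```

Under this assumption, a lower score is better, and an entry that predicts every label correctly scores 0.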