KDD Cup  


KDD Cup 2000: General Information

Co-Chairs

Carla Brodley, School of Electrical and Computer Engineering, Purdue University
Ronny Kohavi, Blue Martini Software
Special thanks to Brian Frasca, Llew Mason, and Zijian Zheng from Blue Martini Software and Ben Bernstein from Gazelle.com
Thanks to Acxiom for providing data enhancements.

Email: kddcup2000@bluemartini.com

Summary talk presented at KDD (8/20/2000)
KDD-Cup 2000 organizers' report: Peeling the onion. SIGKDD Explorations, 2(2):86-98, 2000

General Information (updated Apr 2002)

The KDD Cup 2000 domain contains clickstream and purchase data from Gazelle.com, a legwear and legcare web retailer that closed its online store on 8/18/2000.

You are required to sign a non-disclosure agreement in order to receive a password to access the data, although the original restrictions were dramatically relaxed in April 2002 to allow wider use of the data. Any use of the data is allowed as long as proper acknowledgment is given and a copy of the work is provided to Blue Martini Software.

In order to access the data, you must fill out the form on this page. Your username and password will be emailed to you.

When you have received a username and password (see above), you can go to the confidential section of the site, which contains a description of the tasks, the data, background information, and more.

The reference for KDD Cup 2000 is as follows (a PDF is available at the URL given in the reference):

Ron Kohavi, Carla Brodley, Brian Frasca, Llew Mason, and Zijian Zheng. KDD-Cup 2000 organizers' report: Peeling the onion. SIGKDD Explorations, 2(2):86-98, 2000. http://robotics.stanford.edu/users/ronnyk/kddOrganizerReport.pdf

The bibtex entry is:

@Article{kddcup2000,
author = {Ron Kohavi and Carla Brodley and Brian Frasca and Llew Mason and Zijian Zheng},
title = {{KDD-Cup} 2000 Organizers' Report: Peeling the Onion},
journal = {SIGKDD Explorations},
volume = {2},
number = {2},
pages = {86--98},
url = {http://robotics.stanford.edu/users/ronnyk/kddOrganizerReport.pdf},
year = 2000}

The Blue Martini architecture is described in the following paper:

Suhail Ansari, Ron Kohavi, Llew Mason, and Zijian Zheng, Integrating E-Commerce and Data Mining: Architecture and Challenges, ICDM 2001.

Please remember the restrictions on the data.

Real Datasets for Association Rule Discovery (updated Oct 2002)

Three real-world datasets are available. You are required to sign a simple non-disclosure agreement in order to receive a password to access the data. Any use of the data is allowed as long as proper acknowledgment to Blue Martini Software is given and a copy of the work is sent (e-mail is fine). When citing these datasets, please reference the following article instead of the KDD Cup paper:

Zijian Zheng, Ron Kohavi, and Llew Mason, Real World Performance of Association Rule Algorithms, KDD 2001.

The bibtex entry is:

@inproceedings{ zheng-kohavi-mason-real-assoc,
author = "Zijian Zheng and Ron Kohavi and Llew Mason",
title = "Real World Performance of Association Rule Algorithms",
booktitle = "Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining",
editor={Foster Provost and Ramakrishnan Srikant},
pages={401--406},
year = 2001,
url = {http://robotics.Stanford.EDU/users/ronnyk/realWorldAssoc.pdf}}

  • BMS-WebView-1
  • BMS-POS

  • Note: a long version of the original paper is available, as well as the slides.
    Please remember the restrictions on the data.

    There were five questions in KDD Cup 2000. The results for Question 2 have been revised: the original calculation of the results at Purdue contained a subtle bug. The bug was uncovered thanks to Yoshinori Yaginuma, who calculated his own score using the posted test data. We have corrected the bug and posted the new results for Question 2 (11/20/00).

    Question 1 Winner: Amdocs (Paper, Poster)

    Given a set of page views, will the visitor view another page on the site or will the visitor leave?

    Honorable Mentions: Mui Seng Martin Lee, Chong Jin Ong and S. Sathiya Keerthi of Mechanical and Production Engineering Department, National University of Singapore

    Question 2 Winner: Salford Systems, Inc

    Given a set of page views, which product brand will the visitor view in the remainder of the session?

    Honorable Mentions: MP13 team of Alexei Vopilov, Ivan Shabalin and Vladimir Mikheyev, and the team of Mukund Deshpande, George Karypis, Department of Computer Science and Engineering, University of Minnesota

    Question 3 Winner: Salford Systems, Inc

    Given a set of purchases over a period of time, characterize visitors who spend more than $12 (order amount) on an average order at the site.

    Honorable Mentions: Orit Rafaely, Tel-Aviv University and Amdocs

    Question 4 Winner: e-steam (Poster)

    Given a set of page views, characterize killer pages, i.e., pages after which users leave the site.

    Honorable Mentions: SAS, Amdocs, and LLSoft, Ltd

    Question 5 Winner: Amdocs (Paper, Poster)

    Given a set of page views, characterize which product brand a visitor will view in the remainder of the session.
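
    To make the "killer pages" notion from Question 4 concrete, here is a minimal sketch that computes, for each page, the fraction of its views that were the last view of a session. The session format (a list of page names per visit) and the page names are assumptions made for illustration only; they are not the KDD Cup 2000 data schema, and this is not the organizers' evaluation method.

```python
from collections import Counter


def exit_rates(sessions):
    """Return {page: fraction of views of that page that ended a session}.

    `sessions` is a hypothetical format: one list of page names per visit,
    in the order the pages were viewed.
    """
    views = Counter()  # how often each page was viewed
    exits = Counter()  # how often each page was the last page of a session
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        views.update(pages)
        exits[pages[-1]] += 1
    return {page: exits[page] / views[page] for page in views}


# Made-up toy sessions, for illustration only.
sessions = [
    ["home", "product", "checkout"],
    ["home", "product"],
    ["home", "shipping_info"],
]
rates = exit_rates(sessions)

# List pages from highest to lowest exit rate.
for page, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{page}: {rate:.2f}")
```

    A high exit rate alone does not make a page a killer page: a checkout confirmation page naturally ends many sessions, so any real characterization would need to account for such cases.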

    Schedule (passed)

  • Data available: 5/20 - now available
  • Question period 1: 5/20-5/30 - passed
  • Test set available: 7/7 - now available
  • Question period 2: 7/10-7/17 - passed
  • Entries due: 7/17 - passed
  • Winners notified: 7/28 - passed
  • Talks due: 8/07 - passed
  • Talks approved by: 8/14 - passed
  • Winners announced: KDD Conference - passed
