KDD-2003
 
    The Ninth ACM SIGKDD International Conference on
    Knowledge Discovery and Data Mining
Washington, DC, USA     August 24 - 27, 2003


KDD 2003 Program

Statistics

347 paper abstracts were submitted.
298 papers were submitted: 258 research track submissions were reviewed by the Program Committee, and 40 industrial/government track submissions were reviewed by the Industrial/Government Track Program Committee.
For the research track, 34 were accepted for oral presentation and 36 were accepted for poster presentation.
For the industrial/government track, 13 were accepted for oral presentation and 10 were accepted for poster presentation.
15 workshop proposals were submitted.
9 workshops will be presented at the conference.
13 tutorial proposals were submitted.
7 tutorials will be presented at the conference.


Preliminary Program

Sunday August 24
9:00-18:00 (Concourse)
Registration


10:00-10:15 (International Ballroom - Center)
Opening Remarks
Ted Senator, General Chair
Pedro Domingos, Christos Faloutsos, Program Chairs


10:15-10:30 (International Ballroom - Center)
Award Presentations
Chairs: Mark Craven, Daryl Pregibon


10:30-11:30 (International Ballroom - Center)
Award Talk
Chair: Gregory Piatetsky-Shapiro

Innovation Award Talk by Heikki Mannila


11:30-12:30 (International Ballroom - Center)
KDD Cup Awards
Chairs: Johannes Gehrke, Paul Ginsparg, Jon Kleinberg


12:30-14:00
Lunch (on your own)


14:00-15:00 (International Ballroom - Center)
Joint KDD/ICML Invited Talk
Chair: Pedro Domingos

Statistical Learning from Relational Data
Daphne Koller, Stanford University


15:00-16:00 (International Ballroom - Center)
Joint KDD/ICML Session I
Chair: Pedro Domingos

BEST RESEARCH PAPER AWARD
Maximizing the Spread of Influence through a Social Network
David Kempe, Jon Kleinberg, Eva Tardos

Bayesian Network Anomaly Pattern Detection for Disease Outbreaks
Weng-Keen Wong, Andrew Moore, Gregory Cooper, Michael Wagner


15:00-18:30 (Georgetown Room)
Tutorial: The Top 10 Data Mining Mistakes-and How to Avoid Them
John F. Elder, Elder Research, USA


16:00-16:30
Coffee Break


16:30-18:30 (International Ballroom - Center)
Joint KDD/ICML Session II
Chair: Tom Fawcett

XRules: An Effective Structural Classifier for XML Data
Mohammed Zaki, Charu Aggarwal

Learning on the Test Data: Leveraging "Unseen" Features
Ben Taskar, Ming Fai Wong, Daphne Koller

Information-Theoretic Co-clustering
Inderjit Dhillon, Subramanyam Mallela, Dharmendra Modha

ICML BEST STUDENT PAPER AWARD
A Kernel between Sets of Vectors
Risi Kondor, Tony Jebara


Monday August 25
7:30-8:30
Continental Breakfast


8:00-18:00 (Concourse)
Registration


10:00-17:00 (Exhibit Hall)
Exhibits


8:30-9:30 (International Ballroom - Center)
Invited Talk
Chair: Christos Faloutsos

On-Line Science: The World-Wide Telescope as a Prototype for the New Computational Science
Jim Gray, Microsoft Research


9:30-10:00
Coffee Break


10:00-12:00 Research Track 1 (Monroe Room)
Clustering and Pattern Discovery
Chair: Gregory Piatetsky-Shapiro

Privacy-Preserving K-Means Clustering over Vertically Partitioned Data
Jaideep Vaidya, Chris Clifton

Assessment and Pruning of Hierarchical Model Based Clustering
Jeremy Tantrum, Alejandro Murua, Werner Stuetzle

Generative Model-Based Clustering of Directional Data
Arindam Banerjee, Inderjit Dhillon, Joydeep Ghosh, Suvrit Sra

An Iterative Hypothesis-Testing Strategy for Pattern Discovery
Richard Bolton, Niall Adams


10:00-12:00 Research Track 2 (Military Room)
Temporal Data
Chair: Sunita Sarawagi

Indexing Multi-Dimensional Time-Series with Support for Multiple Distance Measures
Michail Vlachos, Marios Hadjieleftheriou, Dimitrios Gunopulos, Eamonn Keogh

Translation-Invariant Mixture Models for Curve Clustering
Darya Chudova, Scott Gaffney, Eric Mjolsness, Padhraic Smyth

Generating English Summaries of Time Series Data Using the Gricean Maxims
Somayajulu Sripada, Ehud Reiter, Jim Hunter, Jin Yu

To Buy or Not to Buy: Mining Airline Fare Data to Minimize Ticket Purchase Price
Oren Etzioni, Craig Knoblock, Rattapoon Tuchinda, Alexander Yates


10:00-12:00 Industrial/Govt. Track (Georgetown Room)
IT
Chair: Michael Pazzani

Passenger-Based Predictive Modeling of Airline No-show Rates
Richard D. Lawrence, Se J. Hong, Jacques Cherrier

The Data Mining Approach to Automated Software Testing
Mark Last, Menahem Friedman, Abraham Kandel

Critical Event Prediction for Proactive Management in Large-scale Computer Clusters
R. K. Sahoo, A. J. Oliner, I. Rish, M. Gupta, J. E. Moreira, S. Ma

Information Awareness: A Prospective Technical Assessment
David Jensen, Matt Rattigan, Hannah Blau


12:00-13:30 (International Ballroom - Center)
Lunch


13:30-15:00 Research Track 1 (Military Room)
Classification and Contrast Sets
Chair: Lorenza Saitta

Classifying Large Data Sets Using SVMs with Hierarchical Clusters
Hwanjo Yu, Jiong Yang, Jiawei Han

Cross-Training: Learning Probabilistic Mappings Between Topics
Sunita Sarawagi, Soumen Chakrabarti, Shantanu Godbole

On Detecting Differences Between Groups
Geoff Webb, Shane Butler, Douglas Newlands


13:30-15:00 Industrial/Govt. Track (Monroe Room)
Science
Chair: R. Bharat Rao

Capturing Best Practice for Microarray Gene Expression Data Analysis
Gregory Piatetsky-Shapiro, Tom Khabaza, Sridhar Ramaswamy

Frequent-Subsequence-Based Prediction of Outer Membrane Proteins
Rong She, Fei Chen, Ke Wang, Martin Ester, Jennifer L. Gardy, Fiona S. L. Brinkman

Discovery of Climate Indices using Clustering
Michael Steinbach, Pang-Ning Tan, Vipin Kumar, Steven Klooster, Christopher Potter


13:30-17:00 (Georgetown Room)
Tutorial: Multi-Relational Data Mining
Luc De Raedt, Albert-Ludwigs-University Freiburg, Germany
Saso Dzeroski, Jozef Stefan Institute, Slovenia


15:00-15:30
Coffee Break


15:30-17:00 (International Ballroom - Center)
Panel: Privacy and Data Mining: Friends or Foes?
Chair: Dr. Rakesh Agrawal, IBM Almaden Research Center

Panelists:
Prof. Christopher Clifton, Purdue University
Dr. Lawrence Cox, National Center for Health Statistics
Mr. James Dempsey, Center for Democracy & Technology
Mr. Daniel Gallington, Potomac Institute
Prof. Latanya Sweeney, Carnegie Mellon University
Dr. Bhavani Thuraisingham, National Science Foundation
Prof. Jeff Ullman, Stanford University


17:00-18:30 (International Ballroom - Center)
Poster Highlights
Chair: Usama Fayyad


18:30-20:30 (Exhibit Hall)
Poster Session and Reception


Tuesday August 26
7:30-8:30
Continental Breakfast


8:00-18:00 (Concourse)
Registration


10:00-17:00 (Exhibit Hall)
Exhibits


8:30-9:30 (International Ballroom - Center)
Invited Talk
Chair: Paul Bradley

Analyzing Customer Behavior at Amazon.com
Andreas Weigend, Chief Scientist, Amazon.com


9:30-10:00
Coffee Break


10:00-11:30 Research Track 1 (Monroe Room)
Relational and Graph Data
Chair: Ray Mooney

Aggregation-Based Feature Invention and Relational Concept Classes
Claudia Perlich, Foster Provost

Algorithms for Estimating Relative Importance in Networks
Scott White, Padhraic Smyth

CloseGraph: Mining Closed Frequent Graph Patterns
Xifeng Yan, Jiawei Han


10:00-11:30 Research Track 2 (Georgetown Room)
Data Streams and Sequential Data
Chair: Johannes Gehrke

Mining Concept-Drifting Data Streams using Ensemble Classifiers
Haixun Wang, Wei Fan, Philip Yu, Jiawei Han

Efficient Elastic Burst Detection in Data Streams
Yunyue Zhu, Dennis Shasha

Fragments of Order
Aristides Gionis, Teija Kujala, Heikki Mannila


10:00-11:30 Industrial/Govt. Track (Military Room)
Healthcare
Chair: Eric Bloedorn

Mining Hepatitis Data with Temporal Abstraction
Tu B. Ho, Trong Dung Nguyen, S. Kawasaki, S. Q. Le, H. Yokoi, K. Takabayashi

Clinical and Financial Outcomes Analysis with Existing Hospital Patient Records
R. Bharat Rao, Radu S. Niculescu, Colin Germond, Harsha Rao

BEST APPLICATION PAPER AWARD
Empirical Bayesian Data Mining for Discovering Patterns in Post-Marketing Drug Safety
David M. Fram, June S. Almenoff, William DuMouchel


11:45-13:45 (International Ballroom - Center)
SIGKDD Business Lunch


14:00-15:30 Research Track 1 (Monroe Room)
Web Mining and Data Cubes
Chair: Ronny Kohavi

Eliminating Noisy Information in Web Pages for Data Mining
Lan Yi, Bing Liu, Xiaoli Li

SEWeP: Using Site Semantics and a Taxonomy to Enhance the Web Personalization Process
Magdalini Eirinaki, Michalis Vazirgiannis, Iraklis Varlamis

Extracting Semantics from Data Cubes using Cube Transversals and Closures
Alain Casali, Rosine Cicchetti, Lotfi Lakhal


14:00-15:30 Research Track 2 (Georgetown Room)
Distance-Based Methods
Chair: Martin Ester

Towards Systematic Design of Distance Functions for Data Mining Applications
Charu Aggarwal

Mining Distance-Based Outliers in Near Linear Time with Randomization and a Simple Pruning Rule
Stephen Bay, Mark Schwabacher

Adaptive Duplicate Detection Using Learnable String Similarity Measures
Mikhail Bilenko, Raymond Mooney


14:00-15:30 Industrial/Govt. Track (Military Room)
Systems
Chair: Monte Hancock

Knowledge-Based Data Mining
Sholom M. Weiss, Stephen J. Buckley, Shubir Kapoor, Søren Damgaard

The Anatomy of a Multimodal Information Filter
Yi-Leh Wu, King-Shy Goh, Beitao Li, Huaxing You, Edward Y. Chang

Golden Path Analyzer: Using Divide-and-Conquer to Cluster Web Clickstreams
Kamal Ali, Steven P. Ketchpel


15:30-16:00
Coffee Break


15:45-18:45 (Georgetown Room)
Tutorial: Information Extraction from the World Wide Web
William Cohen, Carnegie Mellon University
Andrew McCallum, University of Massachusetts, Amherst


16:00-18:30 Research Track 1 (Monroe Room)
Frequent Sets
Chair: Geoff Webb

Screening and Interpreting Multi-item Associations Based on Log-linear Modeling
Xintao Wu, Daniel Barbara, Yong Ye

Fast Vertical Mining Using Diffsets
Mohammed Zaki, Karam Gouda

CLOSET+: Searching for the Best Strategies for Mining Frequent Closed Itemsets
Jianyong Wang, Jiawei Han, Jian Pei

Inverted Matrix: Efficient Discovery of Frequent Items in Large Datasets in the Context of Interactive Mining
Mohammad El-Hajj, Osmar R. Zaiane

Mining Unexpected Rules by Pushing User Dynamics
Ke Wang, Yuelong Jiang, Laks Lakshmanan


16:00-17:30 Research Track 2 (International Ballroom - Center)
Data Reduction and Visualization
Chair: Mihael Ankerst

Efficient Data Reduction with EASE
Hervé Brönnimann, Bin Chen, Manoranjan Dash, Peter Haas, Peter Scheuermann

PROXIMUS: A Framework for Analyzing Very High Dimensional Discrete-Attributed Datasets
Mehmet Koyuturk, Ananth Grama

Visualizing Changes in the Structure of Data for Exploratory Feature Extraction
Elias Pampalk, Werner Goebl, Gerhard Widmer


17:30-18:30 (International Ballroom - Center)
Panel: Data Mining: The Next 10 Years
Chair: Dr. Usama Fayyad, President, DMX Group

Panelists:
Dr. Rakesh Agrawal, IBM Almaden Research
Dr. Gregory Piatetsky-Shapiro, KDnuggets
Dr. Daryl Pregibon, AT&T Research
Prof. Raghu Ramakrishnan, University of Wisconsin, Madison
Dr. Ramasamy Uthurusamy, General Motors


18:30-19:30 (Adams Room)
Transfer meeting - KDD 2003 and KDD 2004 organizing committees


19:30-22:00
Program committee dinner (by invitation only)


Wednesday August 27
8:00-12:00 (Concourse)
Registration


8:30-17:00
Full Day Workshops:

BIOKDD03: Data Mining in Bioinformatics (Monroe East)

Data Cleaning, Record Linkage and Object Consolidation (Georgetown West)

Fractals and Self Similarity in Data Mining: Issues and Approaches (Map Room - terrace level)

Link Analysis (Georgetown East)

MDM/KDD 2003: Integrated Media Mining (Caucus Room - terrace level)

MRDM 2003: Multi-relational Data Mining (Monroe West)

Operational Text Classification (Hemisphere Room)

WebKDD2003: WebMining as a Premise to Intelligent and Effective Web Applications (Military Room)


8:30-12:00
Half Day Workshop:

Data Mining Standards, Services and Platforms (Conservatory - terrace level)


8:30-12:00

Tutorial: Data Mining for Computer Security (Lincoln West)
Carla Brodley, Purdue University
Philip Chan, MIT/FIT

Tutorial: Data Mining for Machine Learners (Thoroughbred Room)
Johannes Gehrke, Cornell University
Jiawei Han, University of Illinois at Urbana-Champaign


12:00-13:30
Lunch (on your own)


13:30-17:00

Tutorial: Privacy-Preserving Data Mining (Lincoln West)
Chris Clifton, Purdue University

Tutorial: Sequence Data Mining Techniques and Applications (Thoroughbred Room)
Mark Craven, University of Wisconsin, Madison
Sunita Sarawagi, IIT Bombay




Webmaster: Osmar R. Zaïane
Last updated: August 6, 2003