KDD-2001 Tutorial

Scalable Frequent-Pattern Mining Methods: An Overview

Jiawei Han, Simon Fraser University
Laks V. S. Lakshmanan, University of British Columbia
Jian Pei, Simon Fraser University


Abstract
Efficient discovery of frequent patterns from large data sets plays an essential role in data mining. It has been an active research theme in data mining, with broad applications in industry and deep implications for other themes in data warehousing and data mining. Although many efficient frequent-pattern mining techniques have been developed in the last 7-8 years, such as those listed at the end of this proposal, most of them have been published in scattered conference proceedings in several fields. The motivation for this tutorial is to present to the KDD community a comprehensive overview of this important theme and to discuss its applications.

In this tutorial, we will present a comprehensive, state-of-the-art survey of frequent-pattern mining methods and applications. In terms of pattern types, it covers mining associations, correlations, sequential patterns, max- and closed-patterns, partial periodicity, etc., and their applications in classification, data warehousing, spatial databases, multimedia databases, time-series databases, text databases, and the WWW. In terms of mining techniques, it covers Apriori and its various improvements and extensions, projection and frequent-pattern growth techniques, constraint-based mining, sequential pattern mining methods, etc.

Presenters

Jiawei Han (Ph.D., Univ. of Wisconsin at Madison, 1985) is Director of the Intelligent Database Systems Research Laboratory and Professor in the School of Computing Science at Simon Fraser University in Canada. He has conducted research in the areas of data mining, data warehousing, spatial data mining, Web mining, multimedia data mining, deductive and object-oriented databases, and logic programming, with over 150 journal and conference publications. He is a project leader of the Canada NCE/IRIS-3 project "Building, Querying, Analyzing, and Mining Data Warehouses on the Internet" (1998-2002). He has served or is currently serving on the program committees of over 50 international conferences and workshops, including SIGMOD'99, SIGKDD'99 (tutorial chairman), SIGMOD'2000 (demo chairman), EDBT'2000, VLDB'2000, SIAMDM'2001 (PC co-chairman), SIGKDD'2001 (Best paper award chairman), and PAKDD'2001 (conference co-chairman). He has also served as an editor for IEEE Transactions on Knowledge and Data Engineering, Data Mining and Knowledge Discovery, and Journal of Intelligent Information Systems. His book Data Mining: Concepts and Techniques (Morgan Kaufmann, 2000) has been widely adopted as a university textbook.

Laks V. S. Lakshmanan (Ph.D., Indian Inst. of Science, 1987) is a Professor of Computer Science at the University of British Columbia in Canada. He has conducted research in the areas of data mining, data warehousing, Web database systems, semistructured databases and XML, multidatabase interoperability, deductive and object-oriented databases, logic programming, and theoretical computer science, with numerous publications in leading conferences and journals. He has served or is currently serving on the program committees of all major database conferences and workshops. He is currently serving as a subject editor for IEEE Transactions on Knowledge and Data Engineering.

Jian Pei is a Ph.D. candidate at Simon Fraser University in Canada. He received his B.Sc. and M.Sc. from Shanghai Jiaotong University, China, and was a Ph.D. candidate at Peking University, China, before joining SFU. His current research interests include data mining, data warehousing, and WWW technology. He has published research papers at SIGMOD'2000, DMKD'2000, KDD'2000, ICDE'01, SIGMOD'01, DMKD'01, VLDB'01, etc. He expects to obtain his Ph.D. degree in the summer of 2002.