KDD-2001 Tutorial

E-Business Enterprise Data Mining

Usama Fayyad, digiMine, Inc.
Neal Rothleder, digiMine, Inc.
Paul Bradley, digiMine, Inc.


Abstract
Successful deployment of analytical solutions to e-business enterprise data requires data warehouse construction, efficient updates over multiple data sources, integration of data mining technologies with the underlying warehouse, and delivery of results in a form consumable by business end-users. In an e-business enterprise environment, there is a critical need to deal effectively with web-log data and to integrate information extracted from the web log with other data sources such as user profile data, product catalog information, transaction and sales data, advertising campaign data, operations data, billing information, and inventory control data. Data integration at the warehouse level often involves legacy data and data sets that grow continually over time. The "enterprise" aspect is about understanding and acting on all aspects of the data by leveraging technologies ranging from simple counts or queries to sophisticated data mining analysis. In this environment it is critical to track these analyses over time and understand the evolution of the e-business.

Once the data warehouse has been constructed and processes are in place to update it efficiently from its various data sources, the next steps involve integrating and automating data mining technology. The architectural design of the automation pipeline must consider the constraints imposed by large-scale data. Topics of focus include implementation issues, hardware-imposed constraints, and processing-time constraints. The key challenge is delivering interesting, actionable data mining results to an end-user with a background in marketing, sales, business development, or merchandising rather than data mining or advanced analytics.

Presenters
Usama Fayyad is a co-founder of digiMine and has served as its President and CEO since the company's inception in March 2000. Prior to digiMine, Usama founded and led Microsoft Research's Data Mining & Exploration (DMX) Group. His work at Microsoft Research included development of data mining components for Microsoft Site Server (Commerce Server 3.0 and 4.0) and SQL Server and OLAP Services. Usama helped establish a new industry standard in data mining based on Microsoft's OLE DB API. Prior to Microsoft Research, Usama founded the Machine Learning Systems Group and developed data mining systems at the Jet Propulsion Laboratory (JPL), California Institute of Technology. During that time he received the most distinguished excellence award from Caltech/JPL and a U.S. Government Medal from NASA. He remained affiliated with JPL as Distinguished Visiting Scientist after joining Microsoft. Usama has a Ph.D. in engineering from the University of Michigan, Ann Arbor (1991). He has served as program co-chair of KDD-94 and KDD-95 and as general chair of KDD-96 and KDD-99. Usama serves as Editor-in-Chief of both the journal Data Mining and Knowledge Discovery and SIGKDD Explorations.

Neal Rothleder serves as the Lead Program Manager for Data Mining at digiMine, Inc. His focus is on delivering powerful, scalable data mining solutions to business users in an intuitive, actionable framework. His research interests include machine learning approaches to data mining, with a recent focus on making academic research work on real-world problems and on incorporating domain knowledge into data mining. Prior to joining digiMine, Dr. Rothleder was a Lead Engineer with the MITRE Corporation, working on research and development in data mining technologies and applications. While there, he worked on projects in network intrusion detection, aviation safety, and a variety of fraud detection scenarios. Dr. Rothleder has held adjunct faculty appointments at the University of Michigan and George Mason University. He holds a Ph.D. and an M.S. in Computer Science and Engineering from the University of Michigan.

Paul Bradley is Data Mining Development Lead at digiMine. His primary focus is on integrating data mining technology into digiMine's service offering. Prior to joining digiMine, he was a Researcher in the Data Management, Exploration and Mining Group at Microsoft Research. While at Microsoft Research, he worked on developing new data mining algorithms and on shipping data mining components in Microsoft products such as SQL Server and Commerce Server. His research interests include classification and clustering algorithms, the underlying mathematical problem formulations, and issues related to scalability. He received his Ph.D. from the University of Wisconsin in 1998 for work on mathematical programming and data mining. Paul serves as Associate Editor of SIGKDD Explorations, is KDD-2001 Exhibits Chair, and was KDD-2000 Publicity Chair.