
Following the success of the demonstration sessions in previous
KDD conferences, the KDD-97 program will also
include demonstrations of knowledge discovery products, knowledge
discovery applications and research
prototypes. Unlike previous demonstration sessions, we will clearly
differentiate between commercial product
demonstrations and research demonstrations.


Both commercial product exhibits and research prototype demonstrations
will be held at the
Newport Beach Marriott Hotel and Tennis Club, Newport Beach, California,
Friday, August 15, 1997, from 12:30 pm to 5:00 pm.


We invite commercial vendors to exhibit at KDD-97.
The exhibitor fee for KDD-97 will be a nominal $250.00.
Exhibitors will be provided with a 6 foot table top.
In this space vendors will be allowed to distribute product or
company literature, show product demonstrations and set up signage.
Vendors will need to bring all necessary
hardware and software that they will require for their demonstrations.
The exhibit area will be open August 15th from 12:30-5:00 pm.
Total attendance at KDD-96 was 457. Of these 35 percent were affiliated
with universities and 65 percent were
affiliated with industry. If you would like to exhibit at KDD-97 please
fill out the registration form and send it along
with the name of your product(s) and/or service(s) and a 200 word (maximum)
description of product(s)/service(s) to:
AAAI
KDD-97 Exhibit
445 Burgess Drive
Menlo Park, CA 94025 USA
Your description will be published in the conference program.
The current list of exhibitors is shown below.
AcknoSoft
- Contact: Michel Manago
- 58 rue du Dessous des Berges
- 75013, Paris
- France
- Tel: 331 44 24 88 00
- Fax: 331 44 24 88 66
- Email: manago@ibpc.fr
- Web: www.acknosoft.com
The KATE suite includes KATE datamining to automatically generate
decision trees from cases, KATE-CBR to retrieve cases that are similar
to a query and reason by analogy, and KATE-editor to build an
object-oriented database or link to most existing market databases. As an
option, users may add KATE for R/3 Service Management to link to SAP and
KATE WebServer to access the decision support system over an intranet or
the internet. Written in C, KATE is also available as dynamic link
libraries in Windows or as shared libraries on Unix (Sun Solaris). The
KATE suite is tailor made for analyzing complex data in technical
domains. Applications of KATE include: maintenance of Boeing 737 engines
(Cfm), of marine diesel engines (New Sulzer Diesel, Switzerland), of
Naples' metro (Ansaldo in Italy), of the French high speed train TGV
(GEC Alsthom), help desk for robots in the plastic industry (Sepro in
France), telecom networks (Alcatel in France), CAD/CAM workstations
(Mercedes, Germany), reliability analysis of offshore platforms (Norsk
Hydro, Norway), nuclear power plants (French electricity), industrial gas
meters (French Gas), the Ariane space center (Matra), quality management
in manufacturing (Schlumberger), evaluation of costs for manufacturing
plastic parts (Legrand, France), sales support for electronic devices
(Analog Devices).
IBM Corporation
- Contact: Carolina Salcedo
- Route 100, Maildrop 3128
- Somers, NY 10589
- Tel: (914) 766-3929
- Fax: (914) 766-8328
- Email: csalcedo@vnet.ibm.com
Researchers at many IBM labs around the world are continuously
developing powerful algorithms for analyzing large data sets stored in
databases or flat files. These algorithms cover a full spectrum of data
mining technologies and enable analyses ranging from classification and
predictive modeling to association discovery and database segmentation.
They can run on small workstations but are highly scalable since they
have parallel implementations optimized to handle large parallel
supercomputers and databases. Along with a comprehensive set of data
analysts and application developers through IBM's flagship data mining
technology product, the Intelligent Miner. Leveraging the Intelligent
Miner technology, IBM has developed a collection of applications,
Business Discovery Solutions (BDS), to make data mining more accessible
to business users. Through an easy-to-use Java GUI, BDS addresses
business problems such as customer retention, risk analysis, store
layout optimization, and cross selling. IBM's top-caliber data mining
analysts have extensive industry expertise and have helped more than 60
companies exploit the new developments in data mining. IBM, on whose
machines 70% of the world's data reside, supports all the components
required to guarantee a customer's success in data mining.
Information Discovery, Inc.
- Contact: Pamela Lerwick
- 703B Pier Avenue
- Suite 169
- Hermosa Beach, CA 90254
- Tel: (310) 937-3600
- Fax: (310) 937-0967
- Email: datamine@ix.netcom.com
- Web: www.datamining.com
The Data Mining Suite (TM) is a comprehensive and integrated set of data
mining products that provide complete solutions for knowledge discovery,
predictive modeling and internet/intranet-based applications within a
unified framework. Each product is novel and useful in its own right,
but the joint application of the techniques used within the suite
delivers unprecedented benefits to corporate users. The Data Mining
Suite (TM) is highly scalable and accesses large SQL databases directly
without sampling or extracts.
Kluwer Academic Publishers
- Contact: Adam Chesler
- PO Box 358
- Accord Station
- Hingham, MA 02018-0358
- Tel: (617) 871-6600
- Fax: (617) 871-6528
- Email: achesler@wkap.com
- Web: www.wkap.nl
Kluwer Publishers will have the journal Data Mining and Knowledge
Discovery (Editors-in-chief: Usama Fayyad, Heikki Mannila, and Gregory
Piatetsky-Shapiro) on display! Many other fine journals are also
available for review, as well as over 25 new books, discounted 20% for
KDD '97 attendees!
MathSoft, Inc.
- Contact: Tina Styer
- Data Analysis Products Division
- 1700 Westlake Ave. N. #500
- Seattle, WA 98109
- Tel: 800-569-0123
- Email: mktg@statsci.com
- Web: www.mathsoft.com
MathSoft, Inc., is a leading provider of knowledge discovery software,
with more than a million users worldwide. MathSoft has a flexible family
of products, offering solutions for data analysis, statistical data
mining and decision support. S-Plus is the premier solution for powerful
data analysis, visualization and statistical data mining. S-Plus offers
the richest data analysis environment available, with over 2,000
built-in functions including both classical and robust techniques, all
within a customizable, intuitive user interface. In addition, exclusive
TRELLIS graphics allow you to reveal hidden meanings in complex,
multidimensional data. MathSoft StatServer is a powerful new approach
that delivers a competitive edge in decision support. The first
enterprise-wide solution for distributing sophisticated analyses and
graphics, StatServer leverages your company's existing client/server and
Internet/intranet technology to put information in the hands of decision
makers. Stop by to see a demonstration of the power of S-Plus and
StatServer.
Morgan Kaufmann Publishers
- Contact: Patricia Kim
- 340 Pine Street, 6th Floor
- San Francisco, CA 94104-3205
- Tel: (650) 392-2665
- Fax: (650) 982-2665
- Email: pkim@mkp.com
- Web: www.mkp.com
Since 1984, Morgan Kaufmann has published the finest technical
information resources for computer and engineering professionals. Our
audience includes the research and development communities, information
technology (IS/IT) managers, and students in professional degree
programs. We publish in book and digital form in such areas as
databases, computer networking, computer systems, human computer
interaction, computer graphics, multimedia information and systems,
artificial intelligence, and software engineering. Many of our books are
considered to be the definitive works in their fields. Please stop by
our display and receive a 15% Knowledge Discovery and Data Mining 1997
conference discount.
PC AI Magazine
- Contact: Robin Okun
- P.O. Box 30130
- Phoenix, AZ 85046
- Tel: (602) 971-1869
- Fax: (602) 971-2321
- Email: robin@pcai.com
- Web: www.pcai.com/pcai/
PC AI Magazine provides the information necessary to help managers,
programmers, executives, and other professionals understand the quickly
unfolding realm of artificial intelligence (AI) and intelligent
applications (IA). PC AI addresses the entire range of personal
computers including the Mac, IBM PC, NeXT, Apollo, and more. PC AI
features developments in expert systems, neural networks, object
oriented development, and all other areas of artificial intelligence.
Feature articles, product reviews, real-world application stories, and a
Buyer's Guide present a wide range of topics in each issue.
Salford Systems
- Contact: Kerry Martin
- 8880 Rio San Diego Dr.
- Suite 1045
- San Diego, CA 92108
- Tel: (619) 543-8880
Salford Systems is exhibiting CART 3.0 (Classification and Regression
Trees), a tree-structured data-mining tool, co-developed with the
original authors of CART at UC Berkeley and Stanford (Breiman, Friedman,
Olshen, and Stone). The premier decision tree tool, complete with
built-in n-fold cross-validation, user-definable variable
misclassification costs, linear combination splits, efficient handling
of high-dimensional categorical predictors, and the new feature of
combining multiple trees, is now available in one affordable package.
CART 3.0 has a completely revised Windows graphical user interface and
the ability to interact with the tree after database analysis. Useful
diagnostics include dynamic pruning, gains charts for all sub-trees, and
simultaneous viewing of training and test data accuracy scores.
Experienced users can control CART 3.0 via command scripts, while
newcomers can use point-and-click menu selections. Throughout an
interactive session using the menu, CART 3.0 records command
equivalents, providing an audit trail for the session. An in-depth,
comprehensive manual explains every feature and nuance of CART within
the context of over 30 examples. Salford Systems has been developing
advanced tools for data analysis for PC and Unix platforms since 1983,
and also provides consulting services to the telecommunications,
financial services, health care and direct mail industries.
Silicon Graphics
- Contact: Ron Kohavi
- Mailstop 80-876
- 2011 N. Shoreline Blvd.
- Mountain View, CA 94043
- Tel: (650) 933-3126
- Fax: (650) 932-2874
- Web: www.sgi.com
MineSet (TM) version 2.0 is the fourth release of SGI's product for
exploratory data analysis. Combining powerful integrated, interactive
tools for data access and transformation, data mining, and visual data
mining, MineSet provides you with a revolutionary paradigm for getting
maximum value from your vast data resources. MineSet enables you to gain
a deeper, intuitive understanding of your data, by helping you to
discover hidden patterns, important trends and new knowledge. It is this
deep understanding which can be used for developing powerful business
strategies leading to greater competitive advantage.
SRA International
- Contact: Jim Hayden
- 4300 Fair Lakes Court
- Fairfax, VA 22033
- Tel: (703) 803-1689
- Fax: (703) 803-1793
- Email: Jim_Hayden@SRA.com
- Web: www.SRA.com
SRA International has been creating innovative solutions to practical
problems faced by businesses and government agencies for over eighteen
years. We specialize in the fields of intelligent information retrieval;
machine learning; knowledge-based systems; database engineering; and
natural language processing. SRA empowers organizations with the ability
to discover and detect patterns critical to their success through the
use of a complete line of scalable data mining tools and professional
services. SRA's KDD Toolset includes multi-strategy algorithms for
discovering Associations, Classifications, Sequences, and Clusters, as
well as high-speed rule and sequence-based pattern matching algorithms.
These algorithms employ direct database access for mining data.
Additionally, they are parallelized to take advantage of multiprocessor
platforms for rapid analysis of extremely large data sets. Finally, we
employ a comprehensive set of JDBC-compliant Java-based user interfaces
for configuration and execution of algorithms as well as visualization
of results for analysis and interpretation. SRA's knowledge discovery
specialists understand how best to apply these advanced capabilities to
enable you to utilize your most strategic asset: electronic information.
Together, SRA's KDD Toolset and professional services provide solutions
that are ideal for tackling such large-scale problems as fraud detection
and prevention, competitive intelligence, and market behavior.
Torrent Systems Inc.
- Contact: Sondra Barrison
- 5 Cambridge Center
- Cambridge, MA 02138
- Tel: (617) 354-8484
- Fax: (617) 354-6767
- Email: sondra@torrent.com
Torrent Systems, Inc. develops and markets tools, component software and
applications to support system integrators and applications developers
in building advanced data warehousing and data mining applications that
run on parallel processing systems. ORCHESTRATE, Torrent's parallel
development environment, hides the complexity of parallel programming
and facilitates the creation of fully parallel, high-performance
data-processing solutions for your MPP or SMP systems. ORCHESTRATE is
fully compatible with all major scalable servers including the IBM
RS/6000, SP server using IBM DB2, Parallel Sysplex, Sun, HP, NCR,
Digital and Intel and supports Oracle Parallel Server, Informix XPS, and
IBM DB2/PE. Torrent Systems is headquartered in Cambridge, MA.
Toshiba Corporation
- Contact: Hiroshi Tsukimoto
- 70 Yanagi-cho, Saiwai-ku
- Kawasaki
- Japan
- Tel: 81-44-548-5469
- Fax: 81-44-520-5856
- Email: tukimoto@ssel.toshiba.co.jp
NNE is a neural network based data mining tool. NEX (Neural network
EXplainer) is the explanation module of NNE. NEX provides the
explanation for trained neural networks by extracting rules from the
networks. Since trained neural networks are black boxes and are
difficult for humans to understand, the neural network is incomplete as
a technique of data mining, which aims to discover understandable
knowledge from databases. So NEX makes the neural network a complete
data mining technique. NEX has the following features. (1) NEX can be
applied to any neural network including recurrent neural networks. (2) NEX
can be applied to any training method. (3) NEX can be applied not only to
discrete values but also to continuous values. (4) NEX extracts accurate
and simple rules in a short time. In particular, when classes are
continuous, there is no other systematic method which can discover
understandable knowledge. NEX discovers understandable knowledge, that
is, rules, from trained neural networks. The accuracies of the rules are
based on the trained networks. NEX can be connected to any neural
network. So any neural network user can obtain an explanation for
trained neural networks just by connecting NEX to their neural networks.
WizSoft Ltd.
- Contact: Abraham Meidan
- 3 Beit Hillel St.
- Tel Aviv
- Israel, 67017
- Tel: 972-2-5631948
- Fax: 972-3-5611945
- Email: abraham@wizsoft.com
- Web: www.wizsoft.com
WizWhy for Windows 95/Windows NT is a data mining application for
issuing predictions and classifications. WizWhy analyzes the data, reveals
all the if-then rules and mathematical formula rules, and calculates the
significance level of each rule. WizWhy then predicts future cases based
on the discovered rules. In empirical tests WizWhy was found to be
faster and more accurate than neural networks, decision trees and
genetic algorithms.
WizRule for Windows 95/Windows NT is a data cleansing and auditing tool.
WizRule reveals all if-then rules and formula rules in the database. It
then points at the deviations from the set of all the discovered rules
as suspected errors, and calculates the level of unlikelihood of each
deviation. WizRule avoids false alarms; almost every deviation with a
high level of unlikelihood is indeed an error.


We are also soliciting demonstrations of research prototypes at KDD-97.
This demonstration session will be held on August 15 from 12:30 to 5:00
pm. We have a limited budget for providing hardware for research
demonstrations. This year we will give priority to demonstrations that
are in conjunction with accepted papers at KDD-97. Within budget and
space constraints we will make every effort to accommodate as many
demonstrations as possible. If you would like your demonstration to be
considered for KDD-97 please provide the following information to Tej
Anand (tej.anand@atlantaga.ncr.com) by June 1, 1997:
- Name of demonstration
- Title of paper (if this demonstration is in conjunction with a
paper/poster at KDD-97)
- Development team
- Affiliations of development team members
- Contact telephone number
- Description of demonstration (approximately 200 words)
- What is unique about your system or application? (No more than 50 words)
- Status: Is the system a research prototype, a commercially available
product, or a fielded application?
- Hardware required: Are there any special memory or disk requirements?
- Operating system (specific version number)
- WAN connection needed? (Are there any special modem requirements?)
- Will you bring your own hardware?
- Any other requirements?
The current list of demonstrations is shown below.
An Interactive Visualization Environment for Data Exploration
Paper Title: An Interactive Visualization Environment for Data Exploration
Development Team: Mark Derthick, John Kolojejchick, and
Steven F. Roth, Carnegie Mellon University
Telephone: 412-268-8812
We will demonstrate an information-centric interface architecture for
unifying the subtasks of knowledge discovery, allowing the analyst to
focus on the process rather than the tools. These subtasks include
data cleaning, creating a dataset, data reduction and projection, and
exploratory visualization. Architectural integration is achieved with a
shared object-oriented database, which is accessible to the user via a
visual query language. Tight integration between queries and
visualizations makes exploration more interactive and less of a
feed-forward process than in previous systems.
What Is Unique about the System?
Tight integration of querying and visualization. Interactive
visualizations involving attributes of multiple objects.
Document Explorer
Paper Titles: (1) Visualization Techniques to Explore Data
Mining Results for Document Collections, and (2) Maximal Association Rules: A
New Tool for Mining for Keyword Co-Occurrences in Document Collections
Development Team: Ronen Feldman and Amir Zilberstein,
Bar-Ilan University; Willi Kloesgen, GMD
Telephone: 972-3-5318629 or 972-3-9326702
Document Explorer is a data mining system for document collections. Such
a collection represents an application domain, and the primary goal of
the system is to derive patterns that provide knowledge about this
domain. Additionally, the derived patterns can be used to browse the
collection. Document Explorer searches for patterns that capture
relations between concepts of the domain. The patterns which have been
verified as interesting are structured and presented in a visual user
interface allowing the user to operate on the results to refine and
redirect mining queries or to access the associated documents. The
system offers preprocessing tools to construct or refine a knowledge
base of domain concepts and to create an intermediate representation of
the document collection that will be used by all subsequent data mining
operations. The main pattern types the system can search for are
frequent sets, associations, concept distributions, and keyword graphs.
To let the user provide some explicit bias, the system provides a dedicated
query language for searching the vast implicit spaces of pattern
instances that exist in the collection. This query language offers
syntactical, background, quality and redundancy constraints. The query
language is embedded in a GUI which makes it easy even for novice users
to explore the document collections.
What Is Unique about the System?
(1) Dealing with unstructured information; (2)
unique visualization tools; (3)
special query language designed for text mining; and (4)
unique browsers that enable interactive exploration.
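
As a rough illustration of the "frequent sets" pattern type named in the description above, the following Python sketch counts keyword sets that occur together in at least a minimum number of documents. It is not Document Explorer's code: the function name `frequent_keyword_sets`, the toy documents, and the brute-force enumeration are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_keyword_sets(docs, min_support, max_size=2):
    """Return {keyword set: document count} for every keyword set of up to
    max_size keywords that appears in at least min_support documents."""
    counts = Counter()
    for keywords in docs:
        unique = sorted(set(keywords))  # count each keyword once per document
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical toy collection: each document is reduced to a list of keywords.
docs = [
    ["iran", "oil", "usa"],
    ["iran", "oil"],
    ["usa", "trade"],
    ["iran", "usa", "oil"],
]
result = frequent_keyword_sets(docs, min_support=3)
for keyword_set, count in result.items():
    print(sorted(keyword_set), count)
```

A real system would prune candidates level by level rather than enumerate every combination; the exhaustive loop is kept here only because it is short.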
GeoMiner: A Geo-Spatial Data Mining Engine
Paper Title: Described in Proceedings of SIGMOD'97
Development Team: Jiawei Han, Krzysztof Koperski, Nebojsa
Stefanovic, and Qing Chen, Simon Fraser University
Intelligent Database Systems Research Lab
Telephone: (604) 291-4411
Spatial data mining is the mining of high-level spatial information and
knowledge from large spatial databases. A spatial data mining system
prototype, GeoMiner, has been designed and developed based on our years
of experience in the research and development of the relational data mining
system DBMiner and our research into spatial data mining. The data
mining power of GeoMiner includes five spatial data mining modules:
characterization, comparison, association, clustering, and
classification. The SAND (Spatial And Nonspatial Data) architecture is
applied in the modeling of spatial databases, whereas GeoMiner includes
the spatial data cube construction module, spatial on-line analytical
processing (OLAP) module, and spatial data mining modules. A spatial
data mining language, GMQL (Geo-Mining Query Language), is designed and
implemented as an extension to Spatial SQL, for spatial data mining.
Moreover, an interactive, user-friendly data mining interface is
constructed and tools are implemented for visualization of discovered
spatial knowledge.
What Is Unique about the System?
A spatial data mining system performing knowledge discovery based on
both spatial and non-spatial properties of objects. It also includes
spatial OLAP modules and tools for visualization of discovered spatial
knowledge.
Id-Vis
Paper Title: A Visual Interactive Framework for Attribute
Discretization
Development Team: Ramesh Subramonian, Ramana Venkata, and
Joyce Chen, Microcomputer Research Laboratory, Intel Corporation
Telephone: (408) 653-6794 (Ramana Venkata)
We will demonstrate the Discretizer module of Id-Vis, our interactive
platform for visual data mining on a client-server architecture. This
module features multiple available algorithms, a drag-and-drop cut-point
editor, and multiple levels of data visualization with drill-down
capability. A number of the variables are exposed to user experimentation. The
system provides visual cues to the "optimal" number and locations of the
cut-points. It also provides feedback to the user about the extra
"badness," over the system-derived optima, incurred during the
experimentation.
The central philosophy is that the system should place the user within
the appropriate context of system-derived values, and provide the user
with the opportunity to intelligently modify (e.g. display the impact on
accuracy of these modifications) these optima and propagate them
downstream. The server originally ships a compacted version of the raw
data, or an equi-probabilistically distilled density function to the
client. If, during the drill-down process, the user's information
request and the associated accuracy constraints cannot be satisfied by
the locally available data, the client obtains the appropriate data from
the server and displays it.
What Is Unique about the System?
Visual interactivity; opportunity for the user to encode his or her
intuition/domain knowledge into and during the mining process; a
feedback-loop paradigm of data mining as a learning process.
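
The idea of reporting the extra "badness" of a user-chosen cut-point relative to a system-derived optimum can be sketched as follows. The description does not say which measure Id-Vis uses, so this example assumes class entropy after a single binary cut; the function names and the toy data are hypothetical.

```python
from math import log2


def class_entropy(labels):
    """Shannon entropy (in bits) of the class labels in one interval."""
    n = len(labels)
    if n == 0:
        return 0.0
    entropy = 0.0
    for label in set(labels):
        p = labels.count(label) / n
        entropy -= p * log2(p)
    return entropy


def split_entropy(values, labels, cut):
    """Weighted class entropy after splitting the attribute at `cut`."""
    left = [lab for v, lab in zip(values, labels) if v <= cut]
    right = [lab for v, lab in zip(values, labels) if v > cut]
    n = len(labels)
    return (len(left) / n) * class_entropy(left) + (len(right) / n) * class_entropy(right)


def extra_badness(values, labels, user_cut):
    """Entropy of the user's cut minus the entropy of the best single cut."""
    candidates = sorted(set(values))[:-1]  # cutting at the maximum splits nothing
    best = min(split_entropy(values, labels, c) for c in candidates)
    return split_entropy(values, labels, user_cut) - best


# Hypothetical attribute values and class labels.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "b"]
print(extra_badness(values, labels, user_cut=3))  # the optimal cut, so no penalty
print(extra_badness(values, labels, user_cut=1))  # a worse cut, positive penalty
```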
Interactive Knowledge Exploration Using DBMiner
Paper Title: Metarule-Guided Mining of Multi-Dimensional
Association Rules Using Data Cubes
Development Team: Jiawei Han, Jenny Y. Chiang, Sonny Chee, Shan Cheng, Wan
Gong, Micheline Kamber, Kris Koperski, Yijun Lu, Nebojsa Stefanovic,
Lara Winstone, Betty Xia, Osmar R. Zaiane, and Hua Zhu, Simon Fraser
University
Telephone: (604) 291-4411
After years of research and development, the DBMiner system,
developed at CS/SFU, Canada, incorporates many advanced research
results. These include the mining of multiple-level knowledge in large
relational databases and data warehouses; a wide spectrum of data mining
functions, including characterization, comparison, association,
classification, prediction, and clustering; and a data visualization
package. The major technologies adopted are integration with data
warehouse and OLAP technology, attribute-oriented induction, statistical
analysis, progressive deepening for mining multiple-level knowledge,
and meta-rule guided mining. The system provides a user-friendly,
interactive data mining environment with good performance.
What Is Unique about the System?
Integration with OLAP, multiple-level mining and multiple data
mining modules.
Kensington - High-Performance Distributed and Parallel Data Mining
Paper Title: Large Scale Data Mining: Challenges and Responses
Development Team: J. Chattratichat, J. Darlington, M. Ghanem,
Y. Guo, M. Kohler, A. Saleem, J. Sutiwaraphun, and D. Yang,
Department of Computing,
Imperial College
Telephone: +44-171-5948360
Kensington is a prototype of an open, distributed, web-based,
high-performance data mining system for use on parallel servers. A
web-based client-tool, written in Java, gives access to a distributed
collection of databases and data mining modules which are executed on a
high-performance parallel machine. The application consists of
components which are integrated using Java/CORBA middleware. The main
components are: a database server, a data mining server, a visualisation
and Web-server, and the client control tool.
The data mining servers are embedded in CORBA objects and distributed
across a LAN or WAN. The database server is accessed via JDBC. The
client-tool can access and control the data mining actions from
anywhere. The high-performance mining modules are portable parallel
implementations in C and MPI, and cover commonly needed functions such
as classification (C4.5), prediction (neural networks), association
discovery and self-organizing maps. The light-weight client tool is
enriched by visualization applets for data mining results, whenever they
become available.
The overall goal is (a) to integrate distributed servers, e.g. data and
computation servers, and (b) to make the integrated system universally
accessible over the Web.
What Is Unique about the System?
The Kensington architecture combines a flexible integration
approach, based on distributed object technology, with high-performance
data mining components. The component technology is platform-independent,
and allows straightforward extension of the system. Accessibility over
the Web is an inherent part of all components. In summary, the key
features of Kensington are: accessibility, extensibility, distributed
object architecture, platform-independence, high performance components.
Mining For Many Kinds of Knowledge
Paper Title: Knowledge = Concepts: A Harmful Equation
Development Team: Arun Sanjeev and Jan Zytkow, Wichita State University
Telephone: 316-978-3015 (Arun Sanjeev) or 316-978-3925 (Jan Zytkow)
We will demonstrate Forty-Niner (49er), an automated discovery system
that discovers knowledge in databases. We will apply the system to
several databases to demonstrate how different forms of knowledge can be
automatically discovered. 49er searches for many types of knowledge. It
starts from contingency tables (CTs) and then recognizes special types
of CTs which lead to other, more specialized forms of knowledge, such as
equations, equivalence relations, or subset relations. When many
relations of the same type have been discovered, 49er combines them into
forms such as taxonomies and subset graphs. We will contrast 49er with
specialized systems that are focused on a single form of knowledge, even
if other forms of knowledge are much more appropriate for a given
dataset. Further, we will contrast 49er's focus on knowledge that
contains as much empirical content as possible with the focus on
concept definitions typical of machine learning. We will show how CTs
can be used to generate decision trees and how 49er's search for
additional "redundant" knowledge makes decision trees more flexible and
statistically significant. We will also contrast 49er's approach to
taxonomy formation (combine many approximate equivalence relations) with
statistical and conceptual clustering.
What Is Unique about the System?
49er is an autonomous knowledge discoverer. It automatically tunes
itself to the forms of knowledge that are appropriate for a given
dataset: equations, contingency tables, taxonomies, decision trees, and
the like. 49er explores huge hypothesis spaces, evaluating the strength
(to ensure predictive power) and significance of results (to prevent
overfitting).
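
To make the idea of testing a contingency table for significance concrete, here is a small Python sketch. The description does not say which statistic 49er applies, so Pearson's chi-square is used here as a common choice, and the function name and example tables are assumptions.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list
    of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected if the row and column attributes were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical 2x2 tables: one with independent attributes, one where the
# two attributes are equivalent (all counts on the diagonal).
independent = [[10, 20], [20, 40]]
equivalent = [[30, 0], [0, 30]]
print(chi_square(independent))
print(chi_square(equivalent))
```

A statistic near zero indicates no evidence of a relationship, while a table concentrated on the diagonal, like the second one, is the kind of special case that could suggest an equivalence relation between the two attributes.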
SONAR (System for Optimized Numeric Association Rules)
Paper Title: Computing Optimized Rectilinear Regions for Association Rules
Development Team: Takeshi Fukuda, Yasuhiko Morimoto, Hirofumi
Matsuzawa, Shinichi Morishita, Takeshi Tokuyama, and Kunikazu Yoda, IBM
Tokyo Research Laboratory
Telephone: +81-462-73-4946 (Kunikazu Yoda)
Recent progress in technologies for data input has made it easier
for finance and retail organizations to collect massive amounts of data
and to store them on disk at a low cost. Such organizations are
interested in extracting from these huge databases previously unnoticed
information that inspires new marketing strategies. In this
demonstration, we introduce a system for mining optimized association
rules and for generating decision/regression trees from databases with
numeric data as well as categorical data.
What Is Unique about the System?
Our system uses novel algorithms for efficiently creating ranges and
regions with respect to various optimization criteria such as
maximization of confidence or support, and minimization of entropy and
mean squared error.
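
The notion of an optimized range can be illustrated for the one-dimensional case: among all contiguous ranges of a bucketed numeric attribute that meet a minimum support, pick the one with the highest confidence. The sketch below is a brute-force illustration under that assumption, not the efficient algorithm the authors refer to; the function name and the bucket data are hypothetical.

```python
def best_confidence_range(buckets, min_support):
    """buckets: list of (count, hits) for consecutive buckets of a numeric
    attribute, where hits is the number of tuples in the bucket that satisfy
    the rule's consequent. Returns (start, end, confidence) for the contiguous
    bucket range with the highest confidence whose share of all tuples is at
    least min_support, or None if no range qualifies."""
    total = sum(count for count, _ in buckets)
    best = None
    for start in range(len(buckets)):
        count = hits = 0
        for end in range(start, len(buckets)):
            count += buckets[end][0]
            hits += buckets[end][1]
            if count == 0 or count / total < min_support:
                continue  # range too small to meet the support threshold
            confidence = hits / count
            if best is None or confidence > best[2]:
                best = (start, end, confidence)
    return best


# Hypothetical buckets, e.g. account-balance intervals with the number of
# customers in each and how many of them bought a given service.
buckets = [(100, 5), (100, 40), (100, 60), (100, 10)]
best = best_confidence_range(buckets, min_support=0.5)
print(best)
```

This version examines every pair of endpoints, which is quadratic in the number of buckets; it is meant only to show what is being optimized.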
S-PLUS DataBlade for Informix Universal Server
Paper Title: Data Mining with Trellis Graphics
Development Team: Kevin Brown and Jun Luo, Informix; Vikram
Chalana, Scott Blachowicz, Marianna Clark and Doug Martin, Mathsoft
Telephone: (206) 283-8802 x229 (Doug Martin)
Informix Universal Server (IUS) is an object-relational database.
S-PLUS is an object-oriented language and
system for data analysis, statistical modeling, visualization and
programming with data. An IUS DataBlade is a collection of types
(classes) of objects and access methods, that closely integrates
applications software with the IUS database, typically on the server.
The S-PLUS DataBlade for IUS provides new data types in IUS
corresponding to intrinsic S-PLUS datatypes and functions to apply any
S-PLUS expression on these datatypes. It also provides functions to
convert IUS native data to these S-PLUS datatypes and vice-versa. The
demonstration will include several data mining application examples,
including: integrated query and statistical data mining, visualizing and
modeling the relationship between equity returns, firm size and
book-to-market; robust beta mining (finding firms listed on the AMEX,
NASDAQ and NYSE for which the beta calculation is influenced by outliers,
and visualizing the data for such firms); hexagonal binning
visualization of scatterplots for largish data sets; application of
trellis graphics to a Lucent customer value analysis (CVA) study.
What Is Unique about the System?
The object-oriented aspects of IUS and S-PLUS make the S-PLUS
DataBlade a natural marriage of the two technologies: arbitrary object
types can have mirror images in the IUS database. Furthermore, the use
of the IUS SQL93 is smoothly integrated with the use of S-PLUS
functions.
