CORESTA Meeting, Smoke/Technology, Vienna, 1995, ST11

Data analysis and modeling with data mining tools

PALESIS J.A.; PODRAZA K.F.
Philip Morris USA, Richmond, VA, USA
Recent work in Artificial Intelligence has produced a wide variety of "data mining" (or "machine learning") tools for automatically extracting knowledge from data. Like Neural Networks, data mining tools build a general model of some phenomenon (process, behavior, etc.) from a "training set" consisting of specific examples that illustrate that phenomenon. Neural Network models are essentially "black boxes" in the sense that the knowledge they incorporate is not made explicit but is instead hidden in the weights of the connections between neurons. Thus, in situations where the objective is to understand the patterns and relationships underlying the data, Neural Networks are not an appropriate learning paradigm. Data mining tools, by contrast, are specifically designed to explain the data, so they typically represent the induced knowledge explicitly as rules. At Philip Morris USA R&D, data mining tools have been used in several areas, including process, product, and consumer modeling. Results indicate that data mining technologies can play a significant role in industrial research. Although they do not eliminate the need for rigorous statistical analysis, these tools can often provide a quick and easy method of data analysis and modeling. This paper provides an overview of data mining tools and methodologies and illustrates how a data mining algorithm called ID3 was used to analyze laboratory data and develop a rule-based model of an experimental chemical process.
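To make the idea of inducing explicit rules from a training set concrete, the following is a minimal sketch of the textbook ID3 procedure: at each node, split on the categorical attribute with the highest information gain, then read each root-to-leaf path as a rule. It is not the authors' implementation, which the abstract does not describe. The function names, the attribute names ("temperature", "catalyst"), and the six training examples are all invented for illustration and are not taken from the paper's laboratory data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on `attr`."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def id3(rows, labels, attrs):
    """Return a leaf label, or a tuple (attribute, {value: subtree})."""
    if len(set(labels)) == 1:          # pure node: stop
        return labels[0]
    if not attrs:                      # no attributes left: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted(set(row[best] for row in rows)):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = id3([rows[i] for i in idx],
                              [labels[i] for i in idx],
                              remaining)
    return (best, branches)


def tree_to_rules(tree, conditions=()):
    """Flatten a tree into (list of (attribute, value) tests, label) rules."""
    if not isinstance(tree, tuple):    # leaf
        return [(list(conditions), tree)]
    attr, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + ((attr, value),)))
    return rules


# Hypothetical training set (invented for this sketch, not real data).
rows = [
    {"temperature": "high", "catalyst": "A"},
    {"temperature": "high", "catalyst": "B"},
    {"temperature": "high", "catalyst": "B"},
    {"temperature": "low", "catalyst": "A"},
    {"temperature": "low", "catalyst": "B"},
    {"temperature": "low", "catalyst": "B"},
]
labels = ["good", "good", "good", "good", "poor", "poor"]

tree = id3(rows, labels, ["temperature", "catalyst"])
rules = tree_to_rules(tree)
for tests, outcome in rules:
    clause = " AND ".join(f"{a} = {v}" for a, v in tests)
    print(f"IF {clause} THEN yield = {outcome}")
```

Unlike the weights of a neural network, the printed rules can be inspected directly, which is the property the abstract emphasizes. This sketch handles only categorical attributes and does no pruning; continuous laboratory measurements would first need to be discretized.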