Bull. Spec. CORESTA Symposium, Winston-Salem, 1982, p. 11, S02, ISSN 0525-6240

Pattern recognition of cigarette smoke analyzed by capillary gas chromatography

Philip Morris USA, Research Center, Richmond, VA, USA
Initially, differentiation between two cigarette types was accomplished by comparing the organic gas phase profiles obtained using glass capillary gas chromatography. Although a computer was instrumental in producing these comparison profiles, pattern recognition techniques were not applied. Differentiating more than two cigarette types simultaneously requires statistical analyses, because they provide data reduction, pattern extraction, and a ranking of the importance of the gas chromatographic peaks. These tools have been applied successfully to the derivatized extracts of the particulate phase and to the gas phase of cigarette smoke.
Ten cigarettes were studied: both cased and uncased versions of 100% bright, 100% burley, 100% oriental, and blends of these three in proportions of 33%/33%/33% and 60%/30%/10%, respectively. For the gas phase data, at least five chromatographic profiles were obtained for each cigarette, giving a total of 52 chromatograms to serve as the database. Each gas phase chromatogram contained approximately 100 resolvable peaks. A subset of 30 of the most intense peaks was selected manually from each chromatogram to give an array of 52 × 30 data points. For the derivatized TPM extracts, 49 reproducibly integrated peaks, both major and minor, were selected from 75 chromatograms without knowledge of the identity of the peaks.
The data were processed by means of two software programs available on the DECSYSTEM 20/60: BMDP7M for discriminant analysis and BMDP4M for factor analysis. Discriminant analysis requires that group membership be identified before the data are classified. For example, the data representing uncased burley cigarettes are labeled as such, the data for cased bright cigarettes are labeled as such, and so forth for each group. The BMDP7M program was successful in producing both maximum intergroup separation and maximum intragroup homogeneity for our cigarette model system.
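The discriminant step described above can be illustrated with a minimal sketch. It is not the BMDP7M procedure (which is a stepwise program); it is a plain Fisher discriminant analysis written with NumPy, run on synthetic numbers that only mimic the 52 × 30 shape and the ten labeled groups of the gas phase data. The function names, group sizes, and noise levels are all assumptions made for illustration.

```python
import numpy as np


def fisher_discriminant(X, y, n_components):
    """Return a (features x n_components) projection that maximizes
    between-group scatter relative to within-group scatter."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    n_features = X.shape[1]
    Sw = np.zeros((n_features, n_features))  # within-group scatter
    Sb = np.zeros((n_features, n_features))  # between-group scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        centered = Xc - mc
        Sw += centered.T @ centered
        d = (mc - grand_mean)[:, None]
        Sb += Xc.shape[0] * (d @ d.T)
    # Eigenvectors of Sw^-1 Sb are the discriminant directions.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real


def nearest_centroid_labels(Z, y):
    """Assign each projected sample to the group with the closest centroid."""
    classes = np.unique(y)
    centroids = np.array([Z[y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]


# Synthetic stand-in data (NOT the original measurements):
# 10 groups, 52 chromatograms in total, 30 peak areas each.
rng = np.random.default_rng(0)
group_sizes = [5] * 8 + [6] * 2            # hypothetical split summing to 52
y = np.repeat(np.arange(10), group_sizes)  # known group membership
group_means = rng.normal(0.0, 1.0, size=(10, 30))
X = group_means[y] + rng.normal(0.0, 0.05, size=(52, 30))

W = fisher_discriminant(X, y, n_components=9)  # at most (groups - 1) axes
Z = X @ W                                      # discriminant scores
predicted = nearest_centroid_labels(Z, y)
accuracy = float(np.mean(predicted == y))      # resubstitution accuracy
print(f"resubstitution accuracy: {accuracy:.2f}")
```

Because the group labels are supplied up front, this is a supervised method; the accuracy printed here is measured on the same data used to fit the projection and so says nothing about performance on unseen chromatograms.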
One could reasonably expect discriminant analysis to succeed in separating the cigarettes in our model system, since they were known to be significantly different even though their chromatographic profiles were so similar. A more taxing exercise is the application of factor analysis, which does not require any information about group membership prior to classification. The BMDP4M program succeeded in discriminating the cigarettes using three factors. The degree of discrimination was unexpected, especially since the chromatograms were so similar visually. To obtain a clearer perception of how well the three factors discriminate our model system, an animation was made which rotates the product map around the Factor 1 axis in 1° increments. The animation gives the best picture of the value of pattern recognition for obtaining more useful information from chromatographic data of a matrix as complex as cigarette smoke.
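The unsupervised step and the rotation used for the animation can be sketched as follows. This is an illustration only: it extracts three factors by principal-component extraction from the correlation matrix without any factor rotation (an assumption; the abstract does not state which BMDP4M options were used), and it runs on random numbers shaped like the 52 × 30 gas phase array. All function names are hypothetical.

```python
import numpy as np


def principal_factor_scores(X, n_factors=3):
    """Extract factors from the correlation matrix of X.
    Returns (loadings, scores); scores have unit variance per factor."""
    Zs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize peaks
    R = np.corrcoef(Zs, rowvar=False)                  # peak correlations
    evals, evecs = np.linalg.eigh(R)                   # R is symmetric
    order = np.argsort(evals)[::-1][:n_factors]        # largest first
    loadings = evecs[:, order] * np.sqrt(evals[order])
    scores = Zs @ evecs[:, order] / np.sqrt(evals[order])
    return loadings, scores


def rotate_about_factor1(scores, angle_deg):
    """Rotate 3-D factor scores about the Factor 1 axis by angle_deg."""
    t = np.deg2rad(angle_deg)
    rot = np.array([
        [1.0, 0.0, 0.0],
        [0.0, np.cos(t), -np.sin(t)],
        [0.0, np.sin(t), np.cos(t)],
    ])
    return scores @ rot.T


# Synthetic stand-in data (NOT the original measurements); no group
# labels are used anywhere in this block.
rng = np.random.default_rng(1)
X = rng.normal(size=(52, 30))

loadings, scores = principal_factor_scores(X, n_factors=3)

# One frame per 1-degree step gives a full turn in 360 frames.
frames = [rotate_about_factor1(scores, angle) for angle in range(360)]
print(len(frames), frames[0].shape)
```

Each frame keeps the Factor 1 coordinate fixed and turns the Factor 2 / Factor 3 plane, so plotting two coordinates of successive frames gives the impression of a three-dimensional map turning about the Factor 1 axis.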