Near-infrared spectroscopy and pattern recognition as screening methods for classification of commercial tobacco blends
Group classification of tobacco blends is commonly performed using several types of compositional data, including tobacco compounds, additives, or process components. However, all of these wet-chemistry methods are relatively time-consuming, and a fast and reliable procedure is needed to determine the blend type. The purpose of this study was to assess the ability of near-infrared reflectance spectroscopy (NIRS) to serve as a qualitative means of classifying tobacco material based on its spectral features. Two hundred and seventy-eight "blond" commercial cigarette products from European countries were subjected to NIRS over a broad range of wavelengths (400-2500 nm). A Hierarchical Cluster Analysis (HCA) performed on chemical variables measured in the tobacco blends allowed four distinct groups to be defined among the commercial products. Several well-known supervised pattern recognition algorithms were then applied to the spectral data: Linear Discriminant Analysis, either after a data compression step by Principal Component Analysis (LDA/PCs) or on wavelengths selected by stepwise discriminant analysis (LDA/SW); Discriminant Partial Least Squares (DPLS); and Soft Independent Modeling of Class Analogy (SIMCA). The performance of these multivariate models, in combination with a variety of wavelength regions and data pre-treatments, was evaluated by comparing the classification predictions with the predefined chemical categories. LDA using factor scores calculated from the near-infrared region (1100-2500 nm) discriminated the groups more accurately than LDA on selected wavelengths or the DPLS approach, whereas the SIMCA algorithm showed weak discriminating power. The work reported in this paper confirms that near-infrared spectroscopy coupled with an appropriate chemometric procedure can reveal the identity of a range of commercial blends with a high degree of confidence.
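To make the LDA/PCs idea concrete, the following is a minimal sketch rather than the procedure used in the study. It assumes NumPy is available and replaces the real cigarette spectra with synthetic data: four hypothetical classes, each given a Gaussian absorption band at a different position on a 1100-2500 nm grid. All function names, the number of retained components, and the noise level are illustrative assumptions.

```python
import numpy as np


def make_synthetic_spectra(n_per_class=30, n_wavelengths=200, n_classes=4, seed=0):
    """Hypothetical stand-in for NIR spectra: one Gaussian band per class plus noise."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(1100, 2500, n_wavelengths)   # wavelength axis in nm
    centers = np.linspace(1400, 2200, n_classes)    # assumed band positions
    X, y = [], []
    for k, c in enumerate(centers):
        band = np.exp(-0.5 * ((grid - c) / 60.0) ** 2)
        X.append(band + 0.05 * rng.standard_normal((n_per_class, n_wavelengths)))
        y.append(np.full(n_per_class, k))
    return np.vstack(X), np.concatenate(y)


def fit_pca(X, n_components):
    """Return the column mean and the first n_components loading vectors (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T


def fit_lda(scores, y):
    """Fit LDA with a pooled within-class covariance matrix on PCA scores."""
    classes = np.unique(y)
    means = np.array([scores[y == k].mean(axis=0) for k in classes])
    priors = np.array([np.mean(y == k) for k in classes])
    centered = scores - means[np.searchsorted(classes, y)]
    pooled = centered.T @ centered / (len(y) - len(classes))
    return classes, means, priors, np.linalg.inv(pooled)


def predict_lda(scores, model):
    """Assign each sample to the class with the largest linear discriminant score."""
    classes, means, priors, precision = model
    linear = scores @ precision @ means.T
    const = -0.5 * np.sum(means @ precision * means, axis=1) + np.log(priors)
    return classes[np.argmax(linear + const, axis=1)]


def run_demo(n_components=5):
    X, y = make_synthetic_spectra()
    train = np.arange(len(y)) % 2 == 0              # simple alternating hold-out split
    mean, loadings = fit_pca(X[train], n_components)
    model = fit_lda((X[train] - mean) @ loadings, y[train])
    predicted = predict_lda((X[~train] - mean) @ loadings, model)
    return float(np.mean(predicted == y[~train]))


accuracy = run_demo()
print(f"hold-out classification accuracy: {accuracy:.2f}")
```

Compressing the spectra to a few principal component scores before LDA keeps the pooled covariance matrix small and invertible, which is the usual motivation for the compression step when the number of wavelengths exceeds the number of samples.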