CORESTA Congress, Online, 2022, Smoke Science/Product Technology Groups, ST 26

Application of innovative big data techniques for improving data processing and accelerating risk assessment in new tobacco products

LARROQUE S.; SONNERAT D.; CHARRIÈRE M.; BECERRIL E.; MOLINA J.M.; OLIVA J.; ABDUSAMIEV K.
JT International SA, Geneva, Switzerland

Data analyses are often performed on a per-test basis by traditional data-processing applications that handle data collection, data summary, validation, and generation of final outputs. Because knowledge is usually dispersed, test results are interpreted in stand-alone reports and cannot be linked with other informative data to improve reviews and decision-making processes (e.g., by combining data from consumption, chemical, toxicological, and clinical assessments).

Our objective was to evaluate available big data solutions for setting up automatic storage of datasets, creating visualization tools, and running machine learning (ML) predictions in near real time.

Standard data management scripts were written to import and structure the raw datasets received from various providers so that they are ready for concatenation. A data lake, a single repository with unlimited storage space, is then used to centralise all the information. It is linked to our artificial intelligence (AI) business applications, where ML models are trained to produce predictions and feed dynamic dashboards. These tools are designed to handle very large amounts of data and allow continuous re-assessment as new records enter the lake.
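The import-and-structure step described above can be illustrated with a minimal sketch. It assumes two hypothetical providers whose files use different column names. The mappings, the common schema, and the function names are illustrative assumptions and are not taken from the authors' scripts.

```python
from typing import Dict, List

# Hypothetical mappings: provider-specific column name -> common schema name.
COLUMN_MAPS: Dict[str, Dict[str, str]] = {
    "provider_a": {"SampleID": "sample_id", "Analyte": "analyte",
                   "Result": "value", "Unit": "unit"},
    "provider_b": {"sample": "sample_id", "compound": "analyte",
                   "conc": "value", "units": "unit"},
}


def harmonise(rows: List[Dict[str, str]], provider: str) -> List[Dict[str, object]]:
    """Rename provider columns to the common schema and tag the data source."""
    mapping = COLUMN_MAPS[provider]
    structured = []
    for row in rows:
        record: Dict[str, object] = {common: row[raw] for raw, common in mapping.items()}
        record["value"] = float(row_value(record))  # numeric values for later analysis
        record["provider"] = provider               # keep provenance of each record
        structured.append(record)
    return structured


def row_value(record: Dict[str, object]) -> str:
    """Return the raw value as a stripped string before numeric conversion."""
    return str(record["value"]).strip()


def concatenate(batches: Dict[str, List[Dict[str, str]]]) -> List[Dict[str, object]]:
    """Harmonise each provider's batch and append it to one combined dataset."""
    combined: List[Dict[str, object]] = []
    for provider, rows in batches.items():
        combined.extend(harmonise(rows, provider))
    return combined


raw_batches = {
    "provider_a": [{"SampleID": "S1", "Analyte": "formaldehyde",
                    "Result": "1.5", "Unit": "ug/puff"}],
    "provider_b": [{"sample": "S2", "compound": "formaldehyde",
                    "conc": " 2.0 ", "units": "ug/puff"}],
}
combined = concatenate(raw_batches)
```

Once every provider's records share one schema, they can be appended to a central store without per-file handling, which is the property the concatenation step relies on.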

Two examples will be shown: the automation of pooled analyses as adverse events accumulate from clinical studies, and live statistical comparisons of vapour chemistry emission levels that integrate toxicological risk assessments.
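As an illustration of the first example, the sketch below keeps a running pooled summary of adverse events that is updated each time a study is added. It assumes a crude pooling (total subjects reporting an event divided by total subjects enrolled), which is a simplification chosen for illustration and not necessarily the authors' statistical method. The class name, event terms, and counts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PooledAdverseEvents:
    """Running pooled summary, updated whenever a new study is added."""
    subjects: int = 0
    events: Dict[str, int] = field(default_factory=dict)

    def add_study(self, n_subjects: int, event_counts: Dict[str, int]) -> None:
        """Add one study: its size and, per event term, the number of subjects affected."""
        self.subjects += n_subjects
        for term, count in event_counts.items():
            self.events[term] = self.events.get(term, 0) + count

    def incidence(self, term: str) -> float:
        """Crude pooled incidence: affected subjects / all subjects so far."""
        if self.subjects == 0:
            return 0.0
        return self.events.get(term, 0) / self.subjects


# Hypothetical counts, for illustration only.
pool = PooledAdverseEvents()
pool.add_study(50, {"headache": 5})
pool.add_study(150, {"headache": 5, "cough": 3})
headache_rate = pool.incidence("headache")
```

Because the pooled totals are stored and not recomputed from scratch, each new study only requires one update before the summary can be refreshed.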

This big data platform was customised to meet specific needs: accessing large volumes of data, enriching data, and running instant analyses, with the aim of optimising data-driven reporting and accelerating safety assessments of novel tobacco products. Furthermore, this shift away from traditional analytics platforms may change our ways of working, creating possibilities for new ways of thinking, new skills and new opportunities.