Mpho MAFATA1,2, Jeanne BRAND1, and Astrid BUICA1
1 South African Grape and Wine Research Institute, Department of Viticulture and Oenology, Stellenbosch University, South Africa
2 School for Data Science and Computational Thinking, Stellenbosch University, South Africa

Email contact: mafata[@]

AIM: Patterns in data obtained from wine chemical and sensory evaluations are difficult to infer using classical statistics. Pattern recognition can be resolved by coupling data fusion with machine learning techniques, possibly leading to new hypotheses being formed. This study demonstrates the applicability of two pattern recognition approaches using a case study involving Chenin Blanc wines (recently bottled and after two years of storage) from young (<35 years) and old (>35 years) vines.

METHODS: Sensory (sorting; Mafata et al. 2020) and chemical (NMR: nuclear magnetic resonance, HRMS: high-resolution mass spectrometry, and UV-Vis: ultraviolet-visible spectrophotometry) data were collected for the recently bottled and aged (two years in the bottle) wines. Data sets were combined using multiple factor analysis (MFA). Exploratory unsupervised cluster analysis was performed by agglomerative hierarchical clustering (AHC) and fuzzy k-means (Bezdek 1981). Optimal cluster conditions were found for both methods, and the cophenetic coefficient was used to assess which method gave the most confident clustering.
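The AHC step and the cophenetic coefficient can be illustrated with a minimal sketch. It assumes the fused data are available as a samples-by-factors array of scores, uses synthetic numbers in place of the wine data, and compares several linkage rules on Euclidean distances; the abstract does not state which distance or linkage was used, so those choices and the helper name ahc_cophenetic are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, fcluster, linkage
from scipy.spatial.distance import pdist


def ahc_cophenetic(scores, methods=("single", "complete", "average", "ward")):
    """Return {linkage method: cophenetic correlation} for a samples x factors array.

    The cophenetic correlation compares the original pairwise distances with
    the heights at which each pair of samples is merged in the dendrogram;
    values closer to 1 mean the tree preserves the distances more faithfully.
    """
    distances = pdist(scores, metric="euclidean")  # condensed distance vector
    results = {}
    for method in methods:
        tree = linkage(distances, method=method)
        coefficient, _ = cophenet(tree, distances)
        results[method] = float(coefficient)
    return results


# Synthetic stand-in for fused factor scores (NOT the study data):
# two groups of ten samples described by three factors.
rng = np.random.default_rng(0)
scores = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(10, 3)),
    rng.normal(loc=10.0, scale=1.0, size=(10, 3)),
])

coefficients = ahc_cophenetic(scores)
best_method = max(coefficients, key=coefficients.get)

# Cut the average-linkage tree into two clusters for inspection.
labels = fcluster(linkage(pdist(scores), method="average"), t=2, criterion="maxclust")

for method, value in coefficients.items():
    print(f"{method:>8}: cophenetic correlation = {value:.3f}")
print("highest cophenetic correlation:", best_method)
```

On real data with highly similar samples, as reported in the results, the coefficients would be expected to be lower and the cluster assignments less stable across linkage rules than in this well-separated toy example.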

RESULTS: Because large data sets were fused, the models were very complex. There were no consistent clustering patterns when varying clustering conditions, signalling high similarity between samples. The samples could not confidently be distinguished from one another even under the best optimized conditions. Although fuzzy k-means gave more confident clustering, it was still not sufficient for solving classification issues in this sample set.

CONCLUSIONS: Fuzzy k-means was better at resolving the natural grouping of samples. Coupled with data fusion, it could potentially lead to better pattern recognition, especially for oenological chemical and sensory data. The fuzzy approach should be explored, keeping in mind that it is more sensitive to small differences in the data than classical statistics.
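To show what graded cluster membership looks like in practice, the following is a minimal sketch of the fuzzy clustering algorithm described by Bezdek (1981), usually called fuzzy c-means and assumed here to be the method the abstract refers to. The data are synthetic, the fuzzifier value of 2 is a common default rather than a setting taken from the study, and the partition coefficient is used only as one possible measure of how crisp the memberships are.

```python
import numpy as np


def fuzzy_c_means(data, n_clusters, fuzzifier=2.0, n_iter=300, tol=1e-6, seed=0):
    """Return (centers, memberships); each row of memberships sums to 1."""
    rng = np.random.default_rng(seed)
    memberships = rng.random((data.shape[0], n_clusters))
    memberships /= memberships.sum(axis=1, keepdims=True)
    exponent = -2.0 / (fuzzifier - 1.0)
    for _ in range(n_iter):
        # Cluster centers are membership-weighted means of the samples.
        weights = memberships ** fuzzifier
        centers = (weights.T @ data) / weights.sum(axis=0)[:, None]
        # Memberships are inversely related to the distance to each center.
        distances = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        distances = np.fmax(distances, 1e-12)  # avoid division by zero
        inverse = distances ** exponent
        updated = inverse / inverse.sum(axis=1, keepdims=True)
        converged = np.abs(updated - memberships).max() < tol
        memberships = updated
        if converged:
            break
    return centers, memberships


def partition_coefficient(memberships):
    """Mean squared membership: 1/n_clusters (fully fuzzy) up to 1 (crisp)."""
    return float(np.mean(np.sum(memberships ** 2, axis=1)))


# Synthetic examples (NOT the study data): distinct groups vs. similar samples.
rng = np.random.default_rng(1)
separated = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 2)),
    rng.normal(10.0, 1.0, size=(20, 2)),
])
similar = rng.normal(0.0, 1.0, size=(40, 2))

_, u_separated = fuzzy_c_means(separated, n_clusters=2)
_, u_similar = fuzzy_c_means(similar, n_clusters=2)

pc_separated = partition_coefficient(u_separated)
pc_similar = partition_coefficient(u_similar)
print(f"partition coefficient, distinct groups: {pc_separated:.3f}")
print(f"partition coefficient, similar samples: {pc_similar:.3f}")
```

When samples are highly similar, each one receives intermediate membership in several clusters, so the partition coefficient moves toward its lower bound. That graded output is what makes the fuzzy approach informative where a hard assignment would hide the uncertainty.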


Bezdek, J.C. (1981) Pattern Recognition with Fuzzy Objective Function Algorithms, 1st ed. Springer US.

Mafata, M., Brand, J., Panzeri, V. and Buica, A. (2020) Investigating the Concept of South African Old Vine Chenin Blanc. South African J. Enol. Vitic.
