Pedro Melo-Pinto1,2*, Véronique Gomes1, Armando Fernandes3,4, Ana Mendes-Ferreira5,6
1 CITAB-Centre for the Research and Technology of Agro-Environmental and Biological Sciences, Universidade de Trás-os-Montes e Alto Douro, 5000-801, Vila Real, Portugal.
2 Departamento de Engenharias, Escola de Ciências e Tecnologia, Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal.
3 INOV-INESC Inovação, Rua Alves Redol, 9, 1000-029 Lisboa, Portugal
4 INESC-ID, Rua Alves Redol, 9, 1000-029 Lisboa, Portugal
5 BioISI - Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisbon, Campo Grande, 1749-016 Lisbon, Portugal.
6 Departamento de Biologia e Ambiente, Escola de Ciências da Vida e Ambiente, Universidade de Trás-os-Montes e Alto Douro, 5000-801 Vila Real, Portugal.
*Corresponding author: firstname.lastname@example.org
The wine industry has been striving to achieve wine quality and consistency, which involves harvesting and selecting grapes at optimal maturity and according to the desired traits. Hyperspectral imaging (HSI) combined with machine learning (ML) algorithms has emerged as a promising, cost-effective alternative to traditional analytical methods for predicting important oenological parameters and assisting in critical harvesting decisions. However, the large amount of data generated by HSI, together with the large associated variability (grape variety, terroir), raises computational challenges for data-driven modelling, making the selection of models that best suit the problem under study and ensure generalization a cumbersome task.
In our work, the large database collected (>2000 samples from 2012 to date) allowed robust testing of the ML prediction models. Performance was assessed through n-fold cross-validation and, to evaluate generalization ability (GA), through independent test sets containing samples from vintages, varieties and growth conditions not used in training, thereby addressing the issue of natural variability. Our established models have successfully predicted the pH, sugar and anthocyanin levels of red grapes under lab conditions. The results suggest that, depending on the variety, it is possible to obtain models that generalize well.
Keywords: Hyperspectral imaging, Grape Berries, Prediction, Neural Networks, PLS, Deep Learning
Article based on the paper presented at the SIVE OENOPPIA Awards (12th edition of Enoforum; Vicenza, Italy, May 21-23, 2019)
The science of winemaking has evolved significantly at every stage of the production process, starting in the vineyard, where much of a wine's quality is defined. Besides phytosanitary status, the evaluation of grape quality is mainly associated with grape ripeness, based on the evolution of oenological parameters over time, which determines the optimal harvest time for the desired wine. Monitoring maturation poses problems related to the huge variability in grape composition, grape variety and terroir. This evaluation is usually done through classic physical and chemical methods performed off-line on a limited number of samples; these methods are time-consuming, costly and invasive, and they generate chemical waste.
Recent years have witnessed significant efforts by academic researchers and producers to develop innovative and less expensive approaches for faster, non-destructive, non-invasive and ultimately more sustainable grape maturity assessment. Spectroscopy coupled with digital imaging techniques, namely hyperspectral imaging (HSI), has emerged as a very attractive and viable alternative to classic techniques. This imaging technology, in reflectance mode, collects information about the intensity of light reflected by grapes as a function of wavelength (Gowen et al., 2007; Hall et al., 2002), simultaneously measuring thousands of points over a sample without requiring contact between the spectrometer/camera and the grape. Additionally, hyperspectral imaging allows a large number of samples to be acquired and grape ripeness to be assessed locally in the vineyard, which is an important added value for the industry. However, the large amount of data generated by this approach, which includes not only relevant but also much redundant information, raises computational challenges for data-driven modelling. Several multivariate and machine learning approaches have been proposed as powerful tools to handle such data, and their combination with spectroscopy has proven to be an objective and efficient methodology for predicting oenological parameters of grape berries (Cao et al., 2010; Chen et al., 2015; Cozzolino et al., 2004, 2006; Fadock et al., 2016; Fernandes et al., 2015; Ferrer-Gallego et al., 2011; Geraudie et al., 2009; Gomes et al., 2015, 2017a,b; González-Caballero et al., 2011; Larrain et al., 2008; Le Moigne et al., 2008; Nogales-Bueno et al., 2014; dos Santos Costa et al., 2019; Silva et al., 2018).
Inspecting the generalization ability of the approach with different vintages/varieties is fundamental to making the final methodology robust. If generalization cannot be achieved, the problem becomes more complex, since a new model must be trained every year for use in that particular year. Numerous works employ data from different vintages and varieties. However, the scientific literature on the generalization task is practically non-existent, and only a very few works (mostly ours) have trained models with grapes from one vintage and tested them with grapes from another (Fadock et al., 2016; Gomes et al., 2017a,b; Janik et al., 2007; Silva et al., 2018). Taking this into account, the current study focuses on two interrelated objectives: i) evaluating the performance of each developed method (partial least squares regression, PLSR; and neural networks, NNs) in generalizing, that is, in successfully predicting data from a different vintage and/or variety not employed in the training process; and ii) comparing the performance of the proposed machine learning algorithms with a deep learning algorithm (convolutional neural network, CNN), using data from several vintages (from 2012 to 2018) to train and test the models, addressing the issue of natural variability. Deep learning algorithms, which are emerging in computer science with excellent results in extracting complex patterns from data in a wide range of applications, can be an advantage in this prediction context.
This comparison is also relevant because the proposed learning algorithms rest on different fundamentals and therefore produce different models, and it is impossible to know in advance which algorithm will suit the problem best.
Herein, we present the application of these approaches towards the assessment of grape ripeness focusing on sugar content, an essential maturity index.
Material and Methods
The overall procedure followed in this study is outlined in Figure 1, and its steps are briefly described below.
Figure 1. Graphical representation of the procedure adopted in the current work.
This study focused on the Touriga Franca (TF), Touriga Nacional (TN) and Tinta Barroca (TB) grape varieties, harvested from vineyards of Quinta do Bomfim, Pinhão, Portugal, property of Symington Family Estates, one of the world’s largest producers of Port wine. Grape samples of TF were collected in the 2012, 2013, 2014, 2016, 2017 and 2018 vintages, while the TN and TB varieties were harvested in the 2013, 2014, 2016 and 2017 vintages. A total of 2682 grape bunches were collected between the beginning of veraison and maturity, from three different vineyard locations. Each sample, comprising six or 12 grape berries randomly collected from a single bunch with their pedicels attached, was imaged and then frozen at -18 °C.
Hyperspectral data were acquired with an imaging system comprising a hyperspectral camera and a lighting unit. The camera consisted of a JAI Pulnix (JAI, Yokohama, Japan) black and white camera and a Specim Imspector V10E spectrograph (Specim, Oulu, Finland). Lighting was provided by a lamp holder of 300 × 300 × 175 mm³ (length × width × height) that held four 20 W, 12 V halogen lamps and two 40 W, 220 V blue reflector lamps (Spotline, Philips, Eindhoven, Netherlands). The lamps were powered by direct current (DC) power supplies to avoid light flickering, and the reflector lamps were powered at only 110 V to reduce lighting and prevent camera saturation. The acquired images had 1040 × 1392 pixels. The 1040 pixels correspond to the measured wavelength channels, ranging between 380 and 1028 nm, with a width of approximately 0.6 nm per channel. The 1392 pixels correspond to the spatial dimension across the samples, covering a width of 110 mm. The distance between the camera and the sample base was 420 mm, and the camera was controlled with Coyote software from JAI. After imaging, the grape berries were identified and extracted using image segmentation methods.
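As a quick check of the acquisition geometry described above, the channel spacing and the spatial sampling can be worked out from the quoted figures. The sketch below assumes that the 1040 channels are evenly spaced between 380 and 1028 nm and that the 1392 spatial pixels cover the 110 mm width uniformly; neither assumption is stated in the text, and the function and variable names are illustrative only.

```python
# Assumption: linear wavelength spacing and uniform spatial sampling.
N_CHANNELS = 1040
WL_MIN_NM, WL_MAX_NM = 380.0, 1028.0
N_SPATIAL_PIXELS = 1392
FIELD_WIDTH_MM = 110.0


def channel_wavelengths(n=N_CHANNELS, lo=WL_MIN_NM, hi=WL_MAX_NM):
    """Return the centre wavelength (nm) of each channel under linear spacing."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


wavelengths = channel_wavelengths()
channel_step_nm = wavelengths[1] - wavelengths[0]
mm_per_pixel = FIELD_WIDTH_MM / N_SPATIAL_PIXELS

print(f"channel spacing: {channel_step_nm:.3f} nm")
print(f"spatial sampling: {mm_per_pixel:.3f} mm per pixel")
```

Under these assumptions the spacing comes out slightly above 0.6 nm per channel, in line with the approximate width quoted in the text, and each spatial pixel covers a little under 0.1 mm of the sample.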
To extract the information content correctly, compensating for the effects caused by the illumination and the focal plane of the hyperspectral camera, it was necessary to compute reflectance from the acquired hyperspectral images. To minimize measurement noise, 32 hyperspectral images were accumulated for each grape berry for the intensity of light reflected by the grapes (GI), the intensity of light coming from the white reference (SI) and the associated dark current signal (DI). The reflectance measurements for each sample were carried out along the berry “equator”, considering the pedicel as the pole, and for two berry rotations. To create a single reflectance spectrum, all berry points were averaged over the spatial dimension and the rotation positions. In addition, to eliminate fluctuations in the measured light intensities due to grape berry size and curvature, each spectrum was normalized so that its minimum and maximum values corresponded to zero and one, respectively.
To build the prediction models, the true sugar contents were assessed by conventional chemical analysis. The grapes were defrosted and crushed, and ºBrix was then determined by refractometry (Organisation Internationale de la Vigne et du Vin, 2006).
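The reflectance and normalization steps described above can be sketched as follows. The text names the three measured signals (GI, SI, DI) but does not give the formula; the sketch assumes the usual flat-field correction R = (GI - DI) / (SI - DI), applied per wavelength channel. The function names and the toy intensities are illustrative and are not taken from the study, and the accumulation of 32 images is not modelled.

```python
def reflectance(gi, si, di):
    """Per-channel reflectance from grape (GI), white-reference (SI) and dark (DI) signals.

    Assumed formula: R = (GI - DI) / (SI - DI).
    """
    out = []
    for g, s, d in zip(gi, si, di):
        denom = s - d
        if denom <= 0:
            raise ValueError("white reference must exceed dark current")
        out.append((g - d) / denom)
    return out


def mean_spectrum(spectra):
    """Average several spectra (berry points and rotations) channel by channel."""
    n = len(spectra)
    return [sum(channel) / n for channel in zip(*spectra)]


def min_max_normalize(spectrum):
    """Rescale one spectrum so that its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        raise ValueError("flat spectrum cannot be normalized")
    return [(v - lo) / (hi - lo) for v in spectrum]


# Toy data: 4 channels, two measured points on one berry (made-up values).
di = [10.0, 10.0, 10.0, 10.0]
si = [210.0, 210.0, 210.0, 210.0]
points_gi = [[50.0, 90.0, 130.0, 170.0], [70.0, 110.0, 150.0, 190.0]]

point_reflectances = [reflectance(gi, si, di) for gi in points_gi]
berry_spectrum = min_max_normalize(mean_spectrum(point_reflectances))
print(berry_spectrum)
```

Because the normalization is applied per spectrum, two berries that differ only by an overall intensity scale or offset end up with the same normalized spectrum, which is the effect the text attributes to this step.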
Three different predictive approaches (PLSR, NN and CNN) were considered in this study and used to build the models. PLSR is a multivariate statistical method first introduced in the field of chemometrics. It is based on the creation of new variables, called latent variables (LV), corresponding to the projection of the independent (X) and dependent (Y) sets of variables onto new directions that maximize the covariance between X and Y (Wold et al., 2001). Herein, the number of latent variables was selected by minimizing the root mean squared error (RMSE) obtained by n-fold cross-validation of the PLSR model.
Neural networks belong to the machine learning area and involve a large number of mathematical processors working in parallel that process information in a way that attempts to replicate what the human brain does (Bishop, 1995). This method is able to learn from the patterns that constitute the training set. In our work, a feedforward multilayer perceptron was employed, composed of several layers of neurons linked to each other by weights that store the knowledge acquired during the learning process. During training, the weights were computed iteratively so that, for each input sample, the differences between the outputs of the network's output neurons and the true contents were minimized. Each iteration of weight adjustment is called an epoch. The NN was trained using the Levenberg-Marquardt algorithm, a backpropagation approach with a variable learning rate that is more efficient than conventional ones. Training was repeated for 100 different randomly generated sets of initial weights and was stopped at the epoch with the lowest mean squared error on the validation patterns (known as early stopping).
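The training protocol described above (random restarts, with each run kept at the epoch of lowest validation error) can be illustrated with a deliberately small stand-in. The sketch below is not the authors' network: it fits a one-variable linear model by plain gradient descent instead of a multilayer perceptron trained with Levenberg-Marquardt, and the data, learning rate, epoch limit and patience value are all hypothetical. Only the control flow (early stopping plus selection of the best of 100 restarts) mirrors the text.

```python
import random


def fit_once(train, valid, seed, max_epochs=200, lr=0.05, patience=10):
    """Fit y = w*x + b by gradient descent from random initial weights.

    Returns (best validation MSE, epoch where it occurred, weights at that epoch).
    """
    rng = random.Random(seed)
    w, b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    best_mse, best_epoch, best_params = float("inf"), 0, (w, b)
    for epoch in range(1, max_epochs + 1):
        # One epoch: a full-batch gradient step on the training MSE.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
        w, b = w - lr * grad_w, b - lr * grad_b
        val_mse = sum((w * x + b - y) ** 2 for x, y in valid) / len(valid)
        if val_mse < best_mse:
            best_mse, best_epoch, best_params = val_mse, epoch, (w, b)
        elif epoch - best_epoch >= patience:
            break  # early stopping: no validation improvement for `patience` epochs
    return best_mse, best_epoch, best_params


def fit_with_restarts(train, valid, n_restarts=100):
    """Repeat training from different random weights and keep the best run."""
    runs = [fit_once(train, valid, seed) for seed in range(n_restarts)]
    return min(runs, key=lambda run: run[0])


# Made-up data: y = 2x + 1 plus small noise, split into training and validation.
data_rng = random.Random(0)
samples = [(k / 20, 2.0 * (k / 20) + 1.0 + data_rng.gauss(0.0, 0.05)) for k in range(40)]
train, valid = samples[::2], samples[1::2]

best_mse, best_epoch, (w, b) = fit_with_restarts(train, valid)
print(f"best validation MSE {best_mse:.4f} at epoch {best_epoch}: w={w:.2f}, b={b:.2f}")
```

Keeping the weights from the best validation epoch, rather than the final epoch, is what prevents a run from being judged on an overfitted state.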
The hyperbolic tangent (a non-linear function) and the identity (a linear function) were the activation functions used for the hidden and output neurons, respectively. Because the input dimensionality of a neural network should be as small as possible to provide good accuracy, principal component analysis (PCA) was applied to reduce the large dimension of the spectra. N-fold cross-validation was also adopted for the development of the NN models.
CNNs, also abbreviated as ConvNets, are deep learning approaches that have been applied in different research fields dealing with images. They were initially developed for image classification problems, so they are mostly applied to images with at least two dimensions (2D) (Albawi et al., 2018). However, in our approach, the sets of spectral data resulting from the HSI procedure are one-dimensional (1D), which implies using a 1D CNN instead of the commonly employed 2D or 3D CNNs. The customized HyGrapeNet was built in Python using the Keras package (v2.2.4). It was fed with a one-dimensional input (1 x 1040), followed by two one-dimensional convolutional layers. Twenty 1D filters were used in each convolutional layer, with kernel sizes of eight and sixteen, using the rectified linear unit (ReLU) as the non-linear activation function. The outputs of the convolutional kernels were flattened, and a dropout layer was added to avoid overfitting before connecting to a fully connected dense layer with a ReLU activation function. The output layer was a single dense neuron with a linear activation function. Training was done using the Adadelta optimizer (Zeiler, 2012); the convolutional weights were initialized to random values and computed iteratively for 300 epochs. The batch size was set to 64. To obtain the model, early stopping was included and the mean squared error was defined as the loss function.
PLSR and NN computations were conducted in the MATLAB R2018b environment, version 9.5 (MathWorks, Inc.), using in-house developed code.
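To give a sense of the sizes involved in the two convolutional layers described above, the following sketch propagates the 1 x 1040 input through them using the standard 1D convolution output-length formula. Several details are assumptions, because the text does not state them: unit stride, no padding, and the kernel size of eight belonging to the first layer and sixteen to the second. The dense layer is left out since its width is not given. This is shape arithmetic only, not an implementation of HyGrapeNet.

```python
def conv1d_output_length(n_in, kernel_size, stride=1, padding=0):
    """Output length of a 1D convolution (standard formula)."""
    return (n_in + 2 * padding - kernel_size) // stride + 1


def conv1d_param_count(in_channels, filters, kernel_size):
    """Trainable weights plus one bias per filter."""
    return in_channels * kernel_size * filters + filters


length, channels = 1040, 1       # one spectrum: 1 x 1040
layers = [(20, 8), (20, 16)]     # (filters, kernel size); layer order is assumed
summary = []
for filters, kernel in layers:
    params = conv1d_param_count(channels, filters, kernel)
    length = conv1d_output_length(length, kernel)
    channels = filters
    summary.append((length, channels, params))

flattened_size = length * channels
for i, (n, c, p) in enumerate(summary, start=1):
    print(f"conv{i}: output {n} x {c}, {p} parameters")
print(f"flattened features: {flattened_size}")
```

Under these assumptions the flattened vector passed to the dropout and dense layers has about twenty thousand entries, so most of the trainable parameters would sit in the dense layer rather than in the convolutions.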
Results and Discussion
A summary of the maximum, minimum and mean values of the sugar content obtained by conventional techniques is shown in Table 1. These oenological values were used as reference values to create and test the proposed models. Table 1 shows the differences within a vintage and between harvest vintages, which may make it more difficult to predict new vintages not employed in building the models.
Table 1: Sampling characterization with their reference measurements for sugar contents.
The prediction models were built using the procedure described above. Table 2 presents the results obtained for each model, taking into account the generalization ability problem, i.e., testing on a vintage and/or variety that was not employed in building the model.
Table 2: N-fold cross-validation and external test results obtained for each model
For the models created with 2012 samples, both approaches show similar performance when the same variety (TF) is used to create and test the models (2012 or 2013 vintages). However, for the TN and TB 2013 vintages, which were used as external test sets for the [TF Model (2012)], the performance drops considerably. Nevertheless, the NN approach seems to be more accurate than PLSR. The same seems to occur with the second set of data used to build and test the models, [TF Model (2012+2013)] and [TF Test (2014)]. This can be explained by the differences within a vintage and between harvest vintages, as shown in Table 1. Regarding the last set used to test the generalization ability of the proposed models, [TF Model (2012+2013+2014+2016+2017)] and [TF Test (2018)], performance improves, with lower RMSE values in the external test set for both models.
Table 3 shows the results of comparing the performance of a neural network with that of a deep learning algorithm (HyGrapeNet). These results indicate that HyGrapeNet performs better than the NN for the specific case under evaluation.
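The external-test protocol behind these comparisons (train on some vintages of one variety, then report RMSE on a vintage or variety never seen in training) can be sketched as below. The sample records and ºBrix values are made up for illustration, and `MeanBaseline` is a placeholder standing in for the PLSR and NN models, which are not reproduced here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def select(samples, variety, vintages):
    """Keep the samples of one variety harvested in the given vintages."""
    return [s for s in samples if s["variety"] == variety and s["vintage"] in vintages]


class MeanBaseline:
    """Placeholder regressor: always predicts the mean training value."""

    def fit(self, targets):
        self.mean_ = sum(targets) / len(targets)
        return self

    def predict(self, n):
        return [self.mean_] * n


# Made-up reference values (not data from the study).
samples = [
    {"variety": "TF", "vintage": 2012, "brix": 18.0},
    {"variety": "TF", "vintage": 2012, "brix": 22.0},
    {"variety": "TF", "vintage": 2013, "brix": 19.0},
    {"variety": "TF", "vintage": 2013, "brix": 21.0},
    {"variety": "TN", "vintage": 2013, "brix": 23.0},
]

# Build the model with TF 2012 only, as in the [TF Model (2012)] case.
train = select(samples, "TF", {2012})
model = MeanBaseline().fit([s["brix"] for s in train])

# Evaluate on vintages/varieties that were never used for training.
external_rmse = {}
for variety, vintages in [("TF", {2013}), ("TN", {2013})]:
    test = select(samples, variety, vintages)
    y_true = [s["brix"] for s in test]
    external_rmse[(variety, 2013)] = rmse(y_true, model.predict(len(test)))

print(external_rmse)
```

Splitting by vintage and variety, rather than at random, is what makes the external RMSE a measure of generalization: a random split would mix berries from the same harvest into both sets.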
Table 3: Results obtained using all vintages of TF variety to create each model and an external dataset comprising vintages/varieties unseen during the training process to test each model.
One of the strengths of deep learning, and consequently of HyGrapeNet, is the ability to perform feature extraction automatically, a fundamental procedure in machine learning for improving model accuracy. As mentioned before, we applied PCA to reduce the input dimensionality of the NN approach (between 14 and 18 principal components were selected as NN input features), whereas for HyGrapeNet all 1040 wavelengths were used as input features. On the other hand, the big challenge of deep learning lies in the huge amount of data required to train an algorithm properly, since it first needs many examples to learn and tune the patterns before it can solve the problem. Even so, the results obtained with HyGrapeNet provide good insights into the relationship between our HSI methodology and the oenological parameters. To the best of our knowledge, our work is groundbreaking in the use of a deep learning algorithm to predict oenological parameters of grapes using hyperspectral imaging technology.
Conclusion and Future Perspectives
Hyperspectral imaging proved to be a powerful tool for fast, non-destructive and non-invasive evaluation of the quality of intact grapes. Its combination with chemometric, machine learning and deep learning approaches allows the development of valuable predictive models. The prediction errors obtained are within an acceptable range and the performances achieved were very satisfactory, leading us to believe in the robustness of our methodology.
Our results show that HyGrapeNet can be successfully applied to estimate the sugar content of wine grape berries, achieving better performance than the conventional neural network.
Overall, and considering the two main objectives, the results obtained are very promising for accurately measuring the sugar content of wine grapes during ripening, providing an alternative to the conventional methods. A future increase in the number of training samples will further improve the robustness and generalization ability of the prediction models, towards their use irrespective of the vintage year. Furthermore, the models are being tested and validated under real conditions, with images acquired in the vineyards. In addition, with the purpose of decreasing the equipment cost, different approaches for wavelength band selection are being developed to reduce the dimensionality of the data without losing predictive power.
The authors acknowledge financial support through projects: Deus ex Machina Project - Symbiotic technology for societal efficiency gains under NORTE-01-0145-FEDER-000026 and INTERACT project – “Integrated Research in Environment, Agro-Chain and Technology”, no. NORTE-01-0145-FEDER-000017, in its line of research entitled VitalityWINE, co-financed by the European Regional Development Fund (ERDF) through NORTE 2020 (North Regional Operational Program 2014/2020). Support of the Centre for the Research and Technology of Agro-Environmental and Biological Sciences (CITAB), supported by National Funds by FCT under the project UID/AGR/04033/2019, is also acknowledged.
VG is a recipient of a PhD grant funded by FCT-Portuguese Foundation for Science and Technology (PD/BD/128272/2017), under the Doctoral Programme “Agricultural Production Chains - from fork to farm” (PD/00122/2012).
Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2018). Understanding of a convolutional neural network. In Proceedings of 2017 International Conference on Engineering and Technology, ICET 2017. https://doi.org/10.1109/ICEngTechnol.2017.8308186
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford University Press, Inc.
Cao, F., Wu, D., & He, Y. (2010). Soluble solids content and pH prediction and varieties discrimination of grapes based on visible–near infrared spectroscopy. Computers and Electronics in Agriculture, 71, S15–S18.
Chen, S., Zhang, F., Ning, J., Liu, X., Zhang, Z., & Yang, S. (2015). Predicting the anthocyanin content of wine grapes by NIR hyperspectral imaging. Food Chemistry, 172, 788–793.
Cozzolino, D., Dambergs, R., Janik, L., Cynkar, W., & Gishen, M. (2006). Review: Analysis of grapes and wine by near infrared spectroscopy. Journal of Near Infrared Spectroscopy, 14(5), 279–289. Retrieved from http://dx.doi.org/10.1255/jnirs.679
Cozzolino, D., Esler, M., Dambergs, R., Cynkar, W., Boehm, D., Francis, I., & Gishen, M. (2004). Prediction of colour and pH in grapes using a diode array spectrophotometer (400–1100 nm). Journal of Near Infrared Spectroscopy, 12(2), 105–111. Retrieved from http://dx.doi.org/10.1255/jnirs.414
dos Santos Costa, D., Oliveros Mesa, N. F., Santos Freire, M., Pereira Ramos, R., & Teruel Mederos, B. J. (2019). Development of predictive models for quality and maturation stage attributes of wine grapes using vis-nir reflectance spectroscopy. Postharvest Biology and Technology, 150, 166–178. https://doi.org/10.1016/j.postharvbio.2018.12.010
Fadock, M., Brown, R. B., & Reynolds, A. G. (2016). Visible-Near Infrared Reflectance Spectroscopy for Nondestructive Analysis of Red Wine Grapes. American Journal of Enology and Viticulture, 67(1), 38–46. https://doi.org/10.5344/ajev.2015.15035
Fernandes, A. M., Franco, C., Mendes-Ferreira, A., Mendes-Faia, A., Costa, P. L. da, & Melo-Pinto, P. (2015). Brix, pH and anthocyanin content determination in whole Port wine grape berries by hyperspectral imaging and neural networks. Computers and Electronics in Agriculture, 115, 88–96. https://doi.org/10.1016/j.compag.2015.05.013
Ferrer-Gallego, R., Hernández-Hierro, J. M., Rivas-Gonzalo, J. C., & Escribano-Bailón, M. T. (2011). Determination of phenolic compounds of grape skins during ripening by NIR spectroscopy. LWT - Food Science and Technology, 44, 847–853.
Geraudie, V., Roger, J. M., Ferrandis, J. L., Gialis, J. M., Barbe, P., Bellon Maurel, V., & Pellenc, R. (2009). A revolutionary device for predicting grape maturity based on NIR spectrometry. In FRUTIC 09, 8th Fruit Nut and Vegetable Production Engineering Symposium (8 p.). Concepcion, Chile.
Gomes, V., Fernandes, A., Martins-Lopes, P., Pereira, L., Mendes Faia, A., & Melo-Pinto, P. (2017a). Characterization of neural network generalization in the determination of pH and anthocyanin content of wine grape in new vintages and varieties. Food Chemistry, 218, 40–46. https://doi.org/10.1016/j.foodchem.2016.09.024
Gomes, V. M., Fernandes, A. M., Faia, A., & Melo-Pinto, P. (2017b). Comparison of different approaches for the prediction of sugar content in new vintages of whole Port wine grape berries using hyperspectral imaging. Computers and Electronics in Agriculture, 140. https://doi.org/10.1016/j.compag.2017.06.009
González-Caballero, V., Pérez-Marín, D., López, M.-I., & Sánchez, M.-T. (2011). Optimization of NIR Spectral Data Management for Quality Control of Grape Bunches during On-Vine Ripening. Sensors, 11, 6109–6124.
Gowen, A. A., O’Donnell, C. P., Cullen, P. J., Downey, G., & Frias, J. M. (2007). Hyperspectral imaging – an emerging process analytical tool for food quality and safety control. Trends in Food Science & Technology, 18(12), 590–598. https://doi.org/10.1016/j.tifs.2007.06.001
Hall, A., Lamb, D. W., Holzapfel, B., & Louis, J. (2002). Optical remote sensing applications in viticulture - a review. Australian Journal of Grape and Wine Research, 8(1), 36–47. https://doi.org/10.1111/j.1755-0238.2002.tb00209.x
Janik, L. J., Cozzolino, D., Dambergs, R., Cynkar, W., & Gishen, M. (2007). The prediction of total anthocyanin concentration in red-grape homogenates using visible-near-infrared spectroscopy and artificial neural networks. Analytica Chimica Acta, 594(1), 107–118. https://doi.org/10.1016/j.aca.2007.05.019
Larrain, M., Guesalaga, A. R., & Agosin, E. (2008). A Multipurpose Portable Instrument for Determining Ripeness in Wine Grapes Using NIR Spectroscopy. IEEE Transactions on Instrumentation and Measurement, 57(2), 294–302.
Le Moigne, M., Dufour, E., Bertrand, D., Maury, C., Seraphin, D., & Jourjon, F. (2008). Front face fluorescence spectroscopy and visible spectroscopy coupled with chemometrics have the potential to characterise ripening of Cabernet Franc grapes. Analytica Chimica Acta, 621(1), 8–18.
Nogales-Bueno, J., Hernández-Hierro, J. M., Rodríguez-Pulido, F. J., & Heredia, F. J. (2014). Determination of technological maturity of grapes and total phenolic compounds of grape skins in red and white cultivars during ripening by near infrared hyperspectral image: A preliminary approach. Food Chemistry, 152(0), 586–591.
Organisation Internationale de la Vigne et du Vin. (2006). Recueil des méthodes internationales d’analyse des vins et des moûts. OIV.
Silva, R., Gomes, V., Mendes-Faia, A., & Melo-Pinto, P. (2018). Using support vector regression and hyperspectral imaging for the prediction of oenological parameters on different vintages and varieties of wine grape berries. Remote Sensing. https://doi.org/10.3390/rs10020312
Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: A basic tool of chemometrics. In Chemometrics and Intelligent Laboratory Systems (Vol. 58, pp. 109–130). https://doi.org/10.1016/S0169-7439(01)00155-1
Zeiler, M. D. (2012). ADADELTA: An Adaptive Learning Rate Method. CoRR, abs/1212.5701. Retrieved from http://arxiv.org/abs/1212.5701