Wavenumbers with a high absolute correlation coefficient to total soluble solids (TSS), pH, or titratable acidity (TA) were selected from reflection Fourier transform near-infrared (FT-NIR) spectra of intact grape berries of the white variety Thompson Seedless. Multiple linear regression (MLR) and partial least squares (PLS) regression were then applied to the spectra to build models predicting TSS, pH, and TA. The squared Pearson correlation coefficient (R²) and the mean squared error (MSE) were used to evaluate prediction accuracy. TSS was predicted with R² = 0.972 and MSE = 0.094 using MLR, and with R² = 0.926 and MSE = 0.223 using PLS regression. For pH, MLR gave R² = 0.812 and MSE = 0.002, while PLS regression gave R² = 0.485 and MSE = 0.004. TA could be predicted only from the second derivatives of the spectra: MLR gave R² = 0.745 and MSE = 0.076, while PLS regression gave R² = 0.648 and MSE = 0.114. It was concluded that variable selection can greatly improve prediction accuracy, and that the appropriateness of each regression method depends on the structure of the spectral dataset and on the property being predicted.
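The selection-then-regression workflow summarized above can be illustrated with a minimal Python sketch. It is not the authors' code: the data are synthetic random numbers standing in for spectra, the function names (`select_wavenumbers`, `fit_mlr`, `predict_mlr`, `scores`) are hypothetical, and the number of retained wavenumbers `k` and the train/test split are assumptions, since the abstract does not state them. Only the MLR branch is shown; PLS regression and the second-derivative preprocessing are omitted.

```python
import numpy as np

def select_wavenumbers(X, y, k):
    """Indices of the k columns of X with the highest |Pearson r| to y.

    Assumes no column of X is constant (otherwise r is undefined).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = (Xc * yc[:, None]).sum(axis=0) / denom
    return np.sort(np.argsort(-np.abs(r))[:k])

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]

def scores(y_true, y_pred):
    """Squared Pearson correlation (R^2) and mean squared error (MSE)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    mse = np.mean((y_true - y_pred) ** 2)
    return r ** 2, mse

# Synthetic stand-in for spectra: 150 "berries" x 50 "wavenumbers".
# The target depends on three columns only (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 50))
y = X[:, [5, 20, 40]] @ np.array([1.5, -2.0, 1.5]) + rng.normal(scale=0.1, size=150)

# Select variables and fit on the training part only, then score on held-out data.
train, test = slice(0, 100), slice(100, None)
selected = select_wavenumbers(X[train], y[train], k=3)
coef = fit_mlr(X[train][:, selected], y[train])
r2, mse = scores(y[test], predict_mlr(coef, X[test][:, selected]))
print("selected columns:", selected, "R2:", round(r2, 3), "MSE:", round(mse, 3))
```

Selecting variables on the training samples only, as done here, keeps the held-out scores from being inflated by the selection step. Whether the study followed the same protocol is not stated in the abstract.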
C. Chariskou, C. Bazinas, A. J. Daniels, U. L. Opara, H. H. Nieuwoudt, V. G. Kaburlasos, “Variable selection for the prediction of TSS, pH and TA of intact berries of Thompson seedless grapes from their NIR reflection”, 30th International Conference on Software, Telecommunications and Computer Networks (SoftCOM 2022), Split, Croatia, 22-24 September 2022.