Publication Summary
Standard least-squares (LS) regression is useful for modelling when the explanatory variables X, of size n × p, are approximately independent, and its parameters can only be estimated uniquely if the number of variables p does not exceed the number of samples n. However, this 'ideal' situation may not hold in certain experiments. In chemometrics, calibration of near-infrared instruments produces data with thousands of variables but from a limited number of samples. This is exacerbated by the fact that the variables are highly correlated, with correlation coefficients ranging from 0.96 to 1. With these characteristics, LS either fails because n < p or is unstable because of the high variance of the estimates. Ridge regression (RR) is one of several methods that can be employed to deal with this problem. RR works in such situations because it regularises the parameter estimation, in contrast to LS. This project will explore the use of RR with applications to chemometrics data.
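As a rough illustration of why RR remains usable when n < p, the sketch below compares the rank of a synthetic, highly correlated design matrix with the closed-form ridge estimate. Everything in it is an assumption made for illustration and not part of the project: the sizes n = 20 and p = 200, the simulated data, the penalty values, and the helper name `ridge_fit`.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Hypothetical data: far more variables than samples (n < p).
rng = np.random.default_rng(0)
n, p = 20, 200

# Every column is one shared signal plus small noise, so columns are highly correlated.
base = rng.normal(size=(n, 1))
X = base + 0.05 * rng.normal(size=(n, p))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n)

# The rank of X is at most n, so X'X (p x p) is singular and LS has no unique solution.
rank_X = np.linalg.matrix_rank(X)

# Adding lam * I makes the system positive definite, so it can be solved for any lam > 0.
beta_small_penalty = ridge_fit(X, y, lam=1.0)
beta_large_penalty = ridge_fit(X, y, lam=100.0)

print("rank of X:", rank_X, "with p =", p)
print("coefficient norm, lam=1:  ", np.linalg.norm(beta_small_penalty))
print("coefficient norm, lam=100:", np.linalg.norm(beta_large_penalty))
```

A larger penalty shrinks the coefficients more strongly towards zero, trading some bias for lower variance. In practice the penalty would be chosen from the data, for example by cross-validation, rather than fixed in advance as it is here.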
CAER Authors
Dr. Arief Gusnanto
University of Leeds