The Growing Curvilinear Component Analysis (GCCA) Neural Network

Big, high-dimensional data is becoming a challenging field of research. Many techniques exist for inferring information from such data. However, because of the curse of dimensionality, dimensionality reduction (DR) of the data is a necessary step.

DR can be performed by linear and nonlinear algorithms. In general, linear algorithms are faster because they carry a lower computational burden. A related problem is dealing with time-varying high-dimensional data, where the time dependence is due to a nonstationary data distribution. Data stream algorithms are not able to project into lower-dimensional spaces. Indeed, only linear projections, such as principal component analysis (PCA), are used in real time, while nonlinear techniques need the whole database (offline).

The Growing Curvilinear Component Analysis (GCCA) neural network addresses this problem. It has a self-organized incremental architecture that adapts to the changing data distribution, and it simultaneously performs data quantization and projection by using Curvilinear Component Analysis (CCA), a nonlinear distance-preserving reduction technique. This is achieved by introducing the ideas of the “seed”, a pair of neurons that colonizes the input domain, and the “bridge”, a novel kind of edge in the manifold graph that signals data nonstationarity.
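
To make the notion of distance preservation concrete, the following is a minimal sketch of the stress function that classical CCA minimizes: the squared mismatch between input-space and projected-space distances, weighted by a neighborhood function of the projected distance. It illustrates offline CCA only, not the GCCA network itself. The names `pairwise_distances`, `cca_stress`, and `lam` are hypothetical, and the step-shaped weighting (1 when the projected distance is at most `lam`, 0 otherwise) is an assumption here, being one common choice among several.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def cca_stress(X, Y, lam):
    """Stress 0.5 * sum_{i != j} (Dx_ij - Dy_ij)^2 * F(Dy_ij).

    X   : (n, D) points in the input space
    Y   : (n, d) projected points, d < D
    lam : neighborhood radius in the projected space (assumed step weighting)
    """
    Dx = pairwise_distances(X)
    Dy = pairwise_distances(Y)
    # Step weighting: only pairs that are close in the projection count.
    weight = (Dy <= lam).astype(float)
    np.fill_diagonal(weight, 0.0)  # exclude i == j terms
    return 0.5 * float(np.sum((Dx - Dy) ** 2 * weight))


if __name__ == "__main__":
    # Three points lying in the z = 0 plane of a 3-D space.
    X = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    Y_exact = X[:, :2]          # dropping z preserves every distance
    Y_scaled = 2.0 * X[:, :2]   # doubling every distance distorts the map
    print(cca_stress(X, Y_exact, lam=10.0))
    print(cca_stress(X, Y_scaled, lam=10.0))
```

Because the weighting depends on the projected distances rather than the input distances, a small `lam` restricts the stress to pairs that are close in the projection, which is what lets short distances be preserved in preference to long ones.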