Functional principal component and factor analysis of spatially correlated data
While multivariate data analysis is concerned with data in the form of random vectors, functional data analysis goes a step further, focusing on data that are infinite-dimensional, such as curves, shapes and images. We focus on functional data that are measured over time across multiple subjects. The first part of the thesis focuses on spatially correlated functional data. This correlation is modeled by correlating functional principal component scores. We propose a Spatial Principal Analysis by Conditional Expectation framework to explicitly estimate spatial correlations and reconstruct individual curves. This approach works even when the observed data per curve are extremely sparse. Assuming spatial stationarity, empirical between-curve correlations are calculated as ratios of eigenvalues of the smoothed covariance surface Cov(X_i(s), X_i(t)) and the smoothed cross-covariance surface Cov(X_i(s), X_j(t)). A parametric spatial correlation model is then fitted to the empirical correlations. Finally, principal component scores are estimated to reconstruct the sparsely observed curves. This framework can naturally accommodate arbitrary covariance structures, but computation is reduced enormously if the temporal and spatial components can be assumed separable. We propose hypothesis tests to examine the separability and isotropy of the spatial correlation. Simulation studies and applications to empirical data show that our framework improves curve reconstruction over the method in which curves are assumed to be independent. In addition, asymptotic properties of the estimates are discussed in detail. In the second part of this work, we present a new approach to factor rotation for functional data. This is achieved by rotating the functional principal components toward a predefined space of periodic functions designed to decompose the total variation into components that are nearly periodic and nearly aperiodic with a predefined period.
We show that the factor rotation can be obtained by calculating canonical correlations between appropriate spaces. Moreover, we demonstrate that our proposed rotations provide stable and interpretable results in the presence of highly complex covariance structures. This work is motivated by the goal of finding interpretable sources of variability in a gridded time series of vegetation index measurements obtained from remote sensing, and we demonstrate our methodology by applying factor rotation to these data.
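As a rough illustration of the rotation idea in the second part, the following Python sketch computes canonical correlations between the span of a few principal components and a space of periodic functions, then rotates the components accordingly. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the functions are discretized on a uniform grid, the periodic space is taken to be a small Fourier basis, and the names `periodic_basis` and `rotate_toward_periodic` as well as the toy data are hypothetical.

```python
import numpy as np


def periodic_basis(t, period, n_harmonics):
    """Sine/cosine harmonics of the given period, evaluated on the grid t.

    Assumption: a small Fourier basis stands in for the "predefined space
    of periodic functions"; the thesis may define this space differently.
    """
    cols = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / period
        cols.append(np.sin(angle))
        cols.append(np.cos(angle))
    return np.column_stack(cols)


def rotate_toward_periodic(phi, basis):
    """Rotate the columns of phi toward the column space of basis.

    phi   : (m, K) discretized principal components
    basis : (m, J) discretized periodic functions, with K <= J assumed

    Returns the rotated components (m, K) and the canonical correlations,
    i.e. the cosines of the principal angles between the two subspaces.
    """
    q_phi, _ = np.linalg.qr(phi)    # orthonormal basis for span(phi)
    q_b, _ = np.linalg.qr(basis)    # orthonormal basis for span(basis)
    u, s, _ = np.linalg.svd(q_phi.T @ q_b, full_matrices=False)
    rotated = q_phi @ u             # orthogonal rotation within span(phi)
    return rotated, np.clip(s, 0.0, 1.0)


# Toy data (hypothetical): one periodic and one trend-like function,
# mixed by a 45-degree rotation so neither component is interpretable.
t = np.linspace(0.0, 4.0, 400, endpoint=False)
f1 = np.sin(2.0 * np.pi * t)        # exactly periodic with period 1
f2 = t - t.mean()                   # linear trend, not periodic
q, _ = np.linalg.qr(np.column_stack([f1, f2]))
theta = np.pi / 4
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
phi = q @ mix

rotated, corr = rotate_toward_periodic(
    phi, periodic_basis(t, period=1.0, n_harmonics=2)
)
print("canonical correlations:", corr)
```

In this toy setting, a canonical correlation close to 1 marks a rotated component that lies almost entirely in the periodic space, and a value close to 0 marks one that is nearly aperiodic. The rotated components span the same space as the original ones, so the total variation explained is unchanged.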