
On Independent Component Analysis and Supervised Dimension Reduction for Time Series


The main goal of this thesis has been to develop tools for recovering hidden structures, latent variables, or latent subspaces from multivariate, dependent time series data. The secondary goal has been to implement the methods as computationally efficient algorithms in an R package.

In Blind Source Separation (BSS) the goal is to find uncorrelated latent sources by transforming the observed data in an appropriate way. In Independent Component Analysis (ICA) the latent sources are assumed to be independent. The well-known ICA methods FOBI and JADE are generalized to multivariate time series in which the latent components exhibit stochastic volatility. In such time series the volatility cannot be regarded as constant over time, as there are often periods of high and periods of low volatility. The new methods are called gFOBI and gJADE. In addition, SOBI, a classic method that works well when the volatility is assumed to be constant, is given a variant called vSOBI that also works for time series with stochastic volatility.
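As a rough illustration of the starting point for these generalizations, the following is a minimal sketch of classic FOBI for independent and identically distributed data, written in Python with NumPy. It is not the thesis's gFOBI and not the tsBSS implementation; the function name `fobi` and the simulated sources are assumptions made for this example only. The sketch whitens the data and then rotates it with the eigenvectors of a fourth-moment matrix, which separates sources whose kurtosis values differ.

```python
import numpy as np


def fobi(x):
    """Classic FOBI for an (n, p) data matrix (illustrative sketch only).

    Returns the unmixing matrix w and the estimated sources xc @ w.T,
    where xc is the column-centered data.
    """
    xc = x - x.mean(axis=0)
    # Whitening with the symmetric inverse square root of the covariance.
    evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = xc @ whitener
    # Fourth-moment matrix: average of ||z_i||^2 * z_i z_i^T.
    r2 = np.sum(z ** 2, axis=1)
    b = (z * r2[:, None]).T @ z / z.shape[0]
    # Its eigenvectors give the remaining rotation.
    _, u = np.linalg.eigh(b)
    w = u.T @ whitener
    return w, xc @ w.T


# Hypothetical demo data: three sources with distinct kurtosis values.
rng = np.random.default_rng(0)
n = 50_000
sources = np.column_stack([
    rng.uniform(-1.0, 1.0, n),
    rng.normal(size=n),
    rng.laplace(size=n),
])
mixing = rng.normal(size=(3, 3))
observed = sources @ mixing.T

w, estimated = fobi(observed)
# Absolute correlations between true sources (rows) and estimates (columns).
corr = np.abs(np.corrcoef(sources.T, estimated.T)[:3, 3:])
print(corr.round(2))
```

Each row of `corr` should contain one entry close to 1, since the sources are recovered only up to order and sign. The sketch ignores all serial dependence, which is exactly the information the time series methods described above are designed to use.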

In dimension reduction the idea is to transform the data into a new coordinate system, where the components are uncorrelated or even independent, and then keep only some of the transformed variables in such a way that we do not lose too much of the important information in the data. The aforementioned BSS methods can be used in unsupervised dimension reduction, where all the variables or time series have the same role.

In supervised dimension reduction the relationship between a response and the predictor variables needs to be considered as well. The well-known supervised dimension reduction methods for independent and identically distributed data, SIR and SAVE, are generalized to time series data. The resulting methods, TSIR and TSAVE, are introduced and shown to work well for time series, as they also use information on the past values of the predictor time series. In addition, TSSH, a hybrid of TSIR and TSAVE, is introduced. All the methods developed in this thesis have also been implemented in the R package tsBSS.
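For orientation, here is a minimal sketch of classic SIR (sliced inverse regression) for independent and identically distributed data, again in Python with NumPy. It is not TSIR and not the tsBSS implementation, since it uses no lagged values of the predictors; the function name `sir`, the slice count, and the simulated model are assumptions made for this example. The sketch standardizes the predictors, slices the observations by the ordered response, and takes the leading eigenvectors of the weighted covariance of the slice means.

```python
import numpy as np


def sir(x, y, n_slices=10, n_directions=1):
    """Classic SIR for (n, p) predictors x and response y (sketch only).

    Returns a (p, n_directions) matrix of estimated directions
    expressed in the original predictor coordinates.
    """
    xc = x - x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = xc @ whitener
    # Slice observations by the ordered response and average z per slice.
    order = np.argsort(y)
    p = x.shape[1]
    m = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        mean_h = z[idx].mean(axis=0)
        m += (len(idx) / len(y)) * np.outer(mean_h, mean_h)
    # Leading eigenvectors of m, mapped back to the original scale.
    _, vecs = np.linalg.eigh(m)
    top = vecs[:, ::-1][:, :n_directions]
    return whitener @ top


# Hypothetical demo: the response depends on x through one direction.
rng = np.random.default_rng(1)
n_obs = 5_000
x = rng.normal(size=(n_obs, 5))
beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
index = x @ beta
y = index + 0.5 * index ** 3 + 0.2 * rng.normal(size=n_obs)

direction = sir(x, y)[:, 0]
direction /= np.linalg.norm(direction)
alignment = abs(direction @ beta)
print(round(alignment, 3))
```

An `alignment` value near 1 indicates that the estimated direction spans nearly the same subspace as `beta`. SIR relies on the slice means varying with the response, so it can miss symmetric dependence, which is the case SAVE addresses through slice covariances.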
