T-fold sequential-validation technique for out-of-distribution generalization with financial time series data
Date
2021-06
Author
Muñoz-Elguezábal, Juan F.
Sánchez-Torres, Juan D.
Description
The temporal structure of financial time series (FTS) data demands non-trivial considerations in the use of cross-validation (CV). This frequently used technique is based on statistical learning theory, which is founded on the assumption that training samples are i.i.d. Although there is progress in studying fundamental phenomena in certain learning methods, such as feature selection imbalance during the learning stage, it is currently widely accepted that there is no reason to expect good out-of-sample results from a learning process without such a strong assumption. In FTS, there are conditions under which sub-sampling the data overshadows the effect of non-deterministic relationships between features and the target variable across different samples. This effect remains unnoticed because of the use of the additivity property in the decomposition of objective functions for the learning process. Moreover, it reduces the relationship among samples to a particular operation, without information attribution. We present a technique that controls information leakage and decomposes the global probability distribution into local probability distributions, identifying each sample's contribution to the learning process and maintaining information sparsity, thereby relaxing the effects of the i.i.d. assumption. As a result, parametric stability is presented for exchange rate prediction using different predictive models.
ITESO, A.C.
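To make the idea of leakage-controlled, time-ordered validation in the abstract more concrete, here is a minimal Python sketch of contiguous sequential folds. It is an illustration only and not the authors' implementation: the function name `sequential_folds`, the equal-sized blocks, and the choice to train on one block and validate on the next are all assumptions made for this example.

```python
from typing import Iterator, List, Tuple


def sequential_folds(
    n_samples: int, n_folds: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, val_idx) pairs over contiguous, time-ordered blocks.

    Hypothetical helper: each pair trains on block t-1 and validates on
    block t, so no validation sample ever precedes a training sample.
    """
    if n_folds < 2 or n_folds > n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")

    # Boundaries of the contiguous blocks, in chronological order.
    bounds = [i * n_samples // n_folds for i in range(n_folds + 1)]

    for t in range(1, n_folds):
        train_idx = list(range(bounds[t - 1], bounds[t]))
        val_idx = list(range(bounds[t], bounds[t + 1]))
        yield train_idx, val_idx


if __name__ == "__main__":
    for train_idx, val_idx in sequential_folds(n_samples=10, n_folds=5):
        print(train_idx, val_idx)
```

Because every fold is fitted and scored on its own local block, the per-fold results can be compared across time, which is one simple way to inspect how stable fitted parameters are from period to period.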