Lee, Sze Ming, Chen, Yunxiao (ORCID: 0000-0002-7215-2324) and Sit, Tony (2025) A latent variable approach to learning high-dimensional multivariate longitudinal data. Journal of the American Statistical Association. ISSN 0162-1459 (In Press)
Text (main_unblinded) - Accepted Version
Pending embargo until 1 January 2100. Available under License Creative Commons Attribution.
Abstract
High-dimensional multivariate longitudinal data, which arise when many outcome variables are measured repeatedly over time, are becoming increasingly common in social, behavioral and health sciences. We propose a latent variable model for drawing statistical inferences on covariate effects and predicting future outcomes based on high-dimensional multivariate longitudinal data. This model introduces unobserved factors to account for the between-variable and across-time dependence and assist the prediction. Statistical inference and prediction tools are developed under a general setting that allows outcome variables to be of mixed types and possibly unobserved for certain time points, for example, due to right censoring. A central limit theorem is established for drawing statistical inferences on regression coefficients. Additionally, an information criterion is introduced to choose the number of factors. The proposed model is applied to customer grocery shopping records to predict and understand shopping behavior.
| Item Type: | Article |
|---|---|
| Divisions: | Statistics |
| Subjects: | H Social Sciences > HA Statistics |
| Date Deposited: | 15 Dec 2025 11:45 |
| Last Modified: | 20 Dec 2025 00:04 |
| URI: | http://eprints.lse.ac.uk/id/eprint/130619 |