Yu, Shuguang, Fang, Shuxing, Peng, Ruixin, Qi, Zhengling, Zhou, Fan and Shi, Chengchun ORCID: 0000-0001-7773-2099 (2024) Two-way deconfounder for off-policy evaluation in causal reinforcement learning. In: 38th Annual Conference on Neural Information Processing Systems, 2024-12-10 - 2024-12-15, Vancouver Convention Center, Vancouver, Canada. (In Press)
Text (Two_way_Deconfounder_for_Off_policy_Evaluation_under_Unmeasured_Confounding) - Accepted Version. Embargoed until 1 January 2100.
Abstract
This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning. We then develop a two-way deconfounder algorithm that devises a neural tensor network to simultaneously learn both the unmeasured confounders and the system dynamics; based on these, a model-based estimator can be constructed for consistent policy value estimation. We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
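For intuition, below is a minimal, hypothetical sketch (not the authors' released code) of the modeling idea the abstract describes. It assumes, by analogy with two-way fixed effects models, that the unmeasured confounder for trajectory i at time t is captured by a trajectory-specific embedding u_i and a time-specific embedding v_t, combined through a bilinear (neural tensor) interaction; all class and parameter names here are illustrative.

```python
import torch
import torch.nn as nn

class TwoWayDeconfounder(nn.Module):
    """Sketch: two-way confounder embeddings feeding a learned dynamics model."""

    def __init__(self, n_traj, horizon, state_dim, action_dim, emb_dim=8, hidden=64):
        super().__init__()
        self.u = nn.Embedding(n_traj, emb_dim)   # trajectory-specific effect u_i
        self.v = nn.Embedding(horizon, emb_dim)  # time-specific effect v_t
        # Bilinear layer stands in for the neural tensor interaction between u_i and v_t.
        self.interact = nn.Bilinear(emb_dim, emb_dim, emb_dim)
        # Dynamics head: predicts next state and reward from (state, action, confounder).
        self.dynamics = nn.Sequential(
            nn.Linear(state_dim + action_dim + emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim + 1),
        )

    def forward(self, traj_idx, time_idx, state, action):
        # Inferred confounder for (trajectory, time) pair.
        z = self.interact(self.u(traj_idx), self.v(time_idx))
        out = self.dynamics(torch.cat([state, action, z], dim=-1))
        next_state, reward = out[..., :-1], out[..., -1]
        return next_state, reward
```

Under this sketch, the embeddings and dynamics head would be fit jointly by minimizing prediction error on the logged transitions, after which the learned dynamics could be rolled out under the target policy to form the model-based policy value estimate.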
| Item Type | Conference or Workshop Item (Paper) |
|---|---|
| Additional Information | © 2024 The Author(s) |
| Divisions | Statistics |
| Date Deposited | 21 Nov 2024 17:06 |
| Last Modified | 21 Nov 2024 18:03 |
| URI | http://eprints.lse.ac.uk/id/eprint/126146 |