
Two-way deconfounder for off-policy evaluation in causal reinforcement learning

Yu, Shuguang, Fang, Shuxing, Peng, Ruixin, Qi, Zhengling, Zhou, Fan and Shi, Chengchun ORCID: 0000-0001-7773-2099 (2024) Two-way deconfounder for off-policy evaluation in causal reinforcement learning. In: 38th Annual Conference on Neural Information Processing Systems, 2024-12-10 - 2024-12-15, Vancouver Convention Center, Vancouver, Canada. (In Press)

Text (Two_way_Deconfounder_for_Off_policy_Evaluation_under_Unmeasured_Confounding) - Accepted Version
Pending embargo until 1 January 2100.


Abstract

This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning. We then develop a two-way deconfounder algorithm that uses a neural tensor network to learn the unmeasured confounders and the system dynamics simultaneously. From the learned model, we construct a model-based estimator for consistent policy value estimation. We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
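
For background only: the abstract does not state the model, so the following is the textbook form of the two-way fixed effects regression it cites as inspiration, not the paper's own assumption. The symbols are assumptions introduced here for illustration: $Y_{it}$ is the outcome for unit $i$ at time $t$, $X_{it}$ the observed covariates, $\alpha_i$ a unit-specific effect that is constant over time, $\lambda_t$ a time-specific effect shared by all units, and $\varepsilon_{it}$ an error term.

```latex
% Standard two-way fixed effects regression (background sketch, not taken from the paper)
\[
  Y_{it} = \alpha_i + \lambda_t + X_{it}^{\top}\beta + \varepsilon_{it},
  \qquad i = 1,\dots,N, \quad t = 1,\dots,T .
\]
```

The two additive terms $\alpha_i$ and $\lambda_t$ absorb unobserved heterogeneity along the unit and time dimensions, which is presumably the "two-way" structure the abstract alludes to. How the paper adapts this idea to unmeasured confounders is not described on this page.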

Item Type: Conference or Workshop Item (Paper)
Additional Information: © 2024 The Author(s)
Divisions: Statistics
Date Deposited: 21 Nov 2024 17:06
Last Modified: 21 Nov 2024 18:03
URI: http://eprints.lse.ac.uk/id/eprint/126146
