Yu, Shuguang, Fang, Shuxing, Peng, Ruixin, Qi, Zhengling, Zhou, Fan and Shi, Chengchun ORCID: 0000-0001-7773-2099 (2024) Two-way deconfounder for off-policy evaluation in causal reinforcement learning. In: 38th Annual Conference on Neural Information Processing Systems, 2024-12-10 - 2024-12-15, Vancouver Convention Center, Vancouver, Canada. (In Press)
Li, Ting, Shi, Chengchun ORCID: 0000-0001-7773-2099, Wang, Jianing, Zhou, Fan and Zhu, Hongtu (2023) Optimal treatment allocation for efficient policy evaluation in sequential decision making. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M. and Levine, S. (eds.) Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Neural Information Processing Systems Foundation.
Shi, Chengchun ORCID: 0000-0001-7773-2099, Qi, Zhengling, Wang, Jianing and Zhou, Fan (2023) Value enhancement of reinforcement learning via efficient and robust trust region optimization. Journal of the American Statistical Association. pp. 1-15. ISSN 0162-1459