Shi, Chengchun ORCID: 0000-0001-7773-2099, Qi, Zhengling, Wang, Jianing and Zhou, Fan (2023) Value enhancement of reinforcement learning via efficient and robust trust region optimization. Journal of the American Statistical Association. pp. 1-15. ISSN 0162-1459
Li, Ting, Shi, Chengchun ORCID: 0000-0001-7773-2099, Wang, Jianing, Zhou, Fan and Zhu, Hongtu (2023) Optimal treatment allocation for efficient policy evaluation in sequential decision making. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M. and Levine, S. (eds.) Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Neural Information Processing Systems Foundation.
Yu, Shuguang, Fang, Shuxing, Peng, Ruixin, Qi, Zhengling, Zhou, Fan and Shi, Chengchun ORCID: 0000-0001-7773-2099 (2024) Two-way deconfounder for off-policy evaluation in causal reinforcement learning. In: 38th Annual Conference on Neural Information Processing Systems, 2024-12-10 - 2024-12-15, Vancouver Convention Center, Vancouver, Canada. (In Press)