Zhu, Jin ORCID: 0000-0001-8550-5822, Wan, Runzhe, Qi, Zhengling, Luo, Shikai and Shi, Chengchun ORCID: 0000-0001-7773-2099
(2024)
Robust offline reinforcement learning with heavy-tailed rewards.
Proceedings of Machine Learning Research, 238.
541 - 549.
ISSN 2640-3498
Shi, Chengchun ORCID: 0000-0001-7773-2099, Wan, Runzhe, Song, Ge, Luo, Shikai, Zhu, Hongtu and Song, Rui
(2023)
A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets.
Annals of Applied Statistics, 17 (4).
2701 - 2722.
ISSN 1932-6157
Wan, Runzhe, Zhang, Sheng, Shi, Chengchun ORCID: 0000-0001-7773-2099, Luo, Shikai and Song, Rui
(2021)
Pattern transfer learning for reinforcement learning in order dispatching.
In: International Joint Conference on Artificial Intelligence, 2021-08-19 - 2021-08-26.
(In Press)
Shi, Chengchun ORCID: 0000-0001-7773-2099, Wan, Runzhe, Chernozhukov, Victor and Song, Rui
(2021)
Deeply-debiased off-policy interval estimation.
In: International Conference on Machine Learning, 2021-07-18 - 2021-07-24, Online.
(In Press)
Shi, Chengchun ORCID: 0000-0001-7773-2099, Wan, Runzhe, Song, Rui, Lu, Wenbin and Leng, Ling
(2020)
Does the Markov decision process fit the data: testing for the Markov property in sequential decision making.
In: International Conference on Machine Learning, 2020-07-12 - 2020-07-18, Online.
(In Press)