Uehara, Masatoshi, Shi, Chengchun (ORCID: 0000-0001-7773-2099) and Kallus, Nathan (2025) A review of off-policy evaluation in reinforcement learning. Statistical Science. ISSN 0883-4237 (In Press)

Shi, Chengchun (ORCID: 0000-0001-7773-2099), Uehara, Masatoshi, Huang, Jiawei and Jiang, Nan (2022) A minimax learning approach to off-policy evaluation in confounded Partially Observable Markov Decision Processes. Proceedings of Machine Learning Research. ISSN 2640-3498

Uehara, Masatoshi, Kiyohara, Haruka, Bennett, Andrew, Chernozhukov, Victor, Jiang, Nan, Kallus, Nathan, Shi, Chengchun (ORCID: 0000-0001-7773-2099) and Sun, Wenguang (2023) Future-dependent value-based off-policy evaluation in POMDPs. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M. and Levine, S., (eds.) Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Neural Information Processing Systems Foundation.