Guoxi Zhang

Graduate School of Informatics, Kyoto University
Yoshidahonmachi, Sakyo Ward,
Kyoto, Japan 606-8501
I am a PhD student in the Graduate School of Informatics, Kyoto University, where I have the good fortune to work with Prof. Hisashi Kashima. I am interested in leveraging human guidance to train reinforcement learning agents and in making these agents more interpretable.
In particular, I am working on preference-based reinforcement learning in the offline setting, which aims to train agents that comply with human preferences using pre-collected experience.