Guoxi Zhang


Graduate School of Informatics, Kyoto University

Yoshidahonmachi, Sakyo Ward,

Kyoto, Japan 606-8501

I am a PhD student in the Graduate School of Informatics, Kyoto University, where I have the good fortune to work with Prof. Hisashi Kashima. I am interested in leveraging human guidance to train reinforcement learning agents and in making these agents more interpretable.

In particular, I am working on preference-based reinforcement learning in the offline setting, which aims to train agents that comply with human preferences using pre-collected experience.