Research

Preference-based Reinforcement Learning
Preferences are relative human evaluations of agent behaviors. This project aims to train RL agents using such evaluations, enabling agents to learn from humans.

Improving Pairwise Rank Aggregation via Query for Rank Difference
This study investigates how to elicit and utilize information about differences in objects' rankings.