Publications
Journal Papers
2023
- Learning state importance for preference-based reinforcement learning. Guoxi Zhang and Hisashi Kashima. Machine Learning, 2023
Preference-based reinforcement learning (PbRL) develops agents using human preferences. Due to its empirical success, it has the prospect of benefiting human-centered applications. Meanwhile, previous work on PbRL overlooks interpretability, which is an indispensable element of ethical artificial intelligence (AI). While prior art for explainable AI offers some machinery, an approach for selecting samples to construct explanations is lacking. This becomes an issue for PbRL, as transitions relevant to task solving are often outnumbered by irrelevant ones. Thus, ad-hoc sample selection undermines the credibility of explanations. The present study proposes a framework for learning reward functions and state importance from preferences simultaneously. It offers a systematic approach for selecting samples when constructing explanations. Moreover, the present study proposes a perturbation analysis to evaluate the learned state importance quantitatively. Through experiments on discrete and continuous control tasks, the present study demonstrates the proposed framework’s efficacy for providing interpretability without sacrificing task performance.
@article{Zhang:2023aa,
  author = {Zhang, Guoxi and Kashima, Hisashi},
  date = {2023/01/09},
  doi = {10.1007/s10994-022-06295-5},
  id = {Zhang2023},
  isbn = {1573-0565},
  journal = {Machine Learning},
  title = {Learning state importance for preference-based reinforcement learning},
  url = {https://doi.org/10.1007/s10994-022-06295-5},
  year = {2023},
  bdsk-url-1 = {https://doi.org/10.1007/s10994-022-06295-5},
}
2022
- Machine Learning in Materials Chemistry: An Invitation. Daniel Packwood, Linh Thi Hoai Nguyen, Pierluigi Cesana, Guoxi Zhang, Aleksandar Staykov, Yasuhide Fukumoto, and Dinh Hoa Nguyen. Machine Learning with Applications, 2022
Materials chemistry is being profoundly influenced by the uptake of machine learning methodologies. Machine learning techniques, in combination with established techniques from computational physics, promise to accelerate the discovery of new materials by elucidating complex structure–property relationships from massive material databases. Despite exciting possibilities, further methodological developments call for greater synergy between materials chemists, physicists, and engineers on one side and computer science and math majors on the other. In this review, we provide a non-exhaustive account of machine learning in materials chemistry for computer scientists and applied mathematicians, with an emphasis on molecule datasets and materials chemistry problems. The first part of this review provides a tutorial on how to prepare such datasets for subsequent model building, with an emphasis on the construction of feature vectors. We also provide a self-contained introduction to density functional theory, a method from computational physics which is widely used to generate datasets and compute response variables. The second part reviews two machine learning methodologies which represent the status quo in materials chemistry at present – kernelized machine learning and Bayesian machine learning – and discusses their application to real datasets. In the third part of the review, we introduce some emerging machine learning techniques which have not been widely adopted by materials scientists and therefore present potential avenues for computer science and applied math majors. In the concluding section, we discuss some recent machine learning-based approaches to real materials discovery problems and speculate on some promising future directions.
@article{PACKWOOD2022100265,
  author = {Packwood, Daniel and Nguyen, Linh Thi Hoai and Cesana, Pierluigi and Zhang, Guoxi and Staykov, Aleksandar and Fukumoto, Yasuhide and Nguyen, Dinh Hoa},
  journal = {Machine Learning with Applications},
  keywords = {Materials chemistry, Kernelized machine learning, Density functional theory, Bayesian optimization, Ensemble methods, Reinforcement learning, Federated learning},
  pages = {100265},
  title = {Machine Learning in Materials Chemistry: An Invitation},
  volume = {8},
  year = {2022},
  bdsk-url-1 = {https://www.sciencedirect.com/science/article/pii/S2666827022000093},
}
Conference Papers
2023
- (To Appear) Behavior Estimation from Multi-Source Data for Offline Reinforcement Learning. Guoxi Zhang and Hisashi Kashima. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
- (To Appear) Batch Reinforcement Learning from Crowds. Guoxi Zhang and Hisashi Kashima. In Machine Learning and Knowledge Discovery in Databases, 2022
A shortcoming of batch reinforcement learning is its requirement for rewards in data, which makes it inapplicable to tasks without reward functions. Existing settings for the lack of reward, such as behavioral cloning, rely on optimal demonstrations collected from humans. Unfortunately, extensive expertise is required for ensuring optimality, which hinders the acquisition of large-scale data for complex tasks. This paper addresses the lack of reward by learning a reward function from preferences between trajectories. Generating preferences only requires a basic understanding of a task, and it is faster than performing demonstrations. Thus, preferences can be collected at scale from non-expert humans using crowdsourcing. This paper tackles a critical challenge that emerges when collecting data from non-expert humans: the noise in preferences. A novel probabilistic model is proposed for modelling the reliability of labels, which utilizes labels collaboratively. Moreover, the proposed model smooths the estimation with a learned reward function. Evaluation on Atari datasets demonstrates the effectiveness of the proposed model, followed by an ablation study to analyze the relative importance of the proposed ideas.
- (To Appear) Improving Pairwise Rank Aggregation via Querying for Rank Difference. Guoxi Zhang, Jiyi Li, and Hisashi Kashima. In Proceedings of the Ninth IEEE International Conference on Data Science and Advanced Analytics, 2022
Pairwise rank aggregation (PRA) aims at learning a ranking from pairwise comparisons between objects that specify the relative ordering of objects. The present study proposes the use of rank difference information for PRA, which characterizes the extent to which winners in paired comparisons beat their opponents. While such information can be effortlessly recognized by annotators, to our knowledge, it has not been utilized for PRA before. The challenge is three-fold: how to solicit such information, how to utilize it in rank aggregation, and how to overcome the noise from heterogeneous annotators. The present study proposes a new query for soliciting information about rank difference from annotators with limited cognitive burden. As prior methods for PRA abound, an objective is to empower them with information on rank difference. To this end, the present study proposes a conservative learning objective that can be combined with many existing PRA algorithms in a straightforward manner. The third contribution is a new method for PRA called mixture of exponentials (MoE). Annotators from a heterogeneous population might have diverse views concerning rank difference. An annotator might be good at recognizing rank difference only for a subset of items but not the rest. This means that information about rank difference is likely to be perturbed. Unfortunately, such an object-dependent error pattern cannot be modeled with existing approaches. MoE assumes that each annotator uses a mixture of ranking functions in generating answers. The mixture components can capture object-related patterns in data. The present study evaluates the proposals with extensive experiments on both real and synthetic datasets. The results confirm the efficacy of the proposals and shed light on their practical usage.
2018
- On Reducing Dimensionality of Labeled Data Efficiently. Guoxi Zhang, Tomoharu Iwata, and Hisashi Kashima. In Advances in Knowledge Discovery and Data Mining, 2018
We address the problem of reducing dimensionality for labeled data. Our objective is to achieve better class separation in latent space. Existing nonlinear algorithms rely on pairwise distances between data samples, which are generally infeasible to compute or store in the large data limit. In this paper, we propose a parametric nonlinear algorithm that employs a spherical mixture model in the latent space. The proposed algorithm attains high efficiency in reducing data dimensionality, because it only requires distances between data points and cluster centers. In our experiments, the proposed algorithm achieves up to 44 times better efficiency while maintaining similar efficacy. In practice, it can be used to speed up k-NN classification or visualize data points with their class structure.
@inproceedings{10.1007/978-3-319-93040-4_7,
  address = {Melbourne, Australia},
  author = {Zhang, Guoxi and Iwata, Tomoharu and Kashima, Hisashi},
  booktitle = {Advances in Knowledge Discovery and Data Mining},
  pages = {77--88},
  publisher = {Springer Cham},
  title = {On Reducing Dimensionality of Labeled Data Efficiently},
  year = {2018},
}
2017
- Robust Multi-view Topic Modeling by Incorporating Detecting Anomalies. Guoxi Zhang, Tomoharu Iwata, and Hisashi Kashima. In Machine Learning and Knowledge Discovery in Databases, 2017
Multi-view text data consist of texts from different sources. For instance, multilingual Wikipedia corpora contain articles in different languages which are created by different groups of users. Because multi-view text data are often created in a distributed fashion, information from different sources may not be consistent. Such inconsistency introduces noise into the analysis of this kind of data. In this paper, we propose a probabilistic topic model for multi-view data, which is robust against noise. The proposed model can also be used for detecting anomalies. In our experiments on Wikipedia data sets, the proposed model is more robust than existing multi-view topic models in terms of held-out perplexity.
@inproceedings{10.1007/978-3-319-71246-8_15,
  address = {Skopje, Macedonia},
  author = {Zhang, Guoxi and Iwata, Tomoharu and Kashima, Hisashi},
  booktitle = {Machine Learning and Knowledge Discovery in Databases},
  pages = {238--250},
  publisher = {Springer Cham},
  title = {Robust Multi-view Topic Modeling by Incorporating Detecting Anomalies},
  year = {2017},
}