Unsupervised Estimation of Knowledge Granularity Based on an Independence Measure

Abstract

Modeling the association between items in a dataset is a frequently encountered problem in data and knowledge mining research. Most previous studies simply apply a predefined, fixed pattern to extract substructures from each item pair and then analyze the association between these substructures. Such fixed patterns may, however, fail to capture the significant association. To address this problem, we propose a novel machine learning task: extracting a strongly associated substructure pair (a co-substructure) from each input item pair. We formalize the task as a dependence maximization problem. We then discuss two critical issues in the task, namely data sparsity and a huge search space. To address data sparsity, we adopt the Hilbert–Schmidt independence criterion as the objective function. To improve search efficiency, we adopt the Metropolis–Hastings algorithm. We report empirical evaluations in which the proposed method is applied to the acquisition of narrative event pairs, a knowledge mining task that is actively studied in natural language processing.
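As a rough illustration of the kind of dependence measure the abstract names, the following sketch computes the standard biased empirical estimate of the Hilbert–Schmidt independence criterion, tr(KHLH)/n^2, for paired scalar samples. It is not the authors' implementation: the Gaussian kernel, the bandwidth `sigma`, the scalar inputs, and the function names `gaussian_gram`, `center`, and `hsic` are all assumptions made for this example, and the abstract does not state which kernels or normalization the paper uses.

```python
import math


def gaussian_gram(xs, sigma=1.0):
    """Gram matrix of a Gaussian kernel over a list of scalars (assumed kernel)."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2)) for b in xs]
            for a in xs]


def center(K):
    """Double-center a square matrix, i.e. compute H K H with H = I - (1/n) 11^T."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC, tr(K H L H) / n^2, for paired samples xs and ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma))
    L = gaussian_gram(ys, sigma)
    # tr(H K H L) equals the elementwise sum of Kc * L because L is symmetric.
    return sum(Kc[i][j] * L[i][j] for i in range(n) for j in range(n)) / (n * n)
```

Larger values indicate stronger estimated dependence between the two samples. In a dependence maximization setting such as the one described above, a score of this kind would be evaluated on candidate substructure pairs, presumably with kernels defined over substructures rather than over scalars.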

Publication
Proceedings of the 31st Annual Conference of the Japanese Society for Artificial Intelligence
Date