KCI-indexed, SCIE, SCOPUS
A Distance Approach for Open Information Extraction Based on Word Vector
Liu Peiqian, Wang Xiaojie
UCI I410-ECN-0102-2018-500-003791202

Web-scale open information extraction (Open IE) plays an important role in NLP tasks such as acquiring common-sense knowledge, learning selectional preferences, and automatic text understanding. Many Open IE approaches have been proposed in the last decade, and most of them rely on supervised learning or dependency parsing. In this paper, we present a novel method for web-scale open information extraction that uses cosine distance computed over Google word vectors as the confidence score of an extraction. The proposed method is purely unsupervised: it requires neither hand-labeled training data nor dependency parse features. We also give a mathematically rigorous proof for the new method using Bayesian inference and artificial neural network theory. The proof shows that the proposed algorithm is equivalent to maximum likelihood estimation of the joint probability distribution over the elements of a candidate extraction. The proof also suggests, on theoretical grounds, a typical usage of word vectors in other NLP tasks. Experiments on three benchmark datasets show that the distance-based method improves on recently proposed Open IE systems in both effectiveness and efficiency.
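
To make the idea of a distance-based confidence score concrete, here is a minimal sketch in Python. It is not the authors' algorithm: the abstract does not say how the elements of a candidate extraction are combined, so the scoring rule below (cosine distance between the relation vector and the mean of the two argument vectors) is an assumption, as are the function names and the three-dimensional toy vectors, which stand in for real pre-trained word vectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention for zero vectors, chosen for this sketch
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Cosine distance: 0 for parallel vectors, 1 for orthogonal ones."""
    return 1.0 - cosine_similarity(u, v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def score_extraction(arg1, rel, arg2, embeddings):
    """Hypothetical confidence score for a candidate (arg1, rel, arg2).

    Assumption: the relation vector is compared with the mean of the two
    argument vectors. A lower distance is read as higher confidence.
    """
    args_vec = mean_vector([embeddings[arg1], embeddings[arg2]])
    return cosine_distance(embeddings[rel], args_vec)


# Toy embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "paris": [0.9, 0.1, 0.0],
    "france": [0.8, 0.2, 0.1],
    "capital_of": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.1, 0.9],
}

good = score_extraction("paris", "capital_of", "france", TOY_EMBEDDINGS)
bad = score_extraction("paris", "capital_of", "banana", TOY_EMBEDDINGS)
print(f"plausible triple distance:   {good:.3f}")
print(f"implausible triple distance: {bad:.3f}")
```

With these toy vectors the plausible triple gets a smaller distance than the implausible one, which is the ranking behaviour a distance-based confidence score is meant to provide. With real data, the dictionary would be replaced by lookups into pre-trained word vectors.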

1. Introduction
2. Related Work
3. The Distance Approach
4. Experimental Results and Analysis
5. Conclusion and Future Work
References
[Data provided by: Naver Academic Information]