KCI Candidate
통계 기반 한국어 형태소 분석기의 성능 개선
Improving the Performance of Statistical Korean Morphological Analyzer
심광섭 (Kwangseob Shim)
人文科學硏究, Vol. 34, pp. 285-316 (32 pages)
UCI I410-ECN-0102-2016-000-000708579

Statistical Korean morphological analysis is a new approach that does not require a manually built machine-readable morphological dictionary. Instead, it relies on statistical information acquired from a POS-tagged corpus. Because this acquisition is fully automated, no human intervention is required. This is the main advantage of the statistical approach to Korean morphological analysis. Its main drawback is low precision: the number of false positives is relatively high. To improve precision, this paper proposes a method for filtering false positives. The method introduces two dictionaries, a one-syllable-morpheme dictionary and a josa-eomi dictionary, both constructed automatically while statistical information is collected from the POS-tagged corpus. The method is evaluated by 10-fold cross-validation on the 10-million-eojeol Sejong POS-tagged corpus. The experimental results show that precision improves by 5%.
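
As a rough illustration of the filtering idea described in the abstract, the following Python sketch builds the two dictionaries from a toy POS-tagged corpus and uses them to discard candidate analyses. The abstract does not specify the data format or the exact filtering rule, so everything here is an assumption: the corpus representation (lists of `(surface, tag)` pairs), the function names `build_dictionaries`, `is_plausible`, and `filter_candidates`, and the rule itself (reject a candidate if it contains a josa/eomi, or a one-syllable morpheme, that was never observed with that tag in the training data). The tag prefixes `J` (josa) and `E` (eomi) follow the Sejong tagset.

```python
# Hypothetical sketch: dictionary-based filtering of false-positive analyses.
# Assumes NFC-normalized text, so one Hangul syllable is one code point.

def is_josa_or_eomi(tag):
    """Sejong tags for josa start with 'J' (e.g. JX, JKB), eomi with 'E' (e.g. EF, ETM)."""
    return tag.startswith(("J", "E"))


def build_dictionaries(tagged_corpus):
    """Collect both dictionaries in a single pass over the tagged corpus.

    tagged_corpus: iterable of eojeol analyses, each a list of (surface, tag).
    Returns (one_syllable, josa_eomi), two sets of (surface, tag) pairs.
    """
    one_syllable, josa_eomi = set(), set()
    for analysis in tagged_corpus:
        for surface, tag in analysis:
            if len(surface) == 1:
                one_syllable.add((surface, tag))
            if is_josa_or_eomi(tag):
                josa_eomi.add((surface, tag))
    return one_syllable, josa_eomi


def is_plausible(analysis, one_syllable, josa_eomi):
    """Assumed rule: every josa/eomi and every one-syllable morpheme must be attested."""
    for surface, tag in analysis:
        if is_josa_or_eomi(tag):
            if (surface, tag) not in josa_eomi:
                return False
        elif len(surface) == 1 and (surface, tag) not in one_syllable:
            return False
    return True


def filter_candidates(candidates, one_syllable, josa_eomi):
    """Keep only the candidate analyses that pass the dictionary check."""
    return [c for c in candidates if is_plausible(c, one_syllable, josa_eomi)]


# Toy training data (made up for illustration, not taken from the Sejong corpus).
corpus = [
    [("나", "NP"), ("는", "JX")],
    [("학교", "NNG"), ("에", "JKB")],
    [("가", "VV"), ("ㄴ다", "EF")],
]
one_syllable, josa_eomi = build_dictionaries(corpus)

# Two candidate analyses of the eojeol "나는"; the second is unattested.
candidates = [
    [("나", "NP"), ("는", "JX")],
    [("나", "VV"), ("는", "ETM")],
]
kept = filter_candidates(candidates, one_syllable, josa_eomi)
```

In this toy setting only the first candidate survives, because neither `나/VV` nor `는/ETM` occurs in the training data. A filter of this kind can only remove candidates, which is why it targets precision (fewer false positives) rather than recall.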

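1. Introduction
2. Syllable-Based Korean Morphological Analysis
3. Statistical Korean Morphological Analysis
4. Methods for Improving the Performance of Statistical Korean Morphological Analysis
5. Experiments and Results
6. Conclusion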
[Data provided by: 네이버학술정보 (Naver Academic)]