KCI-indexed
Topic Classification of Research Paper Using Ensemble Algorithms and BERT
(Original Korean title: 앙상블 알고리즘과 BERT를 이용한 연구논문 주제영역 분류)
김성현 (Kim Sung Hyun), 김영민 (Kim Young Min)
DOI 10.35373/KMES.29.1.2
UCI I410-151-25-02-091392361

Purpose: To develop and compare models that classify the topic of a research paper from its abstract text.
Methods: Abstracts from 120,000 arXiv papers were collected, and classification models were built using ensemble algorithms and BERT. For the ensemble algorithms, features were extracted with TF-IDF, LDA, and Doc2Vec to create seven feature sets. In total, 22 models were developed from different combinations of feature sets and algorithms, and their performance was compared.
Results: The BERT model performed best, with an accuracy of 0.848 and an F1-score of 0.808. Among the ensemble algorithms, LightGBM performed particularly well, and TF-IDF vectorization, which reflects word importance directly, proved effective.
Conclusion: A model that automatically classifies paper topics from text helps researchers reach the latest information quickly and identify work relevant to their interests. This improves access to information across research fields and may help researchers in diverse domains gain new insights.
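To illustrate why TF-IDF "reflects word importance directly", the sketch below computes TF-IDF weights in plain Python. It assumes the basic tf × log(N / df) variant with whitespace tokenization. The abstract does not state the weighting variant, tokenizer, or library the authors used, and the toy abstracts and the function name `tfidf` are invented for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df).

    Assumption: plain whitespace tokenization and unsmoothed IDF; the
    paper's actual preprocessing and weighting are not given in the abstract.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Toy abstracts, invented for illustration only (not from the paper's data).
abstracts = [
    "we study quantum entanglement in spin chains",
    "we train neural networks for image classification",
    "we prove a bound for neural networks",
]
vectors = tfidf(abstracts)
```

A term that occurs in every document (here "we") gets weight zero, while a term confined to one document (such as "quantum") gets a higher weight than one shared by several documents (such as "neural"). Sparse vectors of this kind are the sort of input a tree-based ensemble classifier can be trained on.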

1. Introduction
2. Related Work
3. Research Methods
4. Models
5. Results
6. Conclusion
References
[Source: Naver Academic Information]