Purpose To develop MLP and transformer-based models for the multi-label topic classification of research papers using abstract text.
Methods Abstracts from 119,600 papers in the Computer Science category of arXiv were collected to build a multi-label dataset in which each paper is assigned up to three of 15 possible categories. A baseline MLP model and three transformer-based models (BERT, RoBERTa, and DistilBERT) were developed, and their performance was compared.
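The multi-label target described above (up to three of 15 categories per paper) is typically encoded as a multi-hot vector before training. The following is a minimal sketch of that encoding step; the category names and paper labels are illustrative examples, not taken from the study's dataset.

```python
# Hypothetical sketch of multi-label target encoding for the task above.
# Category names and paper labels are illustrative, not from the study.
from sklearn.preprocessing import MultiLabelBinarizer

# A small subset of arXiv CS categories for illustration (the study uses 15).
categories = ["cs.AI", "cs.CL", "cs.CV", "cs.LG", "cs.NE"]
papers = [
    ["cs.AI", "cs.LG"],           # paper tagged with two categories
    ["cs.CL"],                    # single-category paper
    ["cs.CV", "cs.LG", "cs.AI"],  # the maximum of three categories
]

mlb = MultiLabelBinarizer(classes=categories)
Y = mlb.fit_transform(papers)  # multi-hot matrix, one row per paper
print(Y.shape)  # one row per paper, one column per category
```

Each row of `Y` then serves as the target for a sigmoid-output classifier trained with a binary cross-entropy loss, the standard setup for multi-label classification.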
Results The transformer models outperformed the baseline MLP model. The DistilBERT model achieved the highest micro F1-score, 0.749, while the BERT model recorded macro and weighted F1-scores of 0.655 and 0.733, respectively. The RoBERTa model achieved the best samples-averaged F1-score, 0.772.
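The micro, macro, weighted, and samples averages reported above are different ways of aggregating per-category F1-scores in a multi-label setting. The sketch below illustrates the distinction on toy multi-hot data (the arrays are made up for illustration and are unrelated to the study's results).

```python
# Illustrative comparison of F1 averaging methods for multi-label output.
# The toy ground truth and predictions below are not the study's data.
import numpy as np
from sklearn.metrics import f1_score

# 4 papers x 5 candidate categories, multi-hot encoded.
y_true = np.array([[1, 0, 1, 0, 0],
                   [0, 1, 0, 0, 1],
                   [1, 1, 0, 0, 0],
                   [0, 0, 0, 1, 0]])
y_pred = np.array([[1, 0, 0, 0, 0],
                   [0, 1, 0, 0, 1],
                   [1, 0, 0, 0, 0],
                   [0, 0, 0, 1, 1]])

# micro: pool TP/FP/FN over all labels; macro: unweighted mean of per-label
# F1; weighted: per-label F1 weighted by label frequency; samples: mean of
# per-paper F1 across papers.
for avg in ("micro", "macro", "weighted", "samples"):
    print(avg, round(f1_score(y_true, y_pred, average=avg), 3))
```

Because the averages respond differently to rare categories and to per-paper label overlap, it is common for different models to lead on different averages, as in the results above.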
Conclusion The resulting models can help researchers quickly survey recent findings and identify the topics relevant to their work. The study is also expected to contribute to more efficient sharing of academic knowledge and a more active research community.