Purpose This study explores the performance of various stacking models by applying them to two datasets with very different characteristics.
Methods The base models are decision tree, random forest, Naive Bayes, and logistic regression, while a support vector machine is adopted as the meta model. The two datasets are the ‘hmeq’ data and the ‘bankrupt’ data. Performance is measured by accuracy, sensitivity, specificity, false positive rate, and false negative rate.
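The stacking architecture described above can be sketched as follows. This is a minimal illustration using scikit-learn, not the authors' exact pipeline; the synthetic dataset and all hyperparameters are assumptions standing in for the real ‘hmeq’ and ‘bankrupt’ data.

```python
# Sketch only: four base learners feeding an SVM meta model, evaluated
# with the metrics named in the abstract. Dataset is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(random_state=0)),
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=SVC(),  # support vector machine as the meta model
)
stack.fit(X_tr, y_tr)

# Evaluation metrics: accuracy, sensitivity, specificity, FPR, FNR
tn, fp, fn, tp = confusion_matrix(y_te, stack.predict(X_te)).ravel()
accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
fpr = fp / (fp + tn)           # false positive rate
fnr = fn / (fn + tp)           # false negative rate
```

Note that the false positive rate is the complement of specificity and the false negative rate is the complement of sensitivity, so the five metrics carry three independent quantities.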
Results For the ‘hmeq’ data, which are well refined, random forest achieves superior performance that none of the stacking models exceeds. For the ‘bankrupt’ data, which are raw and contain much noise, the stacking models generally perform better than the individual base models; stacking model 5 performs best in particular.
Conclusion The empirical results indicate that when data are highly refined and have a limited number of input variables, the stacking approach is not a good strategy; choosing a single model with highly tunable hyperparameters would be a better choice. On the other hand, when data are nearly raw with many input variables, stacking can outperform individual models by integrating each model’s ability to capture underlying patterns.