KCI-indexed
A Comparison of Ensemble Methods Combining Resampling Techniques for Class Imbalanced Data
Hee Jae Lee, Sung Im Lee
응용통계연구 (The Korean Journal of Applied Statistics), Vol. 27, No. 3, pp. 357-371 (15 pages)
UCI I410-ECN-0102-2015-300-000239769

Many studies address imbalanced data, in which the class distribution is highly skewed. To deal with this problem, previous studies have used resampling techniques that correct the skewness of the class distribution in each sampled subset through under-sampling, over-sampling, or hybrid sampling such as SMOTE. Ensemble methods have also been used to alleviate the class imbalance problem. In this paper, we compare around a dozen algorithms that combine ensemble methods with resampling techniques, using simulated data sets generated by the Backbone model, which allows the imbalance rate to be controlled. Results on various real imbalanced data sets are also presented to compare the effectiveness of the algorithms. Based on these results, we highly recommend combining resampling techniques with ensemble methods for imbalanced data in which the proportion of the minority class is less than 10%. We also find that each ensemble method has a well-matched sampling technique: algorithms that combine bagging or random forest ensembles with random under-sampling tend to perform well, whereas the boosting ensemble appears to perform better with over-sampling. All ensemble methods combined with SMOTE perform well in most situations.
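As a rough illustration of what pairing a bagging-style ensemble with random under-sampling can look like, the sketch below trains each ensemble member on a class-balanced subset and combines the members by majority vote. It is only a minimal sketch under stated assumptions, not the algorithms evaluated in the paper: the nearest-centroid base learner, the function names, and the default of 25 members are all hypothetical choices made for brevity, and each bag is drawn by under-sampling without replacement rather than by a classical bootstrap.

```python
import random
from collections import Counter


def random_undersample(X, y, rng):
    """Return a class-balanced subset by keeping a random sample of each class
    whose size equals the size of the smallest class."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    n_min = min(len(rows) for rows in by_class.values())
    X_bal, y_bal = [], []
    for label, rows in by_class.items():
        for features in rng.sample(rows, n_min):
            X_bal.append(features)
            y_bal.append(label)
    return X_bal, y_bal


def fit_centroids(X, y):
    """Toy base learner (an assumption, not from the paper): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroids(model, features):
    """Predict the class whose centroid is nearest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


def fit_under_bagging(X, y, n_estimators=25, seed=0):
    """Train one base learner per independently under-sampled, balanced subset."""
    rng = random.Random(seed)
    return [fit_centroids(*random_undersample(X, y, rng)) for _ in range(n_estimators)]


def predict_under_bagging(models, features):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroids(m, features) for m in models)
    return votes.most_common(1)[0][0]
```

Because every member sees the same number of minority and majority examples, no single member is dominated by the majority class, while the ensemble as a whole still draws on many different majority-class examples across its subsets.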

[Data provided by: 네이버학술정보 (Naver Academic)]