A Comparative Study on Similarity Measure Techniques for Cross-Project Defect Prediction (교차 프로젝트 결함 예측을 위한 유사도 측정 기법 비교 연구)

Duksan Ryu; Jongmoon Baik

Korea Information Processing Society, KIPS Transactions on Software and Data Engineering (KTSDE)

KCI-indexed

A Comparative Study on Similarity Measure Techniques for Cross-Project Defect Prediction

Duksan Ryu (류덕산), Jongmoon Baik (백종문)

Korea Information Processing Society, June 2018

KIPS Transactions on Software and Data Engineering (KTSDE), Vol. 7, No. 6, pp. 205-220 (16 pages)

UCI I410-ECN-0102-2018-500-003797320

Abstract

Software defect prediction helps allocate limited project resources to software quality assurance activities by focusing effort on the modules identified as fault-prone. When sufficient historical data has been collected within a company, Within-Project Defect Prediction (WPDP) can be used to predict fault-prone modules accurately. When a company has not maintained historical data, a fault-prediction classifier can instead be built with Cross-Project Defect Prediction (CPDP). Because CPDP builds its classifier from project data collected by other organizations, the main obstacle to an accurate classifier is that the source and target projects have different distributions. Identifying effective similarity measure techniques is therefore crucial to obtaining high CPDP performance. In this paper, we apply various similarity measure techniques to a CPDP model and compare their performance. We evaluate the effectiveness of the similarity weights computed by each technique and verify the results with a statistical significance test and an effect size test. The results show that k-Nearest Neighbor (k-NN), LOcal Correlation Integral (LOCI), and Range are the top three similarity measure techniques, and that the predictive performance of CPDP using these three methods is comparable to that of WPDP.
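As an illustration of what a similarity weight can look like, the sketch below scores each source-project instance against a target project in two ways. Both formulas are assumptions made for illustration, not the definitions used in the paper: `range_similarity` is the fraction of features whose value falls inside the target project's observed [min, max] range, and `knn_similarity` maps the mean Euclidean distance to the k nearest target instances to a weight in (0, 1]. The function names, the `k` parameter, and the toy metric vectors are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def range_similarity(source: Matrix, target: Matrix) -> List[float]:
    """Assumed range-based weight: share of features of each source
    instance that lie within the target project's [min, max] range."""
    if not target:
        raise ValueError("target must contain at least one instance")
    n_features = len(target[0])
    lows = [min(row[j] for row in target) for j in range(n_features)]
    highs = [max(row[j] for row in target) for j in range(n_features)]
    weights = []
    for row in source:
        inside = sum(1 for j in range(n_features) if lows[j] <= row[j] <= highs[j])
        weights.append(inside / n_features)
    return weights


def knn_similarity(source: Matrix, target: Matrix, k: int = 3) -> List[float]:
    """Assumed k-NN-based weight: 1 / (1 + mean distance to the k nearest
    target instances), so closer source instances get larger weights."""
    if not target:
        raise ValueError("target must contain at least one instance")
    k = min(k, len(target))  # cannot use more neighbours than exist
    weights = []
    for row in source:
        distances = sorted(math.dist(row, t) for t in target)
        mean_dist = sum(distances[:k]) / k
        weights.append(1.0 / (1.0 + mean_dist))
    return weights


# Toy data: two software metrics per module (values are made up).
target_modules = [[0.0, 0.0], [10.0, 20.0]]
source_modules = [[1.0, 10.0], [5.0, 50.0], [100.0, 50.0]]

range_weights = range_similarity(source_modules, target_modules)
knn_weights = knn_similarity(source_modules, target_modules, k=1)
print(range_weights)
print(knn_weights)
```

Weights of this kind could, for example, be passed as per-instance sample weights when training a classifier on the source project, so that source modules resembling the target project count for more. In practice the features would normally be scaled first, since Euclidean distance is dominated by metrics with large numeric ranges.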

Keywords

Cross-Project Defect Prediction

Similarity Measure

Outlier Detection

1. Introduction
2. Related Work
3. Similarity Measure Techniques
4. Methodology
5. Result
6. Threats to Validity
7. Conclusion
References

[Data provided by NAVER Academic]