Comparison of Various Classification Tree Methods with Clinical Data

Hyejung Shin, Yoondong Lee, Eun Kyung Lee

- Publisher: The Korean Society of Health Informatics and Statistics

- Publication year: 2016

- Journal: Journal of Health Informatics and Statistics, Vol. 41, No. 1

- Pages: pp. 135-146 (12 pages in total)


Abstract
Objectives: A classification tree is a statistical tool widely used in data mining. It is useful for making statistical decisions in areas such as medicine, biology, and business management. In this paper, we examine recently developed classification tree algorithms, compare them on real examples from medical studies, and provide a guideline for selecting appropriate methods for data analysis.

Methods: For the comparison, we used four clinical datasets from the UCI (University of California, Irvine) repository. We divided each dataset into a 2/3 training set and a 1/3 test set. After fitting the models with various R packages (tree, rpart, party, evtree, CORElearn, and randomForest), we calculated misclassification rates separately for the training and test data, and specificity and sensitivity for the test data. This procedure was repeated 200 times, and the misclassification rates were compared using one-way analysis of variance and Tukey's honest significant difference (HSD) test. Specificities and sensitivities were also compared.

Results: In every case, randomForest showed the best performance. Among the single-tree methods, performance differed from dataset to dataset; evtree performed better than the other methods on most datasets. Most sensitivities for the Breast Tissue and Dermatology data were quite large. rpart and ctree showed very low specificity on the Dermatology data.

Conclusions: Every method has its own characteristics, and performance depends on the data. Our study shows that the best single-tree method differed across the four example datasets and that evtree performed slightly better than the other single-tree methods on most datasets. randomForest always showed the best performance, mainly because it uses many trees instead of one.
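The evaluation protocol described under Methods (a 2/3 training and 1/3 test split, misclassification rates on both parts, sensitivity and specificity on the test part, repeated 200 times) can be illustrated roughly as follows. This is only a minimal Python sketch under stated assumptions: the data are synthetic, the classifier is a one-threshold stump standing in for the R tree packages named in the abstract, and every function name here is hypothetical. It does not reproduce the paper's models, datasets, or results, and it omits the ANOVA and Tukey HSD comparison.

```python
import random
from statistics import mean


def make_data(n, rng):
    """Synthetic two-class data (assumption): one numeric feature, label 1 = positive."""
    rows = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.5 if y == 1 else 0.0, 1.0)
        rows.append((x, y))
    return rows


def fit_stump(train):
    """Stand-in classifier: choose the threshold on x with the fewest training errors."""
    best_t, best_err = None, float("inf")
    for t, _ in train:
        # Predict positive when x > t; count disagreements with the true label.
        err = sum((x > t) != (y == 1) for x, y in train)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def evaluate(threshold, rows):
    """Return (misclassification rate, sensitivity, specificity) for one data part."""
    tp = sum(1 for x, y in rows if y == 1 and x > threshold)
    fn = sum(1 for x, y in rows if y == 1 and x <= threshold)
    tn = sum(1 for x, y in rows if y == 0 and x <= threshold)
    fp = sum(1 for x, y in rows if y == 0 and x > threshold)
    error = (fp + fn) / len(rows)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return error, sensitivity, specificity


def repeated_holdout(rows, repeats=200, seed=0):
    """Repeat a random 2/3 train, 1/3 test split and collect the metrics each time."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        cut = (2 * len(shuffled)) // 3
        train, test = shuffled[:cut], shuffled[cut:]
        threshold = fit_stump(train)
        train_err, _, _ = evaluate(threshold, train)
        test_err, sens, spec = evaluate(threshold, test)
        results.append((train_err, test_err, sens, spec))
    return results


data = make_data(150, random.Random(42))
results = repeated_holdout(data)
print("mean training misclassification:", round(mean(r[0] for r in results), 3))
print("mean test misclassification:", round(mean(r[1] for r in results), 3))
print("mean test sensitivity:", round(mean(r[2] for r in results), 3))
print("mean test specificity:", round(mean(r[3] for r in results), 3))
```

Collecting one row of metrics per repetition keeps the per-method distributions available for a later comparison step, such as the one-way ANOVA the abstract mentions.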

Paper information
  • - Subject: Medicine and Pharmacy > Preventive Medicine and Public Health
  • - UCI(KEPA) : I410-ECN-0102-2016-510-000711946
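Journal information
  • - Subject: Medicine and Pharmacy > Preventive Medicine and Public Health
  • - Type: Academic journal
  • - Frequency: Quarterly
  • - Domestic indexing: KCI-indexed
  • - International indexing: -
  • - ISSN : 2465-8014
  • - Coverage: 1976–2022
  • - Number of articles: 755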