Usage Analysis of YouTube Science Channels and Evaluation of Channel Classification Prediction Models - Focusing on Social Big Data Analysis and Machine Learning -

Hyunguk Kim; Jinwoong Song

KCI-listed journal

Analysis of Utilization and Assessment of Predicting Models for YouTube Science Channel - Focusing on using Social big data analysis and Machine learning -

Hyunguk Kim (김형욱), Jinwoong Song (송진웅)

Korean Society for Educational Technology, 2020.06

Journal of Educational Technology (교육공학연구), Vol. 36, No. 2, pp. 383-412 (30 pages)

DOI 10.17232/KSET.36.2.383

UCI I410-ECN-0102-2021-000-000790629

Abstract

This study examined public interest in YouTube science channels, along with their usage patterns and characteristics. Two channels with large numbers of subscribers and videos were selected and analyzed from the perspective of social big data analysis. In addition, prediction models for YouTube channel classification were evaluated using machine learning, to explore whether public reactions can be investigated systematically. The results showed that "1분과학" (One-Minute Science) had higher averages than "과학쿠키" (Science Cookie) on all four YouTube indicators: views, likes, dislikes, and comments. However, in the trend-line analysis of how views translate into likes, Science Cookie showed a higher level than One-Minute Science. The videos that drew the most likes dealt with space and quantum mechanics. Comment trends analyzed from the opening of each channel also changed noticeably when videos on these two topics attracted attention or were uploaded. Among the channel classification models, the SVM with a sigmoid kernel function performed best, with an accuracy of 90.06%. The random forest and logistic regression models also achieved high accuracies of 89.96% and 88.20%, respectively. The decision tree and k-nearest neighbors (KNN) models were somewhat less accurate than these models, and the artificial neural network model did not improve despite various combinations of activation functions and numbers of hidden nodes. Based on these results, big data analysis and machine learning should be used to identify the public's interests quickly and to provide education in the areas where that interest is growing.
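The abstract's comparison of how views translate into likes can be illustrated with a small sketch. It assumes the trend line is an ordinary least-squares line of likes on views and that the "level" being compared is its slope. The abstract does not state either detail, so both are assumptions. The function name `fit_trend_line`, the labels `channel_a` and `channel_b`, and all numbers are hypothetical and are not the paper's data.

```python
def fit_trend_line(views, likes):
    """Return (slope, intercept) of the least-squares line
    likes = slope * views + intercept."""
    n = len(views)
    if n < 2 or n != len(likes):
        raise ValueError("need at least two paired observations")
    mean_x = sum(views) / n
    mean_y = sum(likes) / n
    sxx = sum((x - mean_x) ** 2 for x in views)
    if sxx == 0:
        raise ValueError("views must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(views, likes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up per-video statistics, chosen only to mimic the pattern in the abstract:
# channel_a has more views on average, channel_b gains more likes per view.
channel_a = {"views": [100_000, 200_000, 400_000, 800_000],
             "likes": [2_000, 4_100, 7_900, 16_000]}
channel_b = {"views": [20_000, 40_000, 80_000, 160_000],
             "likes": [1_000, 2_050, 3_950, 8_000]}

slope_a, intercept_a = fit_trend_line(channel_a["views"], channel_a["likes"])
slope_b, intercept_b = fit_trend_line(channel_b["views"], channel_b["likes"])

print(f"channel_a: {slope_a:.4f} likes per additional view")
print(f"channel_b: {slope_b:.4f} likes per additional view")
```

With these illustrative numbers, the channel with the lower average view count has the steeper trend line. This is the kind of contrast the abstract reports between the two channels.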

Keywords

Ⅰ. Introduction
Ⅱ. Research Methods
Ⅲ. Results
Ⅳ. Conclusion and Discussion
References

[Data provided by Naver Academic (네이버학술정보)]