Mining and YouTube Data Analysis using Hadoop
B. Uma Maheswari1, N. Mythili2

1B. Uma Maheswari, Associate Professor, St. Joseph's College of Engineering, Chennai (Tamil Nadu) India.
2N. Mythili, Assistant Professor, St. Joseph's College of Engineering, Chennai (Tamil Nadu) India.
Manuscript received on December 16, 2019. | Revised Manuscript received on December 22, 2019. | Manuscript published on January 10, 2020. | PP: 1461-1465 | Volume-9 Issue-3, January 2020. | Retrieval Number: B7922129219/2020©BEIESP | DOI: 10.35940/ijitee.B7922.019320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The analysis of structured, consistent data has seen remarkable success in past decades, whereas the analysis of unstructured data in multimedia formats remains a challenging task. YouTube is one of the most popular and widely used social media tools. It reveals community feedback through the comments on published videos, the numbers of likes and dislikes, and the number of subscribers to a particular channel. The main objective of this work is to demonstrate, using Apache Hadoop framework concepts, how data generated from YouTube can be mined and used to make targeted, real-time and informed decisions. In this paper, we analyze the data to identify the top categories, that is, those in which the largest number of videos are uploaded. The YouTube data is publicly available, and the data set is described under the heading Data Set Description. The dataset is fetched from Google using the YouTube API (Application Programming Interface) and stored in the Hadoop Distributed File System (HDFS). MapReduce is then used to analyze the dataset and identify the video categories with the most uploads.
Keywords: MapReduce, Mapper Algorithm, Reducer Algorithm, YouTube, Data Analysis
Scope of the Article: Decision making
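
Illustrative sketch: The category count described in the abstract can be pictured as a map step that emits a (category, 1) pair for each video record and a reduce step that sums the pairs for each category. The Python below only imitates that flow in memory. It is not the authors' Hadoop implementation, and the tab-separated record layout, the position of the category field (CATEGORY_FIELD) and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumption: each record is one tab-separated line and the video category
# sits in the fourth column. The real data set layout may differ.
CATEGORY_FIELD = 3


def map_record(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (category, 1) for every well-formed record."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) > CATEGORY_FIELD and fields[CATEGORY_FIELD].strip():
        yield fields[CATEGORY_FIELD].strip(), 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, standing in for Hadoop's shuffle phase."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(category: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the ones emitted for a single category."""
    return category, sum(counts)


def top_categories(lines: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Run map, shuffle and reduce, then return the n largest categories."""
    pairs = (pair for line in lines for pair in map_record(line))
    totals = [reduce_counts(cat, vals) for cat, vals in shuffle(pairs).items()]
    # Sort by descending count; ties are broken alphabetically by category.
    return sorted(totals, key=lambda kv: (-kv[1], kv[0]))[:n]
```

Records with too few fields or an empty category are skipped rather than counted, which is one possible way to handle malformed rows in crawled data.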