An Overview of Preprocessing on Web Log Data for Web Usage Analysis
Naga Lakshmi1, Raja Sekhara Rao2, Sai Satyanarayana Reddy3

1Naga Lakshmi Theerthala, Assistant Professor, Department of Information Technology, Usha Rama College of Engineering and Technology, Telaprolu, Unguturu (A.P), India.
2Dr. Raja Sekhara Rao Kurra, Professor and Dean, Department of Computer Science and Engineering, KL University, Vaddeswaram, Guntur (A.P), India.
3Dr. Sai Satyanarayana Reddy Seelam, Professor and Head, Department of Computer Science and Engineering, Lakireddy Balireddy College of Engineering, Mylavaram (A.P), India.
Manuscript received on 12 March 2013 | Revised Manuscript received on 21 March 2013 | Manuscript Published on 30 March 2013 | PP: 274-279 | Volume-2 Issue-4, March 2013 | Retrieval Number: D0584032413/13©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The Web has grown into a dominant platform for retrieving information and discovering knowledge from web data. Web usage data is recorded in web server log files. Web usage analysis, also known as web usage mining, web log mining, or click stream analysis, is the process of extracting useful knowledge from web server logs, database logs, user queries, client-side cookies, and user profiles in order to analyze the behavior of web users. Web usage analysis requires data abstraction for pattern discovery, and this abstraction is achieved through data preprocessing. This paper presents the different formats of web server log files and describes how web server log data is preprocessed for web usage analysis.
Keywords: Web Server Logs, Web Usage Analysis, Preprocessing, Data Cleaning, User Identification, Session Identification, Path Completion, Pattern Discovery, Pattern Analysis.

Scope of the Article: Data Mining