Text and Data Formatting for Machine Learning
Balika J. Chelliah1, Arth Jain2, Utkarsh Singh3, Garima Mehta4
1Dr. Balika J. Chelliah, Computer Science Engineering, SRM Institute of Science and Technology.
2Arth Jain, Computer Science Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai.
3Utkarsh Singh, Computer Science Engineering, SRM Institute of Science and Technology, Ramapuram, Chennai.
4Garima Mehta, Department of Computer Science and Technology, SRM Institute of Science and Technology.
Manuscript received on October 17, 2019. | Revised Manuscript received on October 22, 2019. | Manuscript published on November 10, 2019. | PP: 2756-2760 | Volume-9 Issue-1, November 2019. | Retrieval Number: A5216119119/2019©BEIESP | DOI: 10.35940/ijitee.A5216.119119
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Machine learning is a prominent tool for extracting knowledge from large amounts of data. While a great deal of machine learning research has focused on improving the accuracy and efficiency of training and inference algorithms, far less attention has been paid to the equally important problem of monitoring the quality of the data fed into the machine learning model. The quality of big data is far from perfect. Recent studies have shown that poor data quality can introduce serious errors into the results of big data analysis and can prevent more precise results from being drawn from the data. The advantages of data preprocessing in the context of ML include early detection of errors, improved model quality through the use of better data, and savings in the engineering hours spent debugging issues.
Keywords: Data Science, Dataset, Text Preprocessing
Scope of the Article: Machine Learning