Automatic Image Captioning Methods
Ruchitesh Malukani1, Nihaal Subhash2, Chhaya Zala3
1Ruchitesh Malukani, Department of Computer Engineering, G. H. Patel College of Engineering & Technology, Anand, India.
2Nihaal Subhash, Department of Computer Engineering, G. H. Patel College of Engineering & Technology, Anand, India.
3Prof. Chhaya Zala, Department of Computer Engineering, G. H. Patel College of Engineering & Technology, Anand, India.
Manuscript received on 27 April 2020 | Revised Manuscript received on 09 May 2020 | Manuscript Published on 22 May 2020 | PP: 93-97 | Volume-9 Issue-7S July 2020 | Retrieval Number: 100.1/ijitee.G10130597S20 | DOI: 10.35940/ijitee.G1013.0597S20
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: A natural language is a language that humans use. Enabling computers to understand natural language and to generate a caption automatically from a given image is one of the most challenging tasks in computer science. Although a great deal of work has been done, a complete solution to this problem has so far proved elusive. Image captioning is a demanding task that combines image understanding with the ability to generate accurate, correctly structured sentences describing the image. It requires expertise in both image processing and natural language processing. In this paper, the authors propose a system that uses a multilayer Convolutional Neural Network (CNN) to generate words describing the image and a Long Short-Term Memory (LSTM) network to frame relevant phrases concisely from the derived keywords. We also give a brief overview of current deep learning methods and algorithms for image captioning, and we discuss the datasets and evaluation metrics widely used for the task.
Keywords: Image Captioning, Deep Learning, Computer Vision, Natural Language Processing, CNN, RNN, LSTM.
Scope of the Article: Image Security
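The abstract describes a CNN that encodes the image and an LSTM that frames the caption. The following is a minimal sketch of how the decoding half of such a pipeline can work, not the authors' implementation. It assumes the CNN has already produced a feature vector (here a hand-written list stands in for it), omits bias terms, and uses random, untrained weights, so the words it emits are meaningless. The names ToyCaptioner, VOCAB, and caption are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical toy vocabulary; a real system learns one from a caption dataset.
VOCAB = ["<start>", "<end>", "a", "dog", "runs", "on", "grass"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyCaptioner:
    """LSTM decoder conditioned on image features (untrained, biases omitted)."""

    def __init__(self, feat_dim=4, hidden=8, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        # Word embeddings, one row per vocabulary entry.
        self.embed = rand_matrix(len(VOCAB), hidden, rng)
        # Projects the (assumed) CNN feature vector to the initial hidden state.
        self.w_img = rand_matrix(hidden, feat_dim, rng)
        # One weight matrix per LSTM gate, applied to [input; hidden].
        self.w_gates = {g: rand_matrix(hidden, 2 * hidden, rng)
                        for g in ("i", "f", "o", "g")}
        # Maps the hidden state to one score per vocabulary word.
        self.w_out = rand_matrix(len(VOCAB), hidden, rng)

    def lstm_step(self, x, h, c):
        z = x + h  # list concatenation: [input; previous hidden state]
        i = [sigmoid(v) for v in matvec(self.w_gates["i"], z)]    # input gate
        f = [sigmoid(v) for v in matvec(self.w_gates["f"], z)]    # forget gate
        o = [sigmoid(v) for v in matvec(self.w_gates["o"], z)]    # output gate
        g = [math.tanh(v) for v in matvec(self.w_gates["g"], z)]  # candidate
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def caption(self, image_features, max_len=10):
        """Greedy decoding: pick the highest-scoring word at every step."""
        h = [math.tanh(v) for v in matvec(self.w_img, image_features)]
        c = [0.0] * self.hidden
        token = VOCAB.index("<start>")
        words = []
        for _ in range(max_len):
            h, c = self.lstm_step(self.embed[token], h, c)
            scores = matvec(self.w_out, h)
            token = max(range(len(VOCAB)), key=scores.__getitem__)
            if VOCAB[token] == "<end>":
                break
            words.append(VOCAB[token])
        return words


if __name__ == "__main__":
    # Stand-in for the output of a CNN encoder (an assumption of this sketch).
    features = [0.1, 0.5, -0.3, 0.8]
    print(ToyCaptioner(seed=0).caption(features))
```

In a trained system the weights would be learned jointly from image and caption pairs, and the feature vector would come from a CNN rather than a literal list. Greedy decoding is only one possible choice; beam search is a common alternative.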