Natural language image descriptor

Anurag Kishore, Sanjay Singh

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Automatically generating descriptions for visual data (images and video) has been a difficult task in Computer Vision and Artificial Intelligence. This paper discusses the workings of, and improvements to, the Neural Image Captioner (NIC) algorithm by Oriol Vinyals and his team, which uses a deep convolutional and recurrent architecture to generate natural language sentences describing the visual input. We examine whether the algorithm can be trained faster without losing accuracy by using techniques such as Stochastic Gradient Descent, and we also employ an algorithm to find the optimal depth of the convolutional part of the network for different datasets. We observed a 33% drop in the number of iterations required to bring the algorithm to the proficiency originally reported by Vinyals et al.
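The abstract names Stochastic Gradient Descent as one of the training techniques. As a rough illustration of the update rule only, here is a minimal sketch that fits a straight line to toy data with per-sample SGD. It is not the authors' training setup and has nothing NIC-specific in it: the function name `sgd_linear_fit`, the learning rate, the epoch count and the data are all assumptions made for this example.

```python
import random


def sgd_linear_fit(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w*x + b with plain stochastic gradient descent.

    Hypothetical helper for illustration: one sample per update step,
    squared-error loss, fixed learning rate.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for i in indices:
            err = (w * xs[i] + b) - ys[i]  # prediction error on a single sample
            # Gradient of 0.5 * err**2 with respect to w and b.
            w -= lr * err * xs[i]
            b -= lr * err
    return w, b


# Toy, noise-free data drawn from y = 2x + 1 (assumed values, not from the paper).
xs = [i / 10 for i in range(-10, 11)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_fit(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

Each step uses the gradient from a single sample rather than the whole dataset, which is what makes the method "stochastic". The number of such steps needed to reach a given error is the kind of iteration count the abstract's 33% figure refers to.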

Original language: English
Title of host publication: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 110-115
Number of pages: 6
ISBN (Electronic): 9781467366700
DOIs: https://doi.org/10.1109/RAICS.2015.7488398
Publication status: Published - 09-06-2016
Event: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015 - Trivandrum, Kerala, India
Duration: 10-12-2015 to 12-12-2015

Conference

Conference: 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015
Country: India
City: Trivandrum, Kerala
Period: 10-12-15 to 12-12-15

Fingerprint

Computer vision
Artificial intelligence

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Control and Systems Engineering

Cite this

Kishore, A., & Singh, S. (2016). Natural language image descriptor. In 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015 (pp. 110-115). [7488398] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/RAICS.2015.7488398
Kishore, Anurag ; Singh, Sanjay. / Natural language image descriptor. 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015. Institute of Electrical and Electronics Engineers Inc., 2016. pp. 110-115
@inproceedings{b96f939bead8450389bdc2a61961f791,
title = "Natural language image descriptor",
author = "Anurag Kishore and Sanjay Singh",
year = "2016",
month = "6",
day = "9",
doi = "10.1109/RAICS.2015.7488398",
language = "English",
pages = "110--115",
booktitle = "2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

Kishore, A & Singh, S 2016, Natural language image descriptor. in 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015., 7488398, Institute of Electrical and Electronics Engineers Inc., pp. 110-115, 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015, Trivandrum, Kerala, India, 10-12-15. https://doi.org/10.1109/RAICS.2015.7488398

TY - GEN

T1 - Natural language image descriptor

AU - Kishore, Anurag

AU - Singh, Sanjay

PY - 2016/6/9

Y1 - 2016/6/9

UR - http://www.scopus.com/inward/record.url?scp=84979009562&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979009562&partnerID=8YFLogxK

U2 - 10.1109/RAICS.2015.7488398

DO - 10.1109/RAICS.2015.7488398

M3 - Conference contribution

SP - 110

EP - 115

BT - 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015

PB - Institute of Electrical and Electronics Engineers Inc.

ER -

Kishore A, Singh S. Natural language image descriptor. In 2015 IEEE Recent Advances in Intelligent Computational Systems, RAICS 2015. Institute of Electrical and Electronics Engineers Inc. 2016. p. 110-115. 7488398 https://doi.org/10.1109/RAICS.2015.7488398