
An Efficient Approach of Image Parsing


Affiliations
1 RCET Bhilai, Chhattisgarh, India
2 CSE, RCET Chhattisgarh, India
     



In this paper, we present an image-parsing-to-text-description framework that generates text descriptions of image and video content based on image understanding. Image parsing is the problem of assigning an object label to each pixel; it unifies the image segmentation and object recognition problems. The framework follows three steps: 1) Input images are decomposed into their constituent visual patterns by an image parsing engine. 2) The image parsing results are converted into a semantic representation in the form of the Web Ontology Language (OWL). 3) A text generation engine converts the results of the previous steps into semantically meaningful, human-readable, and queryable text reports. Pixels cannot be labeled in this manner based only on their intensities or even on local feature descriptors, so contextual information plays a critical role in resolving ambiguities. The proposed framework has two aims. First, we use a semi-automatic method to parse images from the Internet. Second, we use automatic methods to parse images and video in specific domains and generate text reports that are useful for real-world applications. Image parsing can be posed as a supervised learning problem in which a classifier is learnt from training data consisting of images and corresponding label maps. Auto-context and convolutional networks are two promising approaches that apply context to image parsing in the supervised learning setting. Convolutional networks are a type of artificial neural network (ANN) in which each processing element carries out a convolution followed by a nonlinearity.
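The description of a processing element as a convolution followed by a nonlinearity, and the role of context in labeling pixels, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names (`conv2d`, `unit`, `label_pixels`), the 3x3 averaging kernels, the biases, and the choice of tanh are all assumptions made for illustration, and the weights are hand-picked rather than learnt from label maps.

```python
import math

def conv2d(image, kernel):
    """Apply `kernel` at every pixel of `image` (both lists of lists).

    Border pixels reuse the nearest edge value (clamped indexing), so the
    output has the same size as the input. The kernel is not flipped, which
    is the common convention in convolutional networks.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy = min(max(y + i - kh // 2, 0), h - 1)
                    xx = min(max(x + j - kw // 2, 0), w - 1)
                    acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out

def unit(image, kernel, bias):
    """One processing element: convolution, then a tanh nonlinearity."""
    return [[math.tanh(v + bias) for v in row] for row in conv2d(image, kernel)]

def label_pixels(image, units):
    """Give each pixel the index of the unit that responds most strongly."""
    maps = [unit(image, k, b) for k, b in units]
    return [
        [max(range(len(maps)), key=lambda c: maps[c][y][x])
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]

# Hand-picked (not learnt) 3x3 averaging kernels: label 0 = "dark", 1 = "bright".
mean = [[1 / 9] * 3 for _ in range(3)]
neg_mean = [[-1 / 9] * 3 for _ in range(3)]
UNITS = [(neg_mean, 0.5), (mean, -0.5)]

# A bright 5x5 patch with one dark, noisy pixel in the middle.
image = [[1.0] * 5 for _ in range(5)]
image[2][2] = 0.0

labels = label_pixels(image, UNITS)
print(labels[2][2])  # 1: the bright neighbourhood outvotes the noisy pixel
```

A classifier that looked only at the intensity of the middle pixel would call it dark. Because each unit pools a 3x3 neighbourhood, the surrounding bright pixels decide the label, which is a toy version of the contextual disambiguation the abstract refers to.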




Authors

Anurag Kumar Mishra
RCET Bhilai, Chhattisgarh, India
Sipi Dubey
CSE, RCET Chhattisgarh, India

