Objectives: More information is available on the web today than at any point in history, and this wealth of information gives rise to greater challenges. Dealing with the resulting information overload requires more capable tools. Method of Analysis: The Internet is now spreading across both rural and urban areas, especially in India. Bilingual and multilingual websites are becoming much more common, and websites increasingly serve multiple purposes. Our main problem is handling multilingual web documents and ancient documents, because content extraction becomes difficult for such documents. This paper proposes a neural network approach with attribute generation to support content extraction from multilingual web documents. Findings: The results obtained are well defined, and a thorough analysis is presented. Novelty/Improvement: The method is versatile in its use of pixel maps, analytically stable because it takes matrix input, and is shown to be adaptable to different models.
Keywords
Attribute, Content Extraction, Mining, Multi-Lingual, Neural Network, Pattern, Pixel.