Disambiguating the Appearances of People by an Automatic Discovery Model of Personal Name


Affiliations
1 Department of Computer Science and Engineering, Kumaraguru College of Technology, Coimbatore, India
     



Abstract

Searching for information is one of the most common activities of internet users. Retrieving information about a person from a web search engine becomes difficult when that person is known by nicknames, aliases, or different names. Knowing the exact name of a person is therefore important in information retrieval, outlook analysis, personal name disambiguation, and relation extraction, and identifying the aliases of a name is a key part of that task. An approach based on automatically extracted lexical patterns is used to efficiently extract a large set of candidate aliases from snippets retrieved from a web search engine. The proposed method comprises two main components: pattern extraction, and alias (pseudonym) extraction and ranking. Using a seed list of name-alias pairs, the method first extracts lexical patterns that are frequently used to convey alias information on the web. The extracted patterns are then used to find candidate aliases for a given name. Ranking scores based on the hyperlink structure of the web and on page counts retrieved from a search engine are defined to identify the correct aliases among the extracted candidates. Reducing the generation of infrequent candidates speeds up the mining process manyfold, making the algorithm highly efficient.
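The two components described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`extract_patterns`, `extract_candidates`, `dice_score`, `rank_candidates`), the 40-character window between a name and its alias, the rule that a candidate alias is a run of up to three capitalised words, and the use of a Dice coefficient over page counts as one possible ranking score. Snippets and page counts are passed in as plain Python data rather than fetched from a search engine, and the hyperlink-based scores are not covered.

```python
import re
from collections import Counter

# Assumption: a candidate alias is a run of one to three capitalised words.
CAPITALISED_RUN = r"[A-Z][\w'-]*(?:\s+[A-Z][\w'-]*){0,2}"


def extract_patterns(seed_pairs, snippets, min_freq=2):
    """Collect the text that appears between a known name and its alias.

    Patterns seen fewer than min_freq times are dropped, which keeps
    infrequent patterns from generating candidates later on.
    """
    counts = Counter()
    for name, alias in seed_pairs:
        # Non-greedy window of at most 40 characters between name and alias.
        rx = re.compile(re.escape(name) + r"(.{1,40}?)" + re.escape(alias))
        for snippet in snippets:
            for match in rx.finditer(snippet):
                middle = match.group(1).strip()
                if middle:
                    counts[middle] += 1
    return [p for p, c in counts.most_common() if c >= min_freq]


def extract_candidates(name, patterns, snippets):
    """Apply each pattern after the given name and count what follows it."""
    counts = Counter()
    for pattern in patterns:
        rx = re.compile(
            re.escape(name) + r"\s*" + re.escape(pattern)
            + r"\s*(" + CAPITALISED_RUN + ")"
        )
        for snippet in snippets:
            for match in rx.finditer(snippet):
                counts[match.group(1)] += 1
    return counts


def dice_score(name, alias, page_counts):
    """Dice coefficient from page counts (an illustrative ranking score).

    page_counts maps a query string to its hit count, and a (name, alias)
    tuple to the hit count of the conjunctive query.
    """
    both = page_counts.get((name, alias), 0)
    total = page_counts.get(name, 0) + page_counts.get(alias, 0)
    return 2 * both / total if total else 0.0


def rank_candidates(name, candidates, page_counts):
    """Order candidates from most to least strongly associated with name."""
    return sorted(
        candidates,
        key=lambda alias: dice_score(name, alias, page_counts),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical seed pairs and snippets, made up for this example.
    seeds = [("William Gates", "Bill Gates"), ("Robert Zimmerman", "Bob Dylan")]
    snippets = [
        "William Gates, better known as Bill Gates, co-founded Microsoft.",
        "Robert Zimmerman, better known as Bob Dylan, is a songwriter.",
        "Hideki Matsui, better known as Godzilla, played for the Yankees.",
    ]
    patterns = extract_patterns(seeds, snippets)
    print(patterns)
    print(extract_candidates("Hideki Matsui", patterns, snippets))
```

The `min_freq` threshold is the sketch's stand-in for pruning infrequent candidates early: a pattern that occurs only once among the seed pairs is never applied to new names, so fewer spurious candidates reach the ranking step.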

Keywords

Information Retrieval, Web Mining, Information Mining, Text Analysis.

Authors

D. Deepika
Department of Computer Science and Engineering, Kumaraguru College of Technology, Coimbatore, India
P. Betty
Department of Computer Science and Engineering, Kumaraguru College of Technology, Coimbatore, India
