Record Matching in Web Databases Using Unsupervised Approach

Fouzia Sultana; Manjusha Kalekuri

Affiliations
1 Department of Computer Science & Engineering, Muffakam Jah College of Engineering & Technology, Banjara-Hills, Hyderabad-500034, India
2 Department of Computer Science & Engineering, Muffakam Jah College of Engineering & Technology, Banjara-Hills, Hyderabad-500034, India

Abstract

Record matching is the problem of combining information from multiple heterogeneous databases. One step of data integration is relating the records that appear in the different databases; specifically, determining which sets of records refer to the same real-world entities. Record matching thereby addresses the duplicate detection problem, which motivates the search for a suitable record matching technique. Most record matching methods are supervised and require the user to provide training data. Such methods are not applicable to the Web database scenario, where the records to match are query results generated dynamically. To overcome this problem, a new record matching method named Unsupervised Duplicate Detection (UDD) is proposed which, for a given query, can effectively identify and eliminate duplicates among the query result records of multiple Web databases. The idea of this paper is to adjust the weights of record fields when calculating the similarity between records. Two classifiers, namely a weighted component similarity summing classifier and a support vector machine classifier, are employed iteratively with UDD to identify duplicates in the query results from multiple Web databases.
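The weighted field similarity idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the token-based Jaccard measure, the fixed field weights, and the threshold are all assumptions chosen for illustration, and the iterative weight adjustment and the support vector machine step are omitted.

```python
from typing import Dict

Record = Dict[str, str]


def field_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word tokens (an assumed measure)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty fields are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def weighted_similarity(r1: Record, r2: Record, weights: Dict[str, float]) -> float:
    """Sum per-field similarities, each scaled by its (hypothetical) field weight."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one field weight must be non-zero")
    score = sum(
        w * field_similarity(r1.get(field, ""), r2.get(field, ""))
        for field, w in weights.items()
    )
    return score / total  # normalise so the result stays in [0, 1]


def is_duplicate(r1: Record, r2: Record, weights: Dict[str, float],
                 threshold: float = 0.8) -> bool:
    """Flag a record pair as duplicate when its weighted similarity reaches the threshold."""
    return weighted_similarity(r1, r2, weights) >= threshold
```

In this sketch the weights are supplied by the caller; in an unsupervised setting such as the one described, they would instead be re-estimated from the data between classification rounds.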

Keywords

Record Matching, Unsupervised, UDD, Query Results.

International Journal of Scientific Engineering and Technology