
Ontology Based Data Unit Similarity With Combining Tag And Value For Data Extraction And Alignment


 

Web database extraction retrieves relevant information from query result pages. The combining tag and value similarity (CTVS) approach extracts data from a query result page by first identifying and segmenting the query result records (QRRs) and then aligning the segmented QRRs into a table. However, the combined tag and value similarity measure does not handle non-contiguous QRRs. To overcome this problem, a novel method is proposed that displays the most distinct query records from the user's query result pages. In this method, distinct tags are first extracted from the result records to build a tag vector table, and the similarity between each pair of records is then computed using several similarity measures. Finally, the values of similar records are combined and aligned using ontology-based alignment.
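The tag vector and record similarity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sample records, the binary (0/1) tag vector encoding, and the helper names `distinct_tags`, `tag_vector_table`, and `jaccard` are all assumptions made for illustration. Only the Jaccard measure, which the keywords name, is shown; the ontology-based alignment step is not covered.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the set of distinct tag names seen in one record."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def distinct_tags(record_html):
    """Return the distinct tag names used in one record's HTML (hypothetical helper)."""
    collector = TagCollector()
    collector.feed(record_html)
    collector.close()
    return collector.tags


def tag_vector_table(records):
    """Return (sorted tag vocabulary, one 0/1 vector per record)."""
    tag_sets = [distinct_tags(r) for r in records]
    vocabulary = sorted(set().union(*tag_sets))
    vectors = [[1 if t in s else 0 for t in vocabulary] for s in tag_sets]
    return vocabulary, vectors


def jaccard(u, v):
    """Jaccard similarity of two binary vectors: |intersection| / |union|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 1.0


# Made-up records: two table rows with the same tag structure and one unrelated block.
records = [
    "<tr><td><a href='#'>Book A</a></td><td><b>$10</b></td></tr>",
    "<tr><td><a href='#'>Book B</a></td><td><b>$12</b></td></tr>",
    "<div><span>Sponsored link</span></div>",
]
vocabulary, vectors = tag_vector_table(records)
print(vocabulary)
print(jaccard(vectors[0], vectors[1]))  # records sharing every tag
print(jaccard(vectors[0], vectors[2]))  # records sharing no tags
```

Because every pair of records is compared, not just neighbouring ones, records with the same tag structure score as similar even when unrelated content sits between them on the page. That is one plausible reading of how a tag vector table could help with non-contiguous QRRs.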

                


Keywords

Ontology-Based CTVS, Web Database Extraction, Jaccard Similarity Measure, QRR Extraction, Distinct Tag and Value Extraction Using Ontology

