
A Proposal for the Semantic based Report Generation of Related HTML Documents


Affiliations
1 Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
2 Department of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
     



Most web pages today are published as HTML only. A great deal of data exists in this form, but there are few ways, if any, to generate reports from several distinct but related HTML pages. For example, information about individual people may be stored in HTML pages, with no way to produce a collective report on a particular item of information across all of them. Most of the time, this is done manually. This paper proposes a semantic-based approach for generating reports from HTML pages using semantic technologies such as OWL, RDF and SPARQL. As a first step, the required HTML pages are navigated and the information in their tables and lists is collected. The data is pre-processed and formatted as a CSV file to make further processing easier. OWL files are created for the corresponding domain and act as a dictionary for the application. The CSV contents are separated based on the OWL files and the rules. The separated contents are stored in RDF format, and SPARQL is used to query the RDF file. The proposed model can thus be a handy tool for management staff to generate reports readily, without spending much manual effort.
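The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, a plain dictionary (`VOCABULARY`) stands in for the OWL domain dictionary, a list of tuples stands in for the RDF store, and a simple filter stands in for a SPARQL query. The page contents, the `ex:` identifiers and all function names are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_to_csv(pages):
    """Extract two-column table rows from each page and format them as CSV."""
    out = io.StringIO()
    writer = csv.writer(out)
    for subject, html in pages.items():
        parser = TableExtractor()
        parser.feed(html)
        parser.close()
        for row in parser.rows:
            if len(row) == 2:  # keep only label/value rows
                writer.writerow([subject, *row])
    return out.getvalue()


# Hypothetical vocabulary standing in for the OWL dictionary: it maps the
# labels found on the pages to a single property name per concept.
VOCABULARY = {
    "name": "ex:name",
    "dept": "ex:department",
    "department": "ex:department",
}


def csv_to_triples(csv_text):
    """Separate CSV rows using the vocabulary; unknown labels are dropped."""
    triples = []
    for subject, label, value in csv.reader(io.StringIO(csv_text)):
        predicate = VOCABULARY.get(label.lower())
        if predicate is not None:
            triples.append((subject, predicate, value))
    return triples


def report(triples, predicate):
    """Stand-in for a SPARQL SELECT: one value per subject for a property."""
    return {s: o for s, p, o in triples if p == predicate}


# Hypothetical input: one HTML page per person.
PAGES = {
    "ex:alice": "<table><tr><th>Name</th><td>Alice</td></tr>"
                "<tr><th>Dept</th><td>IT</td></tr></table>",
    "ex:bob": "<table><tr><td>Name</td><td>Bob</td></tr>"
              "<tr><td>Department</td><td>CSE</td></tr>"
              "<tr><td>Hobby</td><td>Chess</td></tr></table>",
}

triples = csv_to_triples(html_to_csv(PAGES))
department_report = report(triples, "ex:department")
print(department_report)
```

In this sketch the two pages label the same field differently ("Dept" and "Department"), and the vocabulary lookup is what lets a single query cover both. With a real RDF library, the tuple list and the filter would be replaced by an RDF graph and a SPARQL query.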

Keywords

RDF, OWL, SPARQL, HTML Reports.

Authors

A. M. Abirami
Department of Information Technology, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
A. Askarunisa
Department of Computer Science and Engineering, Thiagarajar College of Engineering, Madurai, Tamil Nadu, India
