Rule-Based Wrappers for a Dataspace System
Mrityunjay Singh1, Niranjan Lal2, Shashank Yadav3
1Mrityunjay Singh, Department of Computer Science and Engineering, Bundelkhand Institute of Engineering and Technology, Jhansi, India.
2Niranjan Lal, Department of Computer Science and Engineering, Symbiosis Entrance Test, Mody University, Sikar, India.
3Shashank Yadav, Department of Information Technology, Shri Ramswaroop Memorial University, India Time Zone, National Capital Region, Campus, Modinagar, India
Manuscript received on 03 April 2019 | Revised Manuscript received on 10 April 2019 | Manuscript Published on 13 April 2019 | PP: 80-90 | Volume-8 Issue-6C April 2019 | Retrieval Number: F12230486C19/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open-access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Many organizations and individuals face the problem of efficiently managing large amounts of distributed, heterogeneous data. Dataspace technology addresses this problem. A dataspace system is a new abstraction for integrating heterogeneous data sources distributed across sites; it offers an on-demand data integration solution with less effort and provides integrated search and query capability over heterogeneous data sources. Such a system requires a set of automatic wrappers to extract the desired data from the data sources. A wrapper extracts the requested data from its data source and populates it into the dataspace in the desired format (e.g., the triple format). This work presents a set of rule-based wrappers for a dataspace system that operate in a "pay-as-you-go" manner. We divide our work into two parts: a discussion of a set of Transformation Rules (TRSs) and the design of a set of wrappers based on the TRSs. First, we explain how the TRSs work for the structured, semi-structured, and unstructured data models; then, we discuss the design of rule-based wrappers for a dataspace system based on the TRSs. We have successfully implemented the wrappers for some real and synthetic data sets. Some of our wrappers are semi-automatic because they require human involvement during data extraction and translation.
Keywords: Dataspace System; Triple Model; Rule-Based Wrappers; Triplet DS; Pay-as-you-go.
Scope of the Article: Computer Science and Its Applications
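To make the idea of a rule-based wrapper concrete, the following is a minimal sketch of how transformation rules might turn source records into triples. The abstract does not specify the actual TRSs, so the triple layout (subject, predicate, object), the two rules, and every name used here (`Triple`, `structured_rule`, `semi_structured_rule`, `wrapper`) are illustrative assumptions rather than the authors' design.

```python
from typing import Any, Iterator, NamedTuple


class Triple(NamedTuple):
    """Assumed triple layout: (subject, predicate, object)."""
    subject: str
    predicate: str
    obj: Any


def structured_rule(table: str, key: str, row: dict) -> Iterator[Triple]:
    """Hypothetical rule for a relational row: one triple per non-key, non-null column."""
    subject = f"{table}/{row[key]}"
    for column, value in row.items():
        if column != key and value is not None:
            yield Triple(subject, column, value)


def semi_structured_rule(subject: str, node: dict) -> Iterator[Triple]:
    """Hypothetical rule for a nested record: link each child node to its parent by path."""
    for name, value in node.items():
        if isinstance(value, dict):
            child = f"{subject}/{name}"
            yield Triple(subject, name, child)
            yield from semi_structured_rule(child, value)
        else:
            yield Triple(subject, name, value)


# The wrapper selects a rule by the data model of the source (an assumed dispatch scheme).
RULES = {
    "structured": structured_rule,
    "semi-structured": semi_structured_rule,
}


def wrapper(kind: str, *args: Any) -> list:
    """Apply the rule registered for this kind of source and collect the triples."""
    return list(RULES[kind](*args))


if __name__ == "__main__":
    print(wrapper("structured", "author", "id", {"id": 1, "name": "Singh"}))
    print(wrapper("semi-structured", "paper", {"title": "T", "venue": {"year": 2019}}))
```

A rule for unstructured sources is omitted from this sketch; the abstract notes that some wrappers need human involvement during extraction and translation.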