A              RDB2Onto – Relational Database Data to Ontology Individuals Mapping Tool

A.1          Basic Information

When building ontology-based information systems, it is frequently necessary to convert or replicate data from existing information systems (such as databases) so that the ontology-based systems can work with real data. Data in existing information systems are usually stored in a relational database.

A.1.1      Basic Terms

Relational database

Any relational database, such as MySQL, MSSQL, and others.

Ontology

A formalized model of the problem environment that a computer system can understand.

A.1.2      Method Description

The RDB2Onto tool maps relational database data to ontology individuals. It works with a domain ontology model and a relational database. The overall idea is to map the result of an SQL query onto an RDF/OWL XML template; the resulting OWL data are then sent to an ontology model. The tool is being implemented in Java, using the Jena or Sesame library for ontology manipulation and a MySQL database for testing, but any other relational database can be used through a JDBC connector.

A.1.3      Scenarios of Use

RDB2Onto can be used as a tool for replicating data from a relational database to an ontology or to a specified file (in RDF/OWL format).

A.1.4      External Links and Publications

§  Seleng, M., Laclavik, M., Balogh, Z., Hluchy, L.: RDB2Onto: Approach for creating semantic metadata from relational database data. In: INFORMATICS´2007, Proceedings of the Ninth International Conference on Informatics. Bratislava, Slovak Society for Applied Cybernetics and Informatics, 2007. ISBN 978-80-969243-7-0, pp. 113-116.

§  Log4J, Java-based logging utility, Apache Software Foundation. (http://logging.apache.org/log4j)

§  MySQL, http://www.mysql.com/

§  Jena, http://jena.sf.net/

§  Sesame, http://openrdf.org/

A.2          Integration Manual

RDB2Onto is developed in Java (Standard Edition 5) and distributed as a jar archive. Access to the functionality of the tool is provided through a Java interface. RDB2Onto is not developed as a stand-alone application. It can also be included in other applications and domains.

A.2.1      Dependencies

RDB2Onto uses the following libraries:

§  Log4J logging utility

§  Corporate memory consisting of three parts: file (Corporate memory libraries), relational database (Corporate memory libraries) and ontology (OntoCM libraries)

A.2.2      Installation

Deploying RDB2Onto into another application requires the following steps (a Java Integrated Development Environment and Apache Ant are required):

1.    Download all RDB2Onto files.

2.    To deploy and run RDB2Onto, you must set the environment variable $NAZOU_HOME.

3.    Execute the ant linux or ant windows command (depending on the operating system) to start the demo, or the ant dist command to create the jar file.

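4.    Change the directory to /etc/ and add the following line to the crontab file: */5 * * * *  $NAZOU_HOME/RDB2Onto/RDB2Onto.sh. The first part, */5 * * * *, consists of five space-separated schedule fields (minute, hour, day of month, month, day of week) and specifies how often the RDB2Onto tool runs to check for and process new data in the relational database. The value 5 means that RDB2Onto runs every 5 minutes; this value can be changed. Note that the system-wide /etc/crontab format additionally expects a user name between the schedule fields and the command.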

A.2.3      Configuration

The RDB2Onto tool is configured through property files. They define values such as the Sesame repository type, the file repository, the database repository, the user name and password, and the ontology namespaces. All property files are located in the $NAZOU_HOME/config and $NAZOU_HOME/CorporateMemory directories.
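As an illustration of how such property files might be read, the following minimal sketch uses the standard java.util.Properties class. The class name ConfigSketch and the keys db.url and db.username are hypothetical and chosen only for this example; the actual file names and keys are defined by the property files shipped with the tool.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper; not part of the RDB2Onto code base. */
final class ConfigSketch {

    private ConfigSketch() { }

    /** Reads key=value pairs in the standard Java properties format. */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read properties", e);
        }
        return props;
    }

    /** Returns the trimmed value of a mandatory key or fails with a clear message. */
    static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // The keys below are assumptions made for illustration only.
        String text = "db.url=jdbc:mysql://localhost/nazou\n"
                    + "db.username=nazou\n";
        Properties props = load(new StringReader(text));
        System.out.println(require(props, "db.url"));
        System.out.println(require(props, "db.username"));
    }
}
```

Failing early on a missing key, as require does here, makes a misconfigured installation easier to diagnose than a later connection error would.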

A.2.4      User Guide

RDB2Onto can be used as a tool for replicating data from a relational database to an ontology or to a specified file (in RDF/OWL format). The tool can be executed using the following command:

java nazou.RDB2Onto.RDB2Onto schema_file add_to_ontology [output_file] [sql_query]

For the NAZOU project there is an executable shell file, RDB2Onto.sh, which must be executed at specified intervals (see step 4 in A.2.2). The RDB2Onto.sh file contains the following line:

java nazou.RDB2Onto.RDB2Onto file/CM_Source.tpl true RDB2Onto.rdf

This means that the tool takes the CM_Source.tpl template file and fills it in with data retrieved by an SQL query (which is created inside the tool code). The filled-in template is then added to the ontology and written to the RDB2Onto.rdf file for further processing. Users who want to use the tool in a different way can rewrite the RDB2Onto.sh file and use their own template files and SQL queries (see the Developer Manual, A.3).
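The argument order of the command line above can be illustrated with a small sketch. The class RunOptions is hypothetical and is not taken from the tool's source code; in particular, treating add_to_ontology as the literal true or false is an assumption based on the RDB2Onto.sh example.

```java
/** Hypothetical holder for the documented command-line arguments. */
final class RunOptions {
    final String schemaFile;      // template file, e.g. file/CM_Source.tpl
    final boolean addToOntology;  // second argument, e.g. true
    final String outputFile;      // optional; null when omitted
    final String sqlQuery;        // optional; null when omitted

    private RunOptions(String schemaFile, boolean addToOntology,
                       String outputFile, String sqlQuery) {
        this.schemaFile = schemaFile;
        this.addToOntology = addToOntology;
        this.outputFile = outputFile;
        this.sqlQuery = sqlQuery;
    }

    /** Maps positional arguments onto named fields; rejects wrong argument counts. */
    static RunOptions parse(String[] args) {
        if (args.length < 2 || args.length > 4) {
            throw new IllegalArgumentException(
                "usage: schema_file add_to_ontology [output_file] [sql_query]");
        }
        // Assumption: add_to_ontology is given as "true" or "false".
        boolean add = Boolean.parseBoolean(args[1]);
        String output = args.length > 2 ? args[2] : null;
        String query = args.length > 3 ? args[3] : null;
        return new RunOptions(args[0], add, output, query);
    }

    public static void main(String[] args) {
        // Mirrors the line contained in RDB2Onto.sh.
        RunOptions o = parse(new String[] {"file/CM_Source.tpl", "true", "RDB2Onto.rdf"});
        System.out.println(o.schemaFile + " " + o.addToOntology + " "
                + o.outputFile + " " + o.sqlQuery);
    }
}
```

With the three arguments used in RDB2Onto.sh, the optional sql_query stays unset, which corresponds to the case where the query is created inside the tool code.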

A.3          Developer Manual

A.3.1      Tool Structure

The RDB2Onto tool consists of the following packages:

The architecture of the tool is shown in Figure 1.

Figure 1 RDB2Onto Architecture

A.3.2      Method Implementation

The core method of the RDB2Onto tool works in three basic steps, which can be explained using the following example. The relational database contains a document table with the following fields: id, url, original_doc_path, converted_doc_path, download_date, lang. For this example, the SQL query looks as follows:

SELECT
    id, url, original_doc_path, converted_doc_path,
    download_date, IF(lang = 'sk', 'Slovak', 'English') AS lang
FROM
    document

The SQL query is executed and, for each row of the query result, the XML-based OWL template is filled in. Each element enclosed in {} brackets is replaced with the corresponding value from the given row of the query result, and the composed OWL data are stored in the ontology model.

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:jo="http://nazou.fiit.stuba.sk/nazou/ontologies/v0.6.17/offer-job#"
    xmlns:inst="http://nazou.fiit.stuba.sk/nazou/ontologies/v0.6.17/offer-job-inst#"
    xmlns:c="http://nazou.fiit.stuba.sk/nazou/ontologies/v0.6.17/classification#"
    xmlns:ofr="http://nazou.fiit.stuba.sk/nazou/ontologies/v0.6.17/offer#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
    <rdf:Description rdf:about="offer-job-inst:jo_{id}">
        <rdf:type rdf:resource="offer-job:JobOffer"/>
        <ofr:hasSource rdf:resource="offer-job-inst:source_{id}"/>
        <ofr:hasOfferCreator rdf:resource="offer-job-inst:OfferCreator_NAZOU_RDB2Onto"/>
    </rdf:Description>
    <rdf:Description rdf:about="offer-job-inst:source_{id}">
        <rdf:type rdf:resource="offer:OfferSource"/>
        <ofr:acquisitionDate>{download_date}</ofr:acquisitionDate>
        <ofr:originalURI>{url}</ofr:originalURI>
        <ofr:localURI>{original_doc_path}</ofr:localURI>
        <ofr:localConvertedURI>{converted_doc_path}</ofr:localConvertedURI>
        <ofr:language rdf:resource="region:{lang}"/>
    </rdf:Description>
</rdf:RDF>

The composed OWL is then stored in the selected ontology.
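The placeholder substitution described above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the tool's actual implementation: the class TemplateFiller and its methods are hypothetical, each database row is assumed to be available as a map from column name to string value, and XML-escaping the values is an added precaution that the original description does not mention.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of the RDB2Onto code base. */
final class TemplateFiller {

    // Matches placeholders such as {id} or {download_date}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    private TemplateFiller() { }

    /** Escapes characters that are special in XML text and attribute values. */
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&apos;");
    }

    /**
     * Replaces every {column} placeholder with the escaped value of that column.
     * Placeholders without a matching column are left unchanged (an assumption).
     */
    static String fill(String template, Map<String, String> row) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = row.get(m.group(1));
            String replacement = (value == null) ? m.group() : escapeXml(value);
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // One row of the example query, with made-up values.
        Map<String, String> row = new LinkedHashMap<String, String>();
        row.put("id", "42");
        row.put("url", "http://example.org/job?a=1&b=2");
        String template = "<ofr:originalURI>{url}</ofr:originalURI> source_{id}";
        System.out.println(fill(template, row));
    }
}
```

Escaping matters because values such as URLs often contain the & character, which would otherwise make the filled-in template invalid XML.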

A.3.3      Enhancements and Optimizing

The core algorithm could be improved further, mainly in terms of speed.

A.4          Manual for Adaptation to Other Domains

The RDB2Onto tool can be easily extended or mapped to other domains and applications by specifying custom SQL queries and templates (OWL/RDF files).

A.4.1      Configuring to Other Domain

When using RDB2Onto in another domain, the following modifications are necessary:

§  Change the application or domain ontology

§  Change or modify the template file used

§  Change or modify the SQL query used

§  Change or modify all property files

A.4.2      Dependencies

§  Log4J logging utility

§  Corporate memory consisting of three parts: file (Corporate memory libraries), relational database (Corporate memory libraries) and ontology (OntoCM libraries)[1]



[1] All libraries except the relational part of Corporate Memory can be omitted.