A              OntoCM – Interface to the Ontology Corporate Memory

A.1          Basic Information

OntoCM is an interface to the ontology part of the Corporate memory. It serves as a library (API) through which tools access the domain ontology.

A.1.1      Basic Terms

API

Application Programming Interface.

(Information) Domain

The information area in which a tool searches for entities of a given type. We use the domain of job offers as an experimental information space.

Corporate memory

In this context, Corporate memory means all persistent data shared between tools in the NAZOU project. It consists of three parts: ontology, files, and relational data.

Ontology

A description of the data of an information domain, together with the data itself, in the form of a taxonomy of concepts and the relations between them.

RDF graph

A graph representing an ontology. Each node is a concept or one of its individual instances, and edges represent properties of concepts or individuals.

Ontology repository provider

A database system which serves as an ontology repository, e.g. Jena or Sesame.

A.1.2      Method Description

OntoCM is a set of Java interfaces and classes which provide an interface to the ontological memory. In general, it is an API independent of the ontology repository provider (Pázman, 2006). It also contains an implementation of the interface for the Sesame repository provider.

A.1.3      Scenarios of Use

OntoCM is used to connect to an ontology and to select and create ontological objects.

The connection is built in the following manner:

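// Create the provider-specific factory named in the configuration file.
IFactory factory = OntoCreator.getMemoryFactory(PROPERTY_FILE);

// Obtain the object representing the ontology repository.
IOntoMemory memory = factory.getMemoryInstance();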

PROPERTY_FILE is the path to the configuration file (see section A.2.3, Configuration).

Data is selected using standard ontology query languages. For Sesame the most convenient one is SeRQL. Its usage is shown in the following example:

// Select all statements whose subject matches *test*.
IResultRows result = memory.query(
    "SELECT * FROM {x} p {y} WHERE x LIKE \"*test*\"",
    IOntoMemory.Lang.SERQL);

// Print each result row on one line, with every value quoted.
for (IResultRow row : result) {
    System.out.print("Row:");
    for (IResultData data : row) {
        System.out.print(" \"" + data + "\"");
    }
    System.out.println();
}

The method query() returns a list of rows, each of which is a list of data items. The result rows contain the data that satisfy the query.
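The nested loop above can be wrapped in a small reusable helper. The following is only a sketch: it assumes that the result types can be iterated with for-each (as the loop above suggests) and uses plain Java lists as stand-ins for IResultRows and IResultRow. The class name ResultShapeSketch and the method formatRows are hypothetical and not part of OntoCM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ResultShapeSketch {

    // Formats every row as one line and quotes each value in it.
    // Accepts any "iterable of iterables"; plain lists stand in for
    // the OntoCM result types here (an assumption of this sketch).
    static List<String> formatRows(Iterable<? extends Iterable<?>> rows) {
        List<String> lines = new ArrayList<>();
        for (Iterable<?> row : rows) {
            StringBuilder line = new StringBuilder("Row:");
            for (Object data : row) {
                line.append(" \"").append(data).append('"');
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up stand-in data: two rows of (subject, predicate, object).
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("ex:test1", "rdf:type", "ex:Offer"),
            Arrays.asList("ex:test2", "rdfs:label", "Test offer"));
        for (String line : formatRows(rows)) {
            System.out.println(line);
        }
    }
}
```

Returning the lines instead of printing them directly makes the formatting easy to reuse in logs or unit tests.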

The creation of ontological objects is supported by the following methods of the IFactory interface:

•  IOntoMemory getMemoryInstance() – returns an object representing an ontological repository.

•  IIndividual getNewIndividualInstance() – creates an object representing an ontological subject (entity) with its predicates and objects.

•  IStatement getNewStatementInstance() – creates an object which represents an ontological subject–predicate–object triple.

•  ILiteral getNewLiteralInstance() – creates an ontological literal (a simple value, such as a string or a number).

•  IGraph getNewGraphInstance() – creates an empty RDF graph.

A more detailed description of the classes and interfaces can be found in the Javadoc documentation of OntoCM.

A.1.4      External Links and Publications

The project where the method was developed is described in:

Návrat, P. – Bieliková, M. – Rozinajová, V. (2005). Methods and Tools for Acquiring and Presenting Information and Knowledge in the Web. In: CompSysTech 2005, Rachev, B., Smrikarov, A. (Eds.), Varna, Bulgaria, June 2005. pp. IIIB.7.1–IIIB.7.6.

The principles on which OntoCM is built are described in more detail in:

Pázman, R. (2006). The Construction of an Independent Interface to the Ontological Organizational Memory. (In Slovak: Tvorba nezávislého rozhrania pre ontologickú organizačnú pamäť). In: Proceedings of the 1st Workshop on Intelligent and Knowledge Oriented Technologies (WIKT'06), Laclavík, M., Budinská, I., Hluchý, L. (Eds.), November 28–29, 2006, Bratislava, Slovakia.

A.2          Integration Manual

A.2.1      Dependencies

OntoCM currently depends on the Sesame libraries, ITG-technology, and log4j.

A.2.2      Installation

Installation of OntoCM consists of the following steps:

1.    Copy the file Nazou-OntoCM-*.jar from the OntoCM distribution into the directory WEB-INF/lib of a web application.

2.    Create a file with the interface settings, or add these settings to the tools’ common settings (Nazou-Commons.properties; see the documentation of ITG-technology). The basic parameters required in the configuration file are described in section A.2.3, Configuration.

The file should be placed in the configuration directory defined by ITG-technology. For more information about using the common configuration, see the documentation of ITG-technology.

To use OntoCM when developing a tool, the JAR library must be referenced in the classpath of the tool's project.

A.2.3      Configuration

OntoCM reads its configuration from a properties file using ITG-technology. The properties file must specify the following property:

FACTORY_NAME

Full class name of a class implementing the IFactory interface (for example: sk.nazou.cm.impl.sesame.FactorySesame).

To use the implementation of the ontology interface for the Sesame repository provider, the following additional properties have to be specified:

REPOSITORY_TYPE

Specifies the type of the repository (local or server).

SESAME_SERVER

URL of the Sesame server being used.

SESAME_USER

Login name of the Sesame user used to connect to Sesame.

SESAME_PASSWORD

Password of the Sesame user used to connect to Sesame.

SESAME_REPOSITORY_ID

Identification of the Sesame repository to be used.

SESAME_INFERENCING

Set to 1 to enable inferencing for a local repository, or to 0 to disable it. For a server repository it is recommended to set the value correctly as well.

REPOSITORY_LOCAL_FILE_STORE

Path to the local file in which the local repository is stored.

REPOSITORY_AUTOLOAD

Preloads data from files when MemorySesame is initialized. Any number of files may be listed, specified by their file paths and delimited by ";".

LOG_WHOLE_QUERIES

If set to 1, complete queries are logged. This produces a lot of data, so it is recommended to use it only for debugging purposes.

NAMESPACES_BASE

URI-prefix used by the default namespace (e.g.: http://nazou.fiit.stuba.sk/nazou/ontologies/v0.6.16/).

NS_SHORTCUT_*

Shortcut used for a namespace from NS_URI_* (* stands for the name of the namespace and it is the same identifier in both NS_SHORTCUT_* and NS_URI_*).

NS_URI_*

Local part or full namespace URI. If it is a local part of a URI, the prefix specified in NAMESPACES_BASE is used. Any number of NS_SHORTCUT_* / NS_URI_* pairs can be used.

Example of using namespaces:

NAMESPACES_BASE=http://my.domain/my/ontologies/
NS_SHORTCUT_PIGS=pg
NS_URI_PIGS=pigs#
NS_SHORTCUT_COWS=cw
NS_URI_COWS=cows#
NS_SHORTCUT_OWL=owl
NS_URI_OWL=http://www.w3.org/2002/07/owl#
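How such settings could be turned into a shortcut-to-URI mapping is sketched below. This is an illustration only, not OntoCM's actual code: the class NamespaceConfigSketch, the method resolveNamespaces, and the rule that a value containing "://" is treated as a full URI are assumptions made for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

final class NamespaceConfigSketch {

    static final String SHORTCUT_PREFIX = "NS_SHORTCUT_";
    static final String URI_PREFIX = "NS_URI_";

    // Returns a map from shortcut to full namespace URI for every
    // NS_SHORTCUT_ entry that has a matching NS_URI_ entry.
    static Map<String, String> resolveNamespaces(Properties config) {
        String base = config.getProperty("NAMESPACES_BASE", "");
        Map<String, String> namespaces = new TreeMap<>();
        for (String key : config.stringPropertyNames()) {
            if (!key.startsWith(SHORTCUT_PREFIX)) {
                continue;
            }
            String name = key.substring(SHORTCUT_PREFIX.length());
            String uri = config.getProperty(URI_PREFIX + name);
            if (uri == null) {
                continue; // shortcut without a matching NS_URI_ entry
            }
            // Assumption of this sketch: a value containing "://" is a
            // full URI; anything else is a local part appended to the base.
            boolean isFullUri = uri.contains("://");
            namespaces.put(config.getProperty(key), isFullUri ? uri : base + uri);
        }
        return namespaces;
    }

    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        config.load(new StringReader(String.join("\n",
            "NAMESPACES_BASE=http://my.domain/my/ontologies/",
            "NS_SHORTCUT_PIGS=pg",
            "NS_URI_PIGS=pigs#",
            "NS_SHORTCUT_OWL=owl",
            "NS_URI_OWL=http://www.w3.org/2002/07/owl#")));
        resolveNamespaces(config).forEach(
            (shortcut, uri) -> System.out.println(shortcut + " -> " + uri));
    }
}
```

With these sample values, the shortcut owl maps to http://www.w3.org/2002/07/owl# unchanged, while pg maps to http://my.domain/my/ontologies/pigs#, i.e. the local part pigs# prefixed with NAMESPACES_BASE.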

A.2.4      Integration Guide

OntoCM is a library which serves as an interface to an ontology. There is no need to integrate it with any other libraries or tools.

A.3          Development Manual

A.3.1      Tool Structure

OntoCM consists of a common package sk.nazou.cm and the packages sk.nazou.cm.impl.* for provider-specific implementations of the interfaces declared in the common package. OntoCM currently contains an implementation of the common package interfaces for the Sesame ontology repository provider.

A.3.2      Method Implementation

The common package sk.nazou.cm contains one main class and several Java interfaces. The class OntoCreator serves as an entry point to OntoCM. Using the configuration file, it creates a factory object for the selected ontology repository provider (using the method getMemoryFactory).
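One common way to create a factory from a configured class name is Java reflection. The sketch below illustrates that idea under stated assumptions: it is not the actual code of OntoCreator, and the types Factory and DemoFactory are hypothetical stand-ins for IFactory and a provider implementation such as FactorySesame. Only the property name FACTORY_NAME is taken from the Configuration section.

```java
import java.util.Properties;

final class FactoryLoadingSketch {

    // Hypothetical stand-in for the IFactory interface.
    interface Factory {
        String providerName();
    }

    // Hypothetical provider implementation with a public no-argument
    // constructor, playing the role of a class such as FactorySesame.
    public static final class DemoFactory implements Factory {
        public DemoFactory() {
        }

        @Override
        public String providerName() {
            return "demo";
        }
    }

    // Reads FACTORY_NAME and instantiates the named class reflectively.
    static Factory createFactory(Properties config) {
        String className = config.getProperty("FACTORY_NAME");
        if (className == null) {
            throw new IllegalArgumentException("FACTORY_NAME is not set");
        }
        try {
            Class<?> type = Class.forName(className);
            // asSubclass fails fast if the class does not implement Factory.
            return type.asSubclass(Factory.class)
                       .getDeclaredConstructor()
                       .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                "Cannot create factory " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("FACTORY_NAME", DemoFactory.class.getName());
        System.out.println(createFactory(config).providerName());
    }
}
```

Because the implementation class is named only in the configuration, client code written against the interface does not need to change when another provider is selected.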

The IFactory interface allows the creation of basic ontology entities, such as literals (ILiteral interface), RDF graphs (IGraph), individuals with their properties (IIndividual) and RDF statements (IStatement).

It also provides the general object for a repository connection, used mainly for selecting data from an ontology repository. The object is of type IOntoMemory, which allows obtaining data from an ontology using a query language (query and queryGraphs methods), adding data to the ontology (addDataFromFile, addDataFromReader, addGraph), removing data from the ontology (removeStatements, removeGraph), and some other operations.

The Sesame implementations of these interfaces are placed in the package sk.nazou.cm.impl.sesame. The following table shows how they correspond to the interfaces of the common package:

Class (sk.nazou.cm.impl.sesame)    Interface (sk.nazou.cm)
FactorySesame                      IFactory
MemorySesame                       IOntoMemory
LiteralSesame                      ILiteral
StatementSesame                    IStatement
IndividualSesame                   IIndividual
GraphSesame                        IGraph
ResultRowsSesame                   IResultRows
ResultRowSesame                    IResultRow
ResultDataSesame                   IResultData

A more detailed description of these Java classes and interfaces can be found in the Javadoc documentation of OntoCM.

A.3.3      Enhancements and Optimizing

OntoCM is implemented in a way which should not decrease the performance of an ontology provider. Some of the methods offered by the native Sesame API are not supported yet; this leaves room for improving OntoCM.

OntoCM is currently implemented only for the Sesame ontology provider. It would be highly desirable to support further providers (e.g. Jena).

A.4          Manual for Adaptation to Other Domains

OntoCM is completely domain independent, so it is directly usable in any other information domain. Some domain specific information can be supplied in its configuration file (e.g. domain ontology prefixes), but the library itself need not be changed when moving to another domain.