Method for Values Normalization (tool VALINOR)

Automatic values normalization.

Institution: Softec, Ltd.
Technologies used: SMODELS, SWI-Prolog, Sesame
Inputs: 1. Values of the properties to be normalized.
2. Description of the relations between different units.
Outputs: Normalized property values
Documentation: HTML, doc, JavaDoc (tool, wrapper)
Distribution packages: zip
Video: demonstration video

Addressed Problems

Information acquired from different web pages can express values of the same property in different units. In the job-offer domain this applies, for example, to salary, which may be given in USD per hour in one offer and in GBP per year in another. Tools working with such data cannot use it effectively, for example when comparing the salaries of different job offers.

Description

The method for values normalization converts the values of ontological properties that are expressed in different units, so that they become comparable with each other. The normalization is defined declaratively, which makes it easy to adapt to changes.

The method is based on three principles:

Any data stored in a database, and especially in an ontology, can be viewed as logic statements about the entities the data describes. Inferring new statements from existing (known) statements is the focus of the logic programming (LP) approach, which we chose as the platform for values normalization. Logic programming is universal enough to implement value conversions.
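The idea of treating data as statements and deriving new statements from them can be illustrated with a minimal sketch. It uses a naive forward-chaining loop in Python, which is not how SWI-Prolog or SMODELS evaluate rules internally and is not taken from VALINOR. The offer identifiers, the salary figures and the assumed 2000 working hours per year are all hypothetical.

```python
# Facts are (subject, property, amount, unit) statements.
# All identifiers and figures below are hypothetical.
FACTS = {
    ("offer1", "salary", 20.0, "USD/hour"),
    ("offer2", "salary", 30000.0, "USD/year"),
}

HOURS_PER_YEAR = 2000.0  # assumed working hours per year


def hourly_to_yearly(known):
    """Rule: a salary stated per hour implies the same salary stated per year."""
    for subject, prop, amount, unit in known:
        if prop == "salary" and unit == "USD/hour":
            yield (subject, prop, amount * HOURS_PER_YEAR, "USD/year")


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new statement can be derived."""
    known = set(facts)
    while True:
        new = {fact for rule in rules for fact in rule(known)} - known
        if not new:
            return known
        known |= new


closure = forward_chain(FACTS, [hourly_to_yearly])
```

Here closure contains the two original statements plus one derived statement giving the salary of offer1 in USD per year.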

The method is based on the following groups of logic rules:

The method is described in more detail in (Pázman, 2007).

The method is implemented as the tool VALINOR, which reads the property values to be normalized, converts them to the specified normal unit, and finally writes them back as new offer properties.
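The read, convert and write-back steps can be sketched as follows. This is an illustration under stated assumptions, not VALINOR's code: the table TO_NORMAL, the property name salaryNormalized, the offer records and the conversion factors (including the exchange rate) are hypothetical, and a real deployment would read from and write to the ontology store rather than Python dictionaries.

```python
# Hypothetical unit relations: the factor converts one unit of the key
# into the normal unit. The figures are illustrative, not real rates.
NORMAL_UNIT = "USD/year"
TO_NORMAL = {
    "USD/year": 1.0,
    "USD/hour": 2000.0,  # assumes 2000 working hours per year
    "GBP/year": 1.25,    # assumed exchange rate
}


def normalize_offers(offers):
    """Return copies of the offers with an added normalized salary property."""
    result = []
    for offer in offers:
        amount, unit = offer["salary"]
        enriched = dict(offer)  # the original value is kept untouched
        if unit in TO_NORMAL:
            enriched["salaryNormalized"] = (amount * TO_NORMAL[unit], NORMAL_UNIT)
        result.append(enriched)
    return result


offers = [
    {"id": "offer1", "salary": (20.0, "USD/hour")},
    {"id": "offer2", "salary": (40000.0, "GBP/year")},
]
normalized = normalize_offers(offers)
```

Keeping the unit relations in a table separate from the conversion function mirrors the declarative intent: supporting a new unit means adding one entry, not changing the procedure. Offers with an unknown unit are passed through without a normalized property.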

Two LP approaches are used in VALINOR: Prolog and Answer Set Programming (ASP). One prototype was created for each approach. The Prolog prototype is built with SWI-Prolog; the ASP prototype uses the SMODELS solver.

The normalization logic rules are similar in the two prototypes, but not identical. Prolog has much stronger expressive power than ASP. The most visible difference is in the rules for converting compound units, which use function symbols. In the ASP approach we used so-called stratified (leveled) predicates so that a ground program (required by ASP) can be generated for rules with function symbols. In the Prolog approach we use simpler and more intuitive rules for compound units.
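The difference between the two styles can be illustrated with a small Python sketch; it is neither the Prolog nor the ASP rule set of the prototypes. A compound unit is modeled as a nested term ("per", numerator, denominator), which plays the role of a function symbol. The function to_normal converts such a term recursively, in the spirit of the Prolog rules, while units_up_to_level enumerates the terms up to a fixed nesting level, which is one way to read how leveled predicates keep the ground program finite. The BASE table and its factors are hypothetical.

```python
from fractions import Fraction

# Hypothetical base-unit relations: unit -> (normal unit, conversion factor).
BASE = {
    "USD": ("USD", Fraction(1)),
    "GBP": ("USD", Fraction(5, 4)),       # assumed exchange rate
    "year": ("year", Fraction(1)),
    "hour": ("year", Fraction(1, 2000)),  # assumes 2000 working hours per year
}


def to_normal(unit):
    """Return (normal_unit, factor) for a base unit or a ("per", a, b) term."""
    if isinstance(unit, str):
        return BASE[unit]
    _, numerator, denominator = unit
    num_unit, num_factor = to_normal(numerator)
    den_unit, den_factor = to_normal(denominator)
    return ("per", num_unit, den_unit), num_factor / den_factor


def units_up_to_level(level):
    """Enumerate unit terms up to a nesting level (a bounded, finite grounding)."""
    terms = set(BASE)
    for _ in range(level):
        terms |= {("per", a, b) for a in terms for b in terms}
    return terms


normal_unit, factor = to_normal(("per", "GBP", "hour"))
```

With these assumed factors, one GBP per hour corresponds to 2500 USD per year. The recursive version handles any nesting depth, whereas the enumeration has to fix the depth in advance and grows quickly with each level: 4 base units already yield 20 terms at level 1.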

References

  1. Pázman, R.: Values Normalization with Logic Programming. In: Návrat, P., Bartoš, P., Bieliková, M., Hluchý, L., Vojtáš, P. (eds.): Tools for Acquisition, Organisation and Presenting of Information and Knowledge (2). Proceedings in Informatics and Information Technologies, Research Project Workshop, Horský hotel Polana, Slovakia, September 22-23, 2007, pp. 134-141.