A               UPreA - User Preference Acquirer

A.1          Basic Information

Methods for finding the best objects use various representations of user preferences. Preferences can be represented as sets or intervals of preferred attribute values. Instead of classical crisp sets, it is also possible to use fuzzy sets whose membership function has range [0, 1].

However, these methods are not concerned with acquiring user preferences; the preferences are either fixed or expected as input. The UPreA tool addresses the problem of acquiring them. It provides the user with initial fuzzy sets which can be used in the process of finding the best objects.

A.1.1      Basic Terms

Local preference

Relates to one attribute, e.g. salary. It specifies which attribute values the user prefers.

Global preference

Relates to all attributes. It provides a way to combine the relevances that an object obtains from the user’s local preferences into the overall relevance of the object.

Fuzzy set

A local preference that maps every attribute value to a number from [0, 1]. A higher number means higher relevance of the value. Such a mapping can be viewed as the membership function of a fuzzy set.

Aggregation function

A global preference, usually a weighted average. An aggregation function must be monotone in all its arguments.
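The terms fuzzy set and aggregation function can be illustrated with a minimal Java sketch. It is not UPreA code: the class PreferenceSketch, its method names, the linear shape of the ascending fuzzy set and the salary bounds are all assumptions made for illustration.

```java
// Hypothetical illustration; these names are not part of the UPreA API.
final class PreferenceSketch {

    // Ascending fuzzy set: 0 up to 'low', 1 from 'high', linear in between.
    static double ascending(double value, double low, double high) {
        if (value <= low) return 0.0;
        if (value >= high) return 1.0;
        return (value - low) / (high - low);
    }

    // Weighted average of local relevances. With non-negative weights it is
    // monotone: raising any single relevance never lowers the result.
    static double weightedAverage(double[] relevances, double[] weights) {
        if (relevances.length != weights.length) {
            throw new IllegalArgumentException("one weight per relevance expected");
        }
        double sum = 0.0;
        double weightSum = 0.0;
        for (int i = 0; i < relevances.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            sum += relevances[i] * weights[i];
            weightSum += weights[i];
        }
        if (weightSum == 0.0) {
            throw new IllegalArgumentException("at least one positive weight expected");
        }
        return sum / weightSum;
    }

    public static void main(String[] args) {
        double salary = ascending(1500, 1000, 2000); // halfway between the assumed bounds
        double education = 1.0;                       // a fully preferred value
        System.out.println(weightedAverage(new double[] {salary, education},
                                           new double[] {3, 1}));
    }
}
```

Rejecting negative weights is what keeps the weighted average monotone in every argument, which is the property required of an aggregation function.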

Rules

Another type of global preference. If an object fulfills the expressions on the right side of the rule (body), then the overall value of this object is at least the value on the left side of the rule (head).
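The way a rule bounds the overall value from below can be sketched as follows. The form of the body is an assumption: here every body expression is a threshold on the local relevance of one attribute, and the overall lower bound is the largest head among the rules that fire. The class RuleSketch and the example rules are hypothetical and do not reproduce the rule format used by IGAP.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the real rule format of IGAP may differ.
final class RuleSketch {

    static final class Rule {
        final double head;               // guaranteed overall value
        final Map<String, Double> body;  // assumed form: attribute -> minimal local relevance

        Rule(double head, Map<String, Double> body) {
            this.head = head;
            this.body = body;
        }

        // The rule fires when every body condition is fulfilled.
        boolean fires(Map<String, Double> localRelevances) {
            for (Map.Entry<String, Double> condition : body.entrySet()) {
                Double relevance = localRelevances.get(condition.getKey());
                if (relevance == null || relevance < condition.getValue()) {
                    return false;
                }
            }
            return true;
        }
    }

    // Largest head among the fired rules; 0 if no rule fires.
    static double lowerBound(List<Rule> rules, Map<String, Double> localRelevances) {
        double bound = 0.0;
        for (Rule rule : rules) {
            if (rule.fires(localRelevances)) {
                bound = Math.max(bound, rule.head);
            }
        }
        return bound;
    }

    static Map<String, Double> relevances(double salary, double educationLevel) {
        Map<String, Double> m = new HashMap<String, Double>();
        m.put("Salary", salary);
        m.put("EducationLevel", educationLevel);
        return m;
    }

    // Two made-up rules: a strict one with head 0.8 and a weak one with head 0.4.
    static List<Rule> exampleRules() {
        List<Rule> rules = new ArrayList<Rule>();
        rules.add(new Rule(0.8, relevances(0.7, 0.5)));
        Map<String, Double> salaryOnly = new HashMap<String, Double>();
        salaryOnly.put("Salary", 0.3);
        rules.add(new Rule(0.4, salaryOnly));
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(lowerBound(exampleRules(), relevances(0.75, 0.6)));
    }
}
```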

A.1.2      Method Description

The UPreA tool deals with both local and global preferences, which together with some personal user information form the user profile. It provides various methods for working with preferences. These methods can be divided into the following groups:

§  Working with existing user instances: select, insert, delete or update user instances in the Sesame ontology. Select methods are fast; insert and update methods are slightly slower and should be called only after registration of a new user or after major changes to his/her profile.

§  Acquiring new user instances.

§  Providing a facade for user-dependent search, which delegates methods with the user’s preferences to the Top-k and IGAP tools.

The second group (acquiring new user instances) is the most important functionality, so we describe it in more detail.

Figure 1 shows three alternative ways of obtaining user preferences. The first is direct input from the user via a graphical interface. This is the most desirable option because it yields exact preferences. The second is to generate the user profile from users with similar personal data. The last is to use default preferences created by a domain expert.

Figure 1. Acquiring user preferences.

A.1.3      Scenarios of Use

UPreA can be used in the following scenarios:

§  Working with user ontology.

§  User-dependent searching for best objects.

§  Some tool needs user preferences as input, namely fuzzy sets, aggregation functions or rules.

UPreA should not be used in the following cases:

§  The user model differs from the fuzzy-set-based model described here.

§  The user ontology is different.

A.1.4      External Links and Publications

§  Gurský, P., Horváth, T., Novotný, R., Vaneková, V., Vojtáš, P.: UPRE: User Preference Based Search System. 2006 IEEE/WIC/ACM International Conference on Web Intelligence (WI'06), pages 841-844, IEEE Computer Society 2006, ISBN 0-7695-2747-7.

§  Vaneková, V.: Reprezentácia a spracovanie používateľských preferencií v RDF. MIS 2007.

A.2          Integration Manual

This section describes the integration of UPreA into another application. UPreA is developed in Java SE 5, and its classes and methods can be imported from a Java archive.

A.2.1      Dependencies

UPreA uses the following Java archives, which must be included in the build path:

§  commons-configuration-1.3.jar – generic configuration interface which enables an application to read configuration data

§  Nazou-ITG-2.1.jar – integration technology; only class NazouConfiguration is used

§  log4j-1.2.12.jar – logging utility

§  spring.jar – only JDBC template is used

§  servlet-api.jar – packages for using HTTP servlets

§  sesame.jar – provides access to Sesame

§  mysql-connector-java-5.0.3-bin.jar – MySQL connector

This tool works with the user ontology. It requires a Sesame server that is up and running (not necessarily on the same computer) and filled with RDF data from the user ontology.

A.2.2      Installation

Installation of UPreA requires the following steps:

1.    A Sesame server must be installed on localhost or on another computer and configured, usually under Tomcat.

2.    A repository that allows inferencing must be chosen. All OWL files from the user ontology must be added to this repository.

3.    All jar archives listed above, plus the jar archive containing UPreA, must be included in the project.

4.    UPreA depends on the default user profile provided by the IndexHolder class from the topk package. For installation instructions, see the Top-k Integration Manual.

5.    If UPreA is used as a facade for Top-k and IGAP, other jar archives and software must be included as well. See the Integration Manuals of Top-k and IGAP.

A.2.3      Configuration

Several properties files are necessary for UPreA to work properly. The first requirement is to specify the root folder of the project in wid-home.properties. The most important file is sesame.properties, which controls access to the Sesame database through the following values:

§  sesame.url – URL of running Sesame server.

§  sesame.login, sesame.password – user name and password for Sesame.

§  sesame.repository – abbreviated name of Sesame repository. This repository must allow inferencing.

§  sesame.owl, sesame.c, sesame.r, sesame.gu, sesame.jou, sesame.inst, sesame.ofr, sesame.jo – namespace prefixes starting with “<” and ending with “#”. If the ontology schema changes, these namespaces must be updated.

§  sesame.a – full URI of rdf:type predicate, enclosed in “< >”.

§  sesame.instancePrefix – specifies which namespace is valid for user instances.
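The format constraints listed above can be checked before UPreA is started. The following Java sketch is not part of UPreA; it uses only java.util.Properties and assumes that sesame.properties follows the standard Java properties syntax. All values in example() are placeholders (the URL, login, repository name and example.org namespaces are invented), except the rdf:type URI, which is the standard one.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; not part of UPreA.
final class SesamePropertiesCheck {

    static final String[] PLAIN_KEYS = {
        "sesame.url", "sesame.login", "sesame.password",
        "sesame.repository", "sesame.instancePrefix"
    };
    static final String[] NAMESPACE_KEYS = {
        "sesame.owl", "sesame.c", "sesame.r", "sesame.gu",
        "sesame.jou", "sesame.inst", "sesame.ofr", "sesame.jo"
    };

    // Builds a syntactically valid file with placeholder values.
    static String example() {
        StringBuilder sb = new StringBuilder();
        sb.append("sesame.url=http://localhost:8080/sesame\n");
        sb.append("sesame.login=user\n");
        sb.append("sesame.password=secret\n");
        sb.append("sesame.repository=placeholder-repository\n");
        sb.append("sesame.instancePrefix=placeholder\n");
        for (String key : NAMESPACE_KEYS) {
            String name = key.substring("sesame.".length());
            sb.append(key).append("=<http://example.org/").append(name).append("#\n");
        }
        sb.append("sesame.a=<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>\n");
        return sb.toString();
    }

    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("cannot read properties", e);
        }
        return p;
    }

    // Returns one message per violated constraint; an empty list means the file looks fine.
    static List<String> problems(Properties p) {
        List<String> problems = new ArrayList<String>();
        for (String key : PLAIN_KEYS) {
            if (p.getProperty(key) == null) problems.add("missing " + key);
        }
        for (String key : NAMESPACE_KEYS) {
            String value = p.getProperty(key);
            if (value == null) {
                problems.add("missing " + key);
            } else if (!value.startsWith("<") || !value.endsWith("#")) {
                problems.add(key + " must start with '<' and end with '#'");
            }
        }
        String a = p.getProperty("sesame.a");
        if (a == null) {
            problems.add("missing sesame.a");
        } else if (!a.startsWith("<") || !a.endsWith(">")) {
            problems.add("sesame.a must be enclosed in '<' and '>'");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(problems(parse(example())));
    }
}
```

The check covers only the syntax described in the list; it does not contact the Sesame server or verify that the repository allows inferencing.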

The applet for editing fuzzy sets requires the file sesameseed.properties with the following value:

§  sesameseed.dataSenderUrl – URL of the DataSender servlet, which provides initialization data for the applet. The servlet is usually deployed under the Tomcat application server and its URL is set in the web.xml file.

UPreA also works with an index stored in a file. The file path is specified in uprea.properties:

§  uprea.indexfile – path to index file.

The last required properties file is mapping.properties. It maps the names of all possible attributes to ontology predicates. For example, mapping.Place denotes the attribute Place of job offers and maps to the URI jo:hasDutyLocation. Such a mapping has to be added for every attribute that is used in the process of finding the best objects.
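A lookup in mapping.properties can be sketched with the standard java.util.Properties class. The class MappingSketch and its methods are hypothetical; only the key prefix "mapping." and the example entry for Place are taken from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper; not part of UPreA.
final class MappingSketch {

    static Properties load(String text) {
        Properties mapping = new Properties();
        try {
            mapping.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("cannot read mapping", e);
        }
        return mapping;
    }

    // Resolves an attribute name such as "Place" to its ontology predicate.
    static String predicateFor(Properties mapping, String attributeName) {
        String predicate = mapping.getProperty("mapping." + attributeName);
        if (predicate == null) {
            throw new IllegalArgumentException("no mapping for attribute " + attributeName);
        }
        return predicate;
    }

    public static void main(String[] args) {
        Properties mapping = load("mapping.Place = jo:hasDutyLocation\n");
        System.out.println(predicateFor(mapping, "Place"));
    }
}
```

Failing loudly on a missing entry makes it obvious when an attribute used for searching has not been mapped yet.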

A.2.4      User Guide

The functions of UPreA can be divided into three groups:

§  Working with existing user instances.

§  Acquiring new user instances.

§  Providing a facade for user-dependent search.

We describe these functions in the following sections. All of them work with user instances; therefore we first detail the classes User, UserProfile and ListOfAttributes. Figure 2 shows these classes with their private fields. Getter and setter methods and some other methods are not included in the diagram.

Every instance of User has a unique id, which is the abbreviated URI identifier from the user ontology. Other variables hold the user’s personal information, which is independent of the actual domain. The variable profile contains both local and global preferences. Local preferences relate to job offer attributes and are represented as a ListOfAttributes instance. Every attribute in this list is either fuzzy (its values have some natural ordering) or crisp (no natural ordering). A fuzzy attribute contains a fuzzy set object; a crisp attribute contains a set of possible values. Every attribute has a weight that can be used in a weighted average as the global preference. The weighted average is applied only if the UserProfile does not have any rules. IF-THEN rules are provided by the IGAP tool and also represent global preferences.

Figure 2. Class diagram for user preferences.

Working with existing user instances

Working with existing user instances includes the database operations insert, select, update and delete. These operations are non-standard, however: they deal with RDF graphs and the Sesame database instead of classical database tables. To insert a User object into Sesame, the method must first generate the corresponding RDF triples, merge them into an RDF document and send the document to the Sesame server. Select operations are more similar to their SQL counterparts: they retrieve a table of variable-value bindings, from which a new User instance is created and filled.

Database operations are provided by the class UserProfileDao. The following code shows how to use this class:

UserProfileDao dao = DaoFactory.getUserProfileDao();

// Select: load the whole user (personal data and profile) by abbreviated URI.
User u = dao.getFullUser("jou:u1");

// Insert: personal data and the profile are stored separately.
dao.storeUser(u);
dao.storeUserProfile(u);

// Update the stored profile.
dao.updateUserProfile(u);

// Delete a single user.
dao.deleteUser(u);

// Delete all users from the ontology.
dao.deleteUsers();

Method storeUser() inserts only the user’s personal information. We assume that this information is stored only once, at the time of the user’s registration. Method storeUserProfile() inserts only the UserProfile instance. This approach saves time and reduces the amount of transferred data. The same approach applies to select methods: the whole User instance is selected via getFullUser(), the user’s personal information via getUser(), and the profile via getProfile().

If these methods fail to connect to Sesame, they catch OntologyBackendException and write the cause to the log4j log file.

Acquiring new user instances

A graphical interface for manual input of fuzzy sets is available in the package sesameseed. The applet can be embedded in an HTML page using the following code:

<APPLET
    CODE = "uinf.wid.tools.sesameseed.applet.FuzzyApplet3.class"
    ARCHIVE = "applets.jar, commons-logging-api.jar, log4j-1.2.14.jar, commons-collections-3.2.jar, commons-configuration-1.3.jar, commons-discovery-0.4.jar, commons-lang-2.1.jar"
    NAME = "FuzzyApplet"
    WIDTH = 800
    HEIGHT = 600>
</APPLET>

The applet works properly only if it receives a default User instance from the servlet DataSender. This servlet should be deployed in an application server and its URL set in the sesameseed.properties file.

The package uprea maintains an index of user preferences. If a User instance has no UserProfile, it can be filled from this index as follows:

User u = new User();

//changing user’s personal data

...

//filling user preferences

UPreA.fillProfileForUser(u);

The index file should be created offline before the first call of any UPreA method:

UPreA.createIndex();

Every new user should be added to the index file after the first registration:

User u = new User();

//fill user profile

...

//add user to index

UPreA.addUserToIndex(u);

Providing facade for user-dependent search

The class UpreaFacade in the package uprea provides methods for user-dependent search. These methods usually require the user URI as one of the arguments. The User instance must already be stored in the user ontology, as described in the section Working with existing user instances.

There are three main methods named getRatedInstances(), which differ in their argument lists:

getRatedInstances(String uri, List<DomainObject> ratedOffers, int k)

If this method gets a non-null list of rated objects, it calls the IGAP tool, which learns new global preferences. The top-k objects are then found using these global preferences and the user’s local preferences.

getRatedInstances(String userURI, String typeURI, int k, List<RestrictedAttribute> list)

This method does not call IGAP. It gets the top-k objects using the user’s stored local and global preferences. The argument typeURI specifies the type of the top-k objects, e.g. "jo:JobOffer" in the case of job offers. It is also possible to restrict attribute domains with RestrictedAttribute objects. Such an object has either a list of permitted values or interval bounds. Only values accepted by the restriction will have non-zero relevance. If the list of restricted attributes is empty, values are considered unrestricted.
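The effect of a restriction on relevance can be sketched as follows. The class Restriction below is a hypothetical stand-in, not the real RestrictedAttribute class; treating the interval bounds as inclusive and comparing permitted values as strings are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not the real RestrictedAttribute class.
final class RestrictionSketch {

    static final class Restriction {
        final String attribute;
        final Set<String> permitted; // null for an interval restriction
        final double low;
        final double high;

        private Restriction(String attribute, Set<String> permitted, double low, double high) {
            this.attribute = attribute;
            this.permitted = permitted;
            this.low = low;
            this.high = high;
        }

        static Restriction values(String attribute, String... permitted) {
            return new Restriction(attribute, new HashSet<String>(Arrays.asList(permitted)), 0, 0);
        }

        static Restriction interval(String attribute, double low, double high) {
            return new Restriction(attribute, null, low, high);
        }

        boolean accepts(Object value) {
            if (permitted != null) {
                return permitted.contains(String.valueOf(value));
            }
            if (!(value instanceof Number)) {
                return false;
            }
            double v = ((Number) value).doubleValue();
            return low <= v && v <= high; // inclusive bounds are an assumption
        }
    }

    // Relevance after applying restrictions: 0 for rejected values, unchanged otherwise.
    // An empty list leaves every value unrestricted.
    static double restrict(String attribute, Object value, double relevance,
                           List<Restriction> restrictions) {
        for (Restriction r : restrictions) {
            if (r.attribute.equals(attribute) && !r.accepts(value)) {
                return 0.0;
            }
        }
        return relevance;
    }

    public static void main(String[] args) {
        List<Restriction> rs = Arrays.asList(Restriction.interval("Salary", 1000, 2000));
        System.out.println(restrict("Salary", 2500, 0.7, rs));
    }
}
```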

getRatedInstances(String userURI, String typeURI, int k, List<RatedInstance> ratedList, List<RestrictedAttribute> list)

This method has the same functionality as the previous one, but it also calls IGAP to learn new global preferences from the list of RatedInstance objects.

All of these methods return List<RatedInstance>. RatedInstance is a common interface for rated objects from any domain. It has two getter methods, getURI() and getRating(), which return the abbreviated URI of the corresponding object and its rating. It is implemented by the classes Objekt and DomainObject.
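A caller can process the returned list through the two getters alone. The sketch below re-declares a stand-in RatedInstance interface so that it compiles on its own; the return types String and double are assumptions, since the manual names only the methods. The class TopKSketch and its helper methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Stand-in for the real interface; return types are assumed.
interface RatedInstance {
    String getURI();
    double getRating();
}

// Hypothetical helper; not part of UPreA.
final class TopKSketch {

    static RatedInstance rated(final String uri, final double rating) {
        return new RatedInstance() {
            public String getURI() { return uri; }
            public double getRating() { return rating; }
        };
    }

    // Returns the k best-rated instances, highest rating first.
    static List<RatedInstance> topK(List<RatedInstance> all, int k) {
        List<RatedInstance> sorted = new ArrayList<RatedInstance>(all);
        Collections.sort(sorted, new Comparator<RatedInstance>() {
            public int compare(RatedInstance a, RatedInstance b) {
                return Double.compare(b.getRating(), a.getRating());
            }
        });
        int size = Math.max(0, Math.min(k, sorted.size()));
        return new ArrayList<RatedInstance>(sorted.subList(0, size));
    }

    public static void main(String[] args) {
        List<RatedInstance> offers = Arrays.asList(
            rated("jo:a", 0.2), rated("jo:b", 0.9), rated("jo:c", 0.5));
        for (RatedInstance r : topK(offers, 2)) {
            System.out.println(r.getURI() + " " + r.getRating());
        }
    }
}
```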

A.3          Developer Manual

A.3.1      Tool Structure

UPreA classes are grouped in the following packages:

§  uinf.wid.core – package with common classes that are also used by the IGAP and Top-k tools. It includes the classes User, UserProfile, Attribute and FuzzySet, as well as SesameTemplate for accessing the Sesame database.

§  uinf.wid.dao – data access objects, especially UserProfileDao for storing and selecting user profiles.

§  uinf.wid.tools.sesameseed – servlets supporting FuzzyApplet.

§  uinf.wid.tools.sesameseed.applet – applet for manual editing of user profiles.

§  uinf.wid.tools.uprea – UPreA class, which maintains the index of user preferences, and UpreaFacade for user-dependent search.

Figure 3. Packages and dependencies.

A.3.2      Method Implementation

The package uinf.wid.tools.uprea contains the UPreA class, which maintains the index of user preferences. This index is designed as a two-dimensional hash table.

In our model there is only a finite number of distinct kinds of users, which we call user types. There is also a finite number of attributes. We can compute a hash for every user type and every attribute, so the hash table yields one value for every pair of user type and attribute. This value contains four numbers: the counts of ASCENDING, DESCENDING, HILL and VALLEY (abbreviated VALL in the tables) fuzzy sets already used for the same user type and attribute.

Table 1. Illustrative example of the index hash table.

User type | Salary                   | EducationLevel
          | ASC | DESC | HILL | VALL | ASC | DESC | HILL | VALL
22        |   5 |    0 |    3 |    0 |   1 |    7 |    0 |    0
23        |  12 |    0 |    3 |    1 |   4 |   10 |    2 |    0
24        |  10 |    3 |    4 |    0 |   3 |    2 |   12 |    1

Table 1 shows a simplified example of such an index table. The first column contains the hashes of user types; the first row contains the attribute names. For the sake of simplicity we restrict the table to three user types and two attributes, Salary and EducationLevel.

If we add a new user of type 23 with a HILL fuzzy set for Salary and an ASCENDING fuzzy set for EducationLevel, we obtain the index table shown in Table 2. Changed values are marked with an asterisk.

Table 2. Modified index hash table.

User type | Salary                   | EducationLevel
          | ASC | DESC | HILL | VALL | ASC | DESC | HILL | VALL
22        |   5 |    0 |    3 |    0 |   1 |    7 |    0 |    0
23        |  12 |    0 |   4* |    1 |  5* |   10 |    2 |    0
24        |  10 |    3 |    4 |    0 |   3 |    2 |   12 |    1

On the other hand, if we want to fill preferences for a new user of type 22, UPreA takes the prevailing type of fuzzy set for each attribute: the user receives an ASCENDING fuzzy set for Salary (5 occurrences) and a DESCENDING fuzzy set for EducationLevel (7 occurrences).
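The index operations described in this section can be sketched in Java. This is an illustration under assumptions, not the UPreA implementation: the class PreferenceIndexSketch and its methods are hypothetical, the fourth shape is named VALL after the table header, and ties between counts are resolved in favour of the first shape in enum order. The method table1() loads the values of Table 1.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of the two-dimensional index; not the UPreA class.
final class PreferenceIndexSketch {

    enum Shape { ASCENDING, DESCENDING, HILL, VALL }

    // user type hash -> attribute name -> one counter per shape
    private final Map<Integer, Map<String, int[]>> table =
        new HashMap<Integer, Map<String, int[]>>();

    private int[] counts(int userType, String attribute) {
        Map<String, int[]> row = table.get(userType);
        if (row == null) {
            row = new HashMap<String, int[]>();
            table.put(userType, row);
        }
        int[] counts = row.get(attribute);
        if (counts == null) {
            counts = new int[Shape.values().length];
            row.put(attribute, counts);
        }
        return counts;
    }

    // Sets the four counters in the order ASC, DESC, HILL, VALL.
    void set(int userType, String attribute, int... perShape) {
        System.arraycopy(perShape, 0, counts(userType, attribute), 0, Shape.values().length);
    }

    // Registers that a user of this type chose the given shape for the attribute.
    void addUser(int userType, String attribute, Shape shape) {
        counts(userType, attribute)[shape.ordinal()]++;
    }

    int count(int userType, String attribute, Shape shape) {
        return counts(userType, attribute)[shape.ordinal()];
    }

    // Most frequent shape for the pair; ties go to the first shape in enum order.
    Shape prevailing(int userType, String attribute) {
        int[] c = counts(userType, attribute);
        Shape best = Shape.ASCENDING;
        for (Shape s : Shape.values()) {
            if (c[s.ordinal()] > c[best.ordinal()]) {
                best = s;
            }
        }
        return best;
    }

    static PreferenceIndexSketch table1() {
        PreferenceIndexSketch t = new PreferenceIndexSketch();
        t.set(22, "Salary", 5, 0, 3, 0);
        t.set(22, "EducationLevel", 1, 7, 0, 0);
        t.set(23, "Salary", 12, 0, 3, 1);
        t.set(23, "EducationLevel", 4, 10, 2, 0);
        t.set(24, "Salary", 10, 3, 4, 0);
        t.set(24, "EducationLevel", 3, 2, 12, 1);
        return t;
    }

    public static void main(String[] args) {
        PreferenceIndexSketch t = table1();
        t.addUser(23, "Salary", Shape.HILL);
        System.out.println(t.count(23, "Salary", Shape.HILL));
        System.out.println(t.prevailing(22, "EducationLevel"));
    }
}
```

A nested map keeps the sketch short; how the real index hashes user types and stores the table in the index file is not covered here.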

A.3.3      Enhancements and Optimizing

It is possible to use another method of filling user preferences, or to import them from other tools. To integrate such a method, it is necessary to create a new class (e.g. MyClass) with the method fillProfileForUser(User u). A new facade, MyClassFacade, is then created from UpreaFacade by replacing every occurrence of UPreA.fillProfileForUser(u) with MyClass.fillProfileForUser(u). This facade can be used instead of UpreaFacade.
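As an alternative to copying the facade, the filling method could be passed in as an object. The following sketch shows this variant; it is a design suggestion, not how UpreaFacade is written. All names in it (FillerSketch, SketchUser, ProfileFiller, ConstantFiller, SearchFacade) are hypothetical, and SketchUser is a minimal stand-in for the real User class.

```java
// Hypothetical design sketch; none of these types exist in UPreA.
final class FillerSketch {

    // Minimal stand-in for the real User class.
    static final class SketchUser {
        String profile; // null until preferences are filled
    }

    // Any method of filling preferences implements this interface.
    interface ProfileFiller {
        void fillProfileForUser(SketchUser u);
    }

    // Example filler that assigns a fixed profile.
    static final class ConstantFiller implements ProfileFiller {
        private final String profile;

        ConstantFiller(String profile) {
            this.profile = profile;
        }

        public void fillProfileForUser(SketchUser u) {
            u.profile = profile;
        }
    }

    // Facade that delegates to whichever filler it was given.
    static final class SearchFacade {
        private final ProfileFiller filler;

        SearchFacade(ProfileFiller filler) {
            this.filler = filler;
        }

        // Fills the profile only when the user does not have one yet.
        SketchUser prepare(SketchUser u) {
            if (u.profile == null) {
                filler.fillProfileForUser(u);
            }
            return u;
        }
    }

    public static void main(String[] args) {
        SearchFacade facade = new SearchFacade(new ConstantFiller("default"));
        System.out.println(facade.prepare(new SketchUser()).profile);
    }
}
```

With this variant a new filling method requires only a new ProfileFiller implementation, and the facade code stays in one place.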

Although most methods are optimized, communication with the Sesame database, and update operations in particular, is not fast enough. To test or use another database, SesameTemplate would have to be changed to access the new database while preserving the current method names.

A.4          Manual for Adaptation to Other Domains

The user model proposed in this manual can be used with an arbitrary domain ontology. Fuzzy sets and rules are domain independent; only attribute names and mappings can change. UPreA depends only on the user ontology, which is fixed, and its schema must be used "as is".

A.4.1      Configuring to Other Domain

Adaptation to another domain requires a domain expert to find all interesting attributes which can serve as criteria for finding the best objects. We need to know whether these attributes have a continuous range or just a finite number of values, and whether the values have some kind of natural (not alphabetical) ordering. Attributes with such an ordering are called fuzzy attributes, the others crisp attributes. For every attribute from the new domain ontology we need to write a SeRQL select query which retrieves tuples <object, attribute_value>. The domain expert should define default preferences for fuzzy attributes. All this information is used to write a new XML file configtopk.xml for the Top-k class IndexHolder, which provides default user preferences.

After the interesting attributes have been specified, the mapping.properties file must be modified. For a new attribute named Price with URI ex:hasPrice, we add a new line:

mapping.Price = ex:hasPrice

The next step is to create new Top-k indexes. If the XML configuration file is set up correctly and the data is in the Sesame ontology, we can run the following code:

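// Create all necessary Top-k index files.
IndexHolder.createIndex();

// Print the attribute names as a quick check of the configuration.
ListOfAttributes loa = IndexHolder.getListOfAttributes();
for (Attribute a : loa) {
    System.out.println(a.getName());
}
System.out.println("Done.");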

This code creates all necessary index files and writes the attribute names to the console. If this step works, all UPreA methods have been successfully adapted to the new domain.

A.4.2      Dependencies

UPreA depends on the following jar archives: commons-configuration-1.3.jar, Nazou-ITG-2.1.jar, log4j-1.2.12.jar, spring.jar, servlet-api.jar, sesame.jar, mysql-connector-java-5.0.3-bin.jar. These packages are independent of any particular domain and require no changes when the domain changes. UPreA also uses IndexHolder, a part of the topk package, but all necessary changes were described in the previous section.

OWL files from the new domain ontology must be added to the Sesame repository together with the user ontology schema.