Method for finding user-dependent top-k offers (Top-k aggregator)

The Top-k aggregator searches its own index of offers to find the user-dependent top-k offers.

Institution: Pavol Jozef Safarik University
Technologies used: Java, MySQL, Sesame, XML
Inputs: Offers in ontology, user preferences of attributes, fuzzy rules, number of required objects
Outputs: Top-k offers
Documentation: HTML, doc, JavaDoc
Distribution packages: zip

Addressed Problems

There is a need for a search algorithm that can retrieve the top offers with respect to user requirements. Users can often say which property values of offers (e.g. salary, education requirement, place, hours/week) are interesting for them and which are not. Alternatively, a user can specify preferences by evaluating some objects in a training set. We need an algorithm that returns the user-dependent top-k objects with a small number of accesses to the sources, that is, without retrieving all the source data. The method can also be used to aggregate the results of different forms of searching.

Description

First, we create offline indexes of the attributes of offers, each attribute in a separate index structure. Indexes can hold several attribute types: ordinal, nominal or metric. Users can set their preferences for attribute values by fuzzy sets, fuzzifications or a list of permitted values. Default ordering settings and the description of properties in the ontology are specified by a domain expert in an XML file used during index preparation. The method retrieves attribute values from the indexes one by one, from the best to the worst, and aggregates them using an aggregation function specified by the user or using fuzzy rules obtained by the induction procedure in the IGAP tool. To keep the number of accesses to the lists low, we can use different heuristics and algorithms. As soon as the k best objects are known, we can stop computing and return them as the result.
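
The retrieval loop described above can be illustrated with a minimal Java sketch. It is not the tool's actual implementation: the text does not say which algorithm is used, so the sketch assumes the classic threshold algorithm as one possible stopping heuristic. The names TopKSketch, topK and aggregate are hypothetical, a weighted average stands in for the user's aggregation function or fuzzy rules, in-memory sorted lists stand in for the attribute indexes, scores are assumed to be preference degrees in [0, 1], and an offer missing from a list is assumed to score 0 for that attribute.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: threshold-style top-k search over per-attribute lists.
class TopKSketch {

    // Assumed aggregation function: weighted average (monotone in every score).
    static double aggregate(double[] scores, double[] weights) {
        double sum = 0.0, weightSum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            sum += weights[i] * scores[i];
            weightSum += weights[i];
        }
        return sum / weightSum;
    }

    // lists.get(i) holds (offer id, preference score) pairs for attribute i,
    // sorted from the best score to the worst.
    static List<Map.Entry<String, Double>> topK(
            List<List<Map.Entry<String, Double>>> lists, double[] weights, int k) {
        List<Map.Entry<String, Double>> best = new ArrayList<>();
        if (k <= 0) return best;
        int m = lists.size();

        // Stand-in for random access to an index: offer id -> score per attribute.
        List<Map<String, Double>> lookup = new ArrayList<>();
        int maxDepth = 0;
        for (List<Map.Entry<String, Double>> list : lists) {
            Map<String, Double> byId = new HashMap<>();
            for (Map.Entry<String, Double> e : list) byId.put(e.getKey(), e.getValue());
            lookup.add(byId);
            maxDepth = Math.max(maxDepth, list.size());
        }

        Set<String> seen = new HashSet<>();
        for (int depth = 0; depth < maxDepth; depth++) {
            double[] lastSeen = new double[m]; // stays 0.0 for exhausted lists
            for (int i = 0; i < m; i++) {
                if (depth >= lists.get(i).size()) continue;
                Map.Entry<String, Double> e = lists.get(i).get(depth);
                lastSeen[i] = e.getValue();
                if (seen.add(e.getKey())) { // first time this offer is met
                    double[] scores = new double[m];
                    for (int j = 0; j < m; j++) {
                        scores[j] = lookup.get(j).getOrDefault(e.getKey(), 0.0);
                    }
                    best.add(Map.entry(e.getKey(), aggregate(scores, weights)));
                }
            }
            // Keep only the k best offers found so far, best first.
            best.sort((x, y) -> Double.compare(y.getValue(), x.getValue()));
            if (best.size() > k) best.subList(k, best.size()).clear();
            // No unseen offer can score above the aggregate of the last seen values.
            double threshold = aggregate(lastSeen, weights);
            if (best.size() == k && best.get(k - 1).getValue() >= threshold) break;
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up preference scores for two attributes of four offers.
        List<List<Map.Entry<String, Double>>> lists = List.of(
            List.of(Map.entry("a", 0.9), Map.entry("b", 0.8),
                    Map.entry("c", 0.3), Map.entry("d", 0.1)),   // e.g. salary
            List.of(Map.entry("c", 1.0), Map.entry("a", 0.7),
                    Map.entry("d", 0.4), Map.entry("b", 0.2)));  // e.g. hours/week
        for (Map.Entry<String, Double> e : topK(lists, new double[] {1.0, 1.0}, 2)) {
            System.out.println(e.getKey());
        }
    }
}
```

With the made-up data in main, the loop stops after reading three of the four positions in each list, because by then the second-best aggregated score is already at least as high as the threshold. The early stop is only valid for monotone aggregation functions, which is why the weighted average was chosen for this sketch.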

Block scheme

References

  1. Gurský P., Šumák M.: Top-k Aggregator. Proceedings Tools for Acquisition, Organisation and Presenting of Information and Knowledge, ISBN 80-227-2468-8, pages 115-124, 2006
  2. Gurský P.: Towards better semantics in the multifeature querying. Proceedings of Dateso 2006, ISBN 80-248-1025-5, pages 63-73, 2006
  3. Gurský P., Lencses R., Vojtáš P.: Algorithms for user dependent integration of ranked distributed information. Proceedings of TED Conference on e-Government (TCGOV 2005), ISBN 3-85487-787-0, pages 123-130, 2005
  4. Gurský P., Horváth T.: Dynamic search of relevant information. Proceedings of Znalosti, ISBN 80-248-0755-6, pages 194-201, 2005
  5. Gurský P., Lencses R.: Aspects of integration of ranked distributed data. Proceedings of Datakon, ISBN 80-210-3516-1, pages 221-230, 2004