RFTS

Easily extensible tool for document content indexing and rich full-text search.

Institution: Slovak Academy of Sciences
Technologies used: Java, MySQL, OGSA-DAI
Inputs: Documents in plain text format
Outputs: Document indexes, full-text search query results
Documentation: HTML, doc, JavaDoc
Distribution packages: zip

Addressed Problems

This tool addresses the problem of fast, content-based identification of specific documents in a large collection. Documents in text formats are indexed, which makes fast full-text search over the indexed collection possible. The motivation for implementing another search engine was to have an easily extensible and configurable document indexing tool for evaluating novel methods of information retrieval, statistical analysis of documents, and lemmatization and stemming for the Slovak language.

Description

The tool consists of two logically separate parts: document indexing and full-text search.
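
The split between the two parts can be illustrated with a minimal inverted-index sketch in Java. This is not the RFTS implementation or its API: the class name TinyIndex and the methods add and search are hypothetical, the index lives in memory rather than in MySQL, and no lemmatization or stemming is applied. It only shows how an indexing step that maps terms to documents makes a later full-text query a fast lookup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration only; names and behaviour are assumptions, not the RFTS API. */
class TinyIndex {
    // Inverted index: term -> sorted set of ids of documents containing that term.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Indexing part: split plain text into lowercase terms and record the document under each. */
    void add(String docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    /** Search part: return the sorted ids of documents that contain every query term. */
    List<String> search(String query) {
        Set<String> result = null;
        for (String term : tokenize(query)) {
            Set<String> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);   // first term seeds the candidate set
            } else {
                result.retainAll(docs);         // later terms narrow it (AND semantics)
            }
        }
        return result == null ? Collections.emptyList() : new ArrayList<>(result);
    }

    /** Lowercase the text and split on anything that is not a letter or a digit. */
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("doc1", "Fast full-text search over an indexed collection");
        index.add("doc2", "Stemming methods for the Slovak language");
        System.out.println(index.search("indexed search")); // prints [doc1]
        System.out.println(index.search("slovak"));         // prints [doc2]
    }
}
```

In a sketch like this, a stemmer or lemmatizer under evaluation would plug into the tokenize step, since both indexing and search pass through it.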

RFTS functionality, in conjunction with Corporate Memory, can be accessed locally (through Java interfaces or command-line tools) as well as remotely through RPC calls or a Web Service interface. Remote access and the Web Service interface allow easy integration of the RFTS indexing and search solution into other components and enable rapid prototyping of new tools that require full-text search or some form of statistical analysis of a document collection.

References

  1. Ciglan, M., Laclavik, M., Seleng, M., Hluchy, L.: Document indexing for automatic semantic annotation support. In: INFORMATICS'2007: Proceedings of the Ninth International Conference on Informatics. Slovak Society for Applied Cybernetics and Informatics, Bratislava, 2007, pp. 163-169, ISBN 978-80-969243-7-0.
  2. Ciglan, M.: Documents Content Indexing for Supporting Knowledge Acquisition Tools. In: Tools for Acquisition, Organisation and Presenting of Information and Knowledge. P. Navrat et al. (Eds.), Vydavatelstvo STU, Bratislava, 2006, pp. 49-63, ISBN 80-227-2468-8. Workshop 29-30 September, Nizke Tatry, Slovakia. ITAT 2006, NAZOU Workshop, 26. 9 - 1. 10. 2006, Chata Kosodrevina, Bystrá dolina, Nízke Tatry.