Method for Ontology-Based Text Annotation (tool OnTeA)

The method finds or creates semantic metadata from text according to a domain ontology.

Institution: Institute of Informatics
Technologies used: RegEx, Java, Jena, Sesame
Inputs: HTML or text document, domain ontology
Outputs: Ontology individual of defined type representing input text
Documentation: HTML, doc, JavaDoc
Distribution packages: zip
Video: demonstration video

Addressed Problems

When a computer system processes documents (HTML, text), it needs to understand their structure. Web documents are structured, but that structure is understandable mainly to humans. This is a basic problem of the Semantic Web. The OnTeA method tries to create structured semantic metadata from such documents according to the application domain ontology model. OnTeA therefore does not create a new ontology; it tries to map documents to their equivalents in the defined application ontology.


OnTeA analyzes a document or text using regular expression patterns and detects equivalent semantic elements according to the defined domain ontology. Several cross-application patterns are defined, but to achieve good results new patterns need to be defined for each application. OnTeA also creates a new ontology individual of a defined class and assigns the detected ontology elements/individuals as properties of that individual. For example, in the NAZOU pilot application an ontology instance of a job offer is created from its text representation.
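The pattern-based approach described above can be illustrated with a minimal sketch. It is not the OnTeA implementation or its API: the class `OnteaSketch`, the `AnnotationPattern` type, the property names (`hasLocation`, `requiresSkill`), the sample patterns, and the in-memory map standing in for the domain ontology are all assumptions made for illustration. The sketch only shows the general idea: a regular expression captures a piece of text, the captured text is looked up as an individual of a given ontology class (and added if missing), and the result is attached as a property of the new individual.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OnteaSketch {

    // Hypothetical pattern type: capture group 1 of the regex names an
    // individual of ontologyClass, linked through the given property.
    static final class AnnotationPattern {
        final Pattern regex;
        final String ontologyClass;
        final String property;

        AnnotationPattern(String regex, String ontologyClass, String property) {
            this.regex = Pattern.compile(regex);
            this.ontologyClass = ontologyClass;
            this.property = property;
        }
    }

    // Returns the properties of the new individual as property -> "Class:label" values.
    // The ontology argument is a toy stand-in: class name -> labels of known individuals.
    static Map<String, List<String>> annotate(String text,
                                              List<AnnotationPattern> patterns,
                                              Map<String, Set<String>> ontology) {
        Map<String, List<String>> properties = new LinkedHashMap<>();
        for (AnnotationPattern p : patterns) {
            Matcher m = p.regex.matcher(text);
            while (m.find()) {
                String label = m.group(1);
                // Find the individual in the ontology, or create it if it is missing.
                ontology.computeIfAbsent(p.ontologyClass, k -> new LinkedHashSet<>()).add(label);
                properties.computeIfAbsent(p.property, k -> new ArrayList<>())
                          .add(p.ontologyClass + ":" + label);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        // Toy ontology that already knows one Location individual.
        Map<String, Set<String>> ontology = new HashMap<>();
        ontology.put("Location", new LinkedHashSet<>(Arrays.asList("Bratislava")));

        // Application-specific patterns (assumed, for a job offer text).
        List<AnnotationPattern> patterns = Arrays.asList(
            new AnnotationPattern("Location:\\s*(\\p{Lu}\\p{L}+)", "Location", "hasLocation"),
            new AnnotationPattern("Skills?:\\s*(\\w+)", "Skill", "requiresSkill"));

        String text = "Java developer wanted. Location: Bratislava. Skills: Java";
        Map<String, List<String>> jobOffer = annotate(text, patterns, ontology);
        System.out.println("JobOffer " + jobOffer);
    }
}
```

In this toy run, the job offer text yields one `hasLocation` value that matches the already known individual and one `requiresSkill` value that is newly added to the map. A real deployment would replace the map with an ontology store such as Jena or Sesame, which the sketch deliberately leaves out.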

Ontea Architecture


  1. Ontea at
  2. Ontea Poster
  3. Laclavik M., Seleng M., Hluchy L.: Towards Large Scale Semantic Annotation Built on MapReduce Architecture. In: Proceedings of ICCS 2008; M. Bubak et al. (Eds.): ICCS 2008, Part III, LNCS 5103, pp. 331-338, 2008.
  4. Laclavik M., Ciglan M., Seleng M., Hluchy L.: Empowering Automatic Semantic Annotation in Grid. To appear in proceedings of PPAM 07, Springer-Verlag.
  5. Laclavik M., Seleng M., Gatial E., Balogh Z., Hluchy L.: Ontology based Text Annotation - OnTeA. In: Information Modelling and Knowledge Bases XVIII. IOS Press, Amsterdam, Marie Duzi, Hannu Jaakkola, Yasushi Kiyoki, Hannu Kangassalo (Eds.), Frontiers in Artificial Intelligence and Applications, Vol. 154, February 2007, pp. 311-315. ISBN 978-1-58603-710-9, ISSN 0922-6389.
  6. Laclavik M., Seleng M., Babik M.: OnTeA: Semi-automatic Ontology based Text Annotation Method. In: Tools for Acquisition, Organisation and Presenting of Information and Knowledge. P. Navrat et al. (Eds.), Vydavatelstvo STU, Bratislava, 2006, pp. 49-63, ISBN 80-227-2468-8. Workshop 29-30 September, Nizke Tatry, Slovakia. ITAT 2006, NAZOU Workshop, 26. 9 - 1. 10. 2006, Chata Kosodrevina, Bystra dolina, Nizke Tatry, 2006.
  7. Video (English) | Video (Slovak)