Possible supervisor: Heiner Stuckenschmidt
Thesauri have proven to be a key technology for effective information access: they help to overcome some of the problems of free-text search by relating and grouping relevant terms in a specific domain, and they provide a controlled vocabulary for indexing information. A number of thesauri have been developed in different domains of expertise. Examples from the area of medical information include MeSH and Elsevier's life science thesaurus EMTREE. These thesauri are already used to access information sources (in particular document repositories) such as EMBASE or Science Direct. Currently, however, there are no links between the different information sources and the specific thesauri used to index and query them. The aim of the DOPE project (Drug Ontology Project for Elsevier) is to investigate the possibility of providing access to multiple information sources in the area of life science through a single interface. To benefit from the advantages of thesaurus-based access, this single interface should be based on thesauri such as EMTREE. A first prototype of the DOPE system has been developed that provides a basic infrastructure for querying multiple information sources using the EMTREE thesaurus. In cooperation with Elsevier, we offer several Master's projects that extend and improve the DOPE system. Possible topics are:
- An empirical comparison of thesaurus-based and keyword-based search
For this project, the DOPE system has to be extended with a keyword-based search facility, and a comparative study of the performance of the two search options on typical user queries has to be carried out.
- Multi-thesaurus access to life science resources
This project investigates how thesauri other than EMTREE can be added to the system to improve the search results. An appropriate thesaurus has to be selected and translated to RDF, and a mapping between EMTREE and the other thesaurus has to be generated. A case study has to show how the mapping information can be used to improve the query results.
- Thesaurus evolution
This project deals with the problem of how to handle the regular updates of the EMTREE thesaurus. In order to be able to use terms from the new version to query information that is still described using the old version, an existing change detection algorithm has to be used to generate mappings between the old and the new version. The DOPE system has to be extended to use these mappings for retrieving information.
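As a rough illustration of the thesaurus evolution topic, the following minimal Python sketch shows one way a query phrased in new-version terms could be translated through a version mapping before documents indexed with old-version terms are looked up. Everything in it is an assumption made for illustration: the term names, the document ids, the in-memory dictionaries, and the functions `translate_term` and `retrieve` are hypothetical and are not part of EMTREE or the DOPE system. In the project itself, the mapping would be produced by the change detection algorithm.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping: new-version term -> old-version terms it corresponds to.
# In the project this would be generated by the change detection algorithm.
NEW_TO_OLD: Dict[str, Set[str]] = {
    "new_term_renamed": {"old_term"},                    # a renamed term
    "new_term_merged": {"old_term_x", "old_term_y"},     # two old terms merged into one
}

# Hypothetical index: old-version term -> ids of documents described with it.
OLD_INDEX: Dict[str, Set[str]] = {
    "old_term": {"doc1", "doc2"},
    "old_term_x": {"doc3"},
    "old_term_y": {"doc2", "doc4"},
    "unchanged_term": {"doc5"},
}


def translate_term(term: str, mapping: Dict[str, Set[str]]) -> Set[str]:
    """Return the old-version terms for a query term.

    Terms without a mapping entry are assumed to be unchanged between versions.
    """
    return mapping.get(term, {term})


def retrieve(
    query_terms: Iterable[str],
    mapping: Dict[str, Set[str]],
    index: Dict[str, Set[str]],
) -> List[str]:
    """Return the sorted ids of documents matching any translated query term."""
    hits: Set[str] = set()
    for term in query_terms:
        for old_term in translate_term(term, mapping):
            hits |= index.get(old_term, set())
    return sorted(hits)


if __name__ == "__main__":
    # Query uses new-version vocabulary; the documents are indexed with the old one.
    print(retrieve(["new_term_merged", "unchanged_term"], NEW_TO_OLD, OLD_INDEX))
```

The sketch treats every mapping as an exact equivalence. A real mapping would probably also have to represent deleted terms and terms whose meaning was narrowed or broadened, which is part of what the project would need to work out.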