3.8: Meta Data

IR systems can use information about documents, such as bibliographic data, classifications, or access information, that is not really part of the document itself. Such data are called Meta Data. The distinction between data and Meta Data is often fuzzy: in some cases the bibliographic information is printed on the title page of an article, or key terms and classifications are given in the printed version.

Meta Data can be included in documents encoded in appropriate structured formats like SGML, or they can be provided separately, as is done in reference databases or by web search engines. For resource discovery, Meta Data should be provided in machine-readable form. There are many complex formats for various domains and types of documents, and their diversity and complexity make it difficult to handle them in a uniform way.
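
The two ways of providing Meta Data can be illustrated with a minimal sketch. It assumes a small XML-style document as a stand-in for SGML, and a plain dictionary as a stand-in for a reference database. All element names, the document identifier, the field values, and the two helper functions are hypothetical and chosen only for illustration; they do not correspond to any particular Meta Data format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with Meta Data embedded in a structured header.
# The element names (doc, meta, title, author, classification, body)
# are illustrative and not taken from any real DTD.
EMBEDDED = """<doc>
  <meta>
    <title>Example Title</title>
    <author>A. Author</author>
    <classification>example-class</classification>
  </meta>
  <body>Full text of the document.</body>
</doc>"""

# The same Meta Data held separately from the document, keyed by a
# hypothetical document identifier, roughly as a reference database would.
SEPARATE = {
    "doc-1": {
        "title": "Example Title",
        "author": "A. Author",
        "classification": "example-class",
    }
}


def embedded_metadata(markup):
    """Read the Meta Data fields from the <meta> header of a document."""
    root = ET.fromstring(markup)
    meta = root.find("meta")
    if meta is None:
        return {}
    # Each child element of <meta> becomes one field: tag -> text.
    return {child.tag: (child.text or "").strip() for child in meta}


def separate_metadata(doc_id, store):
    """Look up the Meta Data record for a document in an external store."""
    return dict(store.get(doc_id, {}))


if __name__ == "__main__":
    print(embedded_metadata(EMBEDDED))
    print(separate_metadata("doc-1", SEPARATE))
```

In both cases the result is the same machine-readable record. The difference is only where the record is stored: inside the document, where it travels with the text, or outside it, where it can be maintained and searched without access to the document.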

