4.2.4 Allowance for surrogate files

The underlying concept is to view a document as a pair of files $(D,S)$. File $D$ is the display file, which is presented to the user when the document matches a query and is retrieved. File $S$, the surrogate file, is what is indexed when the database is created or when the document is added to it. The key point is that the two files $D$ and $S$ need not be identical. Indeed, the surrogate file $S$ can have:

The main reason for having a pair of files is that display files do not contain all the textual (and thus indexable) content needed, while surrogate files contain content that is not meant for display. In practice, $D$ is a subset of $S$. File $S$ is created at indexing time and can be discarded once the index has been generated; only file $D$ need be kept.
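The $(D,S)$ arrangement can be illustrated with a minimal sketch in Python. It is not the DLMF implementation: the class `SurrogateIndex`, its methods, the `tokenize` helper, and the sample texts are all hypothetical names chosen for illustration, and a simple in-memory inverted index stands in for whatever search engine is actually used. The sketch shows only the essential behavior: terms are taken from $S$, results are returned from $D$, and $S$ is not retained after indexing.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SurrogateIndex:
    """Hypothetical index over (D, S) pairs: S is indexed, D is displayed."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents whose S contains it
        self.display = {}                 # document id -> display file D

    def add(self, doc_id, display_text, surrogate_text):
        # Keep only D; the terms of S go into the index and S itself is not stored.
        self.display[doc_id] = display_text
        for term in tokenize(surrogate_text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # Return the display files of documents whose surrogate matched every query term.
        terms = tokenize(query)
        if not terms:
            return []
        matches = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return [self.display[doc_id] for doc_id in sorted(matches)]


index = SurrogateIndex()
index.add(
    "bessel",
    display_text="Bessel function of the first kind",
    # Assumed surrogate: the display text plus extra indexable words not shown to the user.
    surrogate_text="Bessel function of the first kind cylinder function",
)
print(index.search("cylinder"))
```

Here the query `cylinder` matches only because the word appears in the surrogate text, yet what the user receives is the display text, which does not contain it.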

Technical Aspects of the Digital Library of Mathematical Functions 1
Bruce R. Miller - Abdou Youssef
Translated by Bruce R Miller on 2002-12-17
Digital Library of Mathematical Functions