In traditional books and handbooks, and even in well-structured Web sites, an index and a table of contents are often adequate for finding information. In the DLMF Handbook and Web site, however, the contents include so many formulas and other mathematical constructs that an index and a table of contents alone fall far short. Instead, a special search system has to be provided so that users can quickly locate what they are looking for.
Furthermore, although text search and retrieval is a mature technology and many text search systems are available, math search presents new demands and issues that the text search community never had to face. To identify the major math search issues, it helps to consider the reasons why conventional search systems are inadequate for math search. We recognize three major reasons.
The first is that mathematical content often involves non-alphabetical
symbols that current search systems do not understand, or at least do not
interpret correctly. Terms like
Gamma(1/2), P_n(x), x**5, or d^2y/dx^2-x=0
are either meaningless to current systems or improperly read and processed by them.
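As a rough illustration of this first point, the sketch below runs the example terms through a deliberately simple tokenizer that keeps only alphanumeric runs. The function `naive_text_tokens` is a hypothetical stand-in for a generic text indexer, not the behavior of any particular search engine.

```python
import re

def naive_text_tokens(query: str) -> list[str]:
    """Lowercase the query and keep only alphanumeric runs.

    Hypothetical stand-in for a plain text indexer: every operator,
    parenthesis, underscore, and caret is silently dropped.
    """
    return re.findall(r"[a-z0-9]+", query.lower())

for term in ["Gamma(1/2)", "P_n(x)", "x**5", "d^2y/dx^2-x=0"]:
    print(term, "->", naive_text_tokens(term))
# Gamma(1/2) -> ['gamma', '1', '2']
# P_n(x) -> ['p', 'n', 'x']
# x**5 -> ['x', '5']
# d^2y/dx^2-x=0 -> ['d', '2y', 'dx', '2', 'x', '0']
```

Under this assumed tokenizer, the exponent in x**5 and the subscript in P_n(x) leave no trace in the index, so a query for x**5 could not be told apart from one for x5 or x+5.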
The second and more challenging reason is that formulas and equations,
as well as other mathematical constructs, have rich structures
that convey much meaning. Current search engines are not "aware" of those
structures, do not capture or index them, and are thus unable to search for
information that involves structural relationships and patterns.
To a current search system, a query like sin(x + log x) is no different
from sin x + log x. Similarly, x (y + z)
is misinterpreted as x y + z, if it is interpreted at all.
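The structural point can be sketched as follows. The helper `bag_of_words` is a hypothetical flat tokenizer, and `tree` borrows Python's own `ast` parser as a convenient stand-in for a math-aware parser, which means the queries have to be rewritten in Python syntax (for example, log x becomes log(x)). Neither is meant to describe how an actual math search system is built.

```python
import ast
import re

def bag_of_words(query: str) -> list[str]:
    """Sorted alphanumeric tokens; grouping and operators are discarded."""
    return sorted(re.findall(r"[a-z0-9]+", query.lower()))

def tree(expr: str) -> str:
    """Dump the parse tree of an expression written in Python syntax."""
    return ast.dump(ast.parse(expr, mode="eval"))

# Flat token view: each pair from the text is indistinguishable.
same_tokens_1 = bag_of_words("sin(x + log x)") == bag_of_words("sin x + log x")
same_tokens_2 = bag_of_words("x (y + z)") == bag_of_words("x y + z")

# Tree view: the nesting differs, so the pairs are told apart.
same_tree_1 = tree("sin(x + log(x))") == tree("sin(x) + log(x)")
same_tree_2 = tree("x * (y + z)") == tree("x * y + z")

print(same_tokens_1, same_tokens_2)  # True True
print(same_tree_1, same_tree_2)      # False False
```

The contrast suggests why an index built on flat tokens cannot answer structural queries: the information that separates the two expressions is discarded before indexing.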
The third and most challenging reason is that the many equivalent ways in
which mathematical terms can be expressed, which correspond to synonyms in
text search, are often much more complex than textual synonyms,
and thus cannot be fully captured in a thesaurus. A sum or a product
of two or more terms can be written in many equivalent ways because of the
commutativity and associativity laws. Numbers can be represented in multiple
forms (e.g., 1/2 vs. 0.5 vs. 2^{-1}). Polynomials can be expressed in many
factored and unfactored forms. Trigonometric terms can easily be replaced
by other, equivalent trigonometric terms. Indeed, it can be argued
that a large part of mathematics is about the different and equivalent ways
of expressing a concept or a quantity. Current search systems are
not equipped to recognize those equivalences and take them into account
when searching; in fact, the problem is not solvable in general.
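Two of the simpler equivalences mentioned above can be normalized mechanically, as the following minimal sketch shows. The functions `canonical_number` and `canonical_sum` are hypothetical helpers written for this illustration; they handle only the narrow cases named in their comments and do nothing for factored polynomials or trigonometric identities.

```python
from fractions import Fraction

def canonical_number(text: str) -> Fraction:
    """Map strings such as '1/2', '0.5', or '2^{-1}' to one exact rational.

    Assumes an integer exponent and a rational base; anything else
    raises ValueError.
    """
    if "^" in text:
        base, exponent = text.split("^", 1)
        return Fraction(base) ** int(exponent.strip("{}"))
    return Fraction(text)

def canonical_sum(text: str) -> tuple[str, ...]:
    """Sort the terms of a flat sum so that term order no longer matters.

    Assumes no parentheses and only '+' between terms.
    """
    return tuple(sorted(term.strip() for term in text.split("+")))

print(canonical_number("1/2"), canonical_number("0.5"), canonical_number("2^{-1}"))
# 1/2 1/2 1/2
print(canonical_sum("x + y") == canonical_sum("y + x"))  # True
```

Even this small normalizer needs a separate rule for each kind of equivalence, which hints at why a thesaurus-style list of synonyms does not scale to mathematical expressions.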
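Therefore, the major math search issues follow directly from these three reasons: recognizing mathematical symbols and notation, capturing and searching the structure of mathematical expressions, and accounting for the many equivalent forms in which an expression can be written.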
The next subsection will discuss some of the approaches that are being taken to address those issues.