Chapter 3 Architecture

As has been said, LaTeXML consists of two main programs: latexml, responsible for converting the TeX source into XML; and latexmlpost, responsible for converting to target formats. See Figure 3.1 for an illustration.

The casual user needs only a superficial understanding of the architecture. The programmer who wants to extend or customize LaTeXML will, however, need a fairly good understanding of the process and of the distinctions between text, Tokens, Boxes, Whatsits and XML, on the one hand, and Macros, Primitives and Constructors, on the other. In a way, the implementer of a LaTeXML binding for a LaTeX package may need a better understanding than when implementing for LaTeX, since they have to understand not only the TeX-view, primarily just the macros and the intended appearance, but also the LaTeXML-view, with XML and representation questions, as well.

Figure 3.1: Flow of data through LaTeXML's digestive tract.

The intention is that all of the semantics of the original document are preserved by latexml, or even inferred by parsing; latexmlpost is for formatting and conversion. Depending on your needs, the LaTeXML document resulting from latexml may be sufficient. Alternatively, you may want to enhance the document by applying third-party programs before postprocessing.
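
The two-stage flow can be sketched as a small driver script. This is only an illustration, not part of LaTeXML itself: the function names build_pipeline and run_pipeline and the file name paper.tex are hypothetical, and the sketch assumes that both commands are on the PATH and accept a --destination option naming the output file.

```python
import shutil
import subprocess
from pathlib import Path


def build_pipeline(source: str, target_ext: str = "html") -> list[list[str]]:
    """Return the command lines for the two stages (hypothetical helper).

    Stage 1: latexml converts the TeX source into an XML document.
    Stage 2: latexmlpost converts that XML into the target format.
    """
    stem = Path(source).with_suffix("")      # "paper.tex" -> "paper"
    xml_file = f"{stem}.xml"                 # intermediate document
    out_file = f"{stem}.{target_ext}"        # final target
    convert = ["latexml", f"--destination={xml_file}", source]
    postprocess = ["latexmlpost", f"--destination={out_file}", xml_file]
    return [convert, postprocess]


def run_pipeline(source: str, target_ext: str = "html") -> None:
    """Run both stages in order, stopping at the first failure."""
    for command in build_pipeline(source, target_ext):
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} not found on PATH")
        subprocess.run(command, check=True)
        # A third-party tool that enhances the intermediate XML could be
        # invoked here, after the first stage and before postprocessing.


if __name__ == "__main__":
    for command in build_pipeline("paper.tex"):
        print(" ".join(command))
```

Keeping the command construction separate from execution makes it possible to stop after the first stage when the intermediate document is all that is needed.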