4.1 LaTeXML Customization

§ 4.1.1 Expansion & Macros

DefMacro($prototype,$replacement,%options)

Macros are defined using DefMacro, as in the following (admittedly pointless) example:

  DefMacro('\mybold{}', '\textbf{#1}');

The two arguments to DefMacro are called the prototype and the replacement. In the prototype, the {} specifies a single normal TeX parameter. The replacement here is a string, which will be tokenized; the #1 will be replaced by the tokens of the argument. The entire result will presumably be expanded and/or processed further.

TeX normally declares parameters with #1, and LaTeX has developed a complex scheme in which it is often necessary to peek ahead token by token to recognize optional arguments. We have instead attempted to develop a suggestive, easier-to-use notation for parameters. Thus a prototype \foo{} specifies a single normal argument, whereas \foo[]{} takes an optional argument followed by a required one. More complex argument prototypes are described in the Package documentation (LaTeXML::Package). As in TeX, the macro's arguments are neither expanded nor digested until the expansion itself is further expanded or digested.
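The \foo[]{} prototype described above can be sketched in a complete binding file as follows. The macro name \note is hypothetical, and the sketch assumes that parameters are numbered in prototype order (so the optional argument is #1 and the required one is #2) and that an unsupplied optional argument is substituted as empty:

```perl
package LaTeXML::Package::Pool;
use strict;
use warnings;
use LaTeXML::Package;

# Hypothetical macro \note[label]{text}.
# [] declares an optional argument (#1); {} declares a required one (#2).
DefMacro('\note[]{}', '\textit{#1} \textbf{#2}');

1;
```

With this binding loaded, both \note{text} and \note[label]{text} should be accepted; no look-ahead code is needed to detect the bracketed argument.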

The macro's replacement can also be Perl code, typically an anonymous sub, which receives the current Gullet followed by the macro's arguments. It must return a list of Token objects, which will be used as the expansion of the macro. The following two examples show alternative ways of writing the above macro:

  DefMacro('\mybold{}', sub {
    my ($gullet, $arg) = @_;
    (T_CS('\textbf'), T_BEGIN, $arg, T_END); });

or alternatively

  DefMacro('\mybold{}', sub {
    Invocation(T_CS('\textbf'), $_[1]); });

Generally, the body of the macro should not involve side effects, assignments or other changes to state, other than reading Token objects from the Gullet; of course, the macro may expand into control sequences which do have side effects.
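As a minimal sketch of that distinction, the macro below changes no state itself; it only expands into a \def, and the assignment happens later, when the \def in the expansion is executed. The names \setauthorname and \theauthorname are hypothetical:

```perl
package LaTeXML::Package::Pool;
use strict;
use warnings;
use LaTeXML::Package;

# The macro body is side-effect free: it merely produces the tokens
#   \def\theauthorname{<argument>}
# The definition of \theauthorname takes effect only when those tokens
# are processed further.
DefMacro('\setauthorname{}', '\def\theauthorname{#1}');

1;
```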

Tokens, Catcodes and friends

Functions that are useful for dealing with Tokens and writing macros include the following:

  • Constants for the corresponding TeX catcodes:

       CC_ESCAPE, CC_BEGIN,  CC_END,     CC_MATH,
       CC_ALIGN,  CC_EOL,    CC_PARAM,   CC_SUPER,
       CC_SUB,    CC_IGNORE, CC_SPACE,   CC_LETTER,
       CC_OTHER,  CC_ACTIVE, CC_COMMENT, CC_INVALID

  • Constants for tokens with the appropriate content and catcode:

      T_BEGIN, T_END,   T_MATH,  T_ALIGN, T_PARAM,
      T_SUB,   T_SUPER, T_SPACE, T_CR

  • T_LETTER($char), T_OTHER($char) and T_ACTIVE($char) create tokens of the appropriate catcode with the given text content.

  • T_CS($cs) creates a control sequence token; the string $cs should typically begin with a backslash.

  • Token($string,$catcode) creates a token with the given content and catcode.

  • Tokens($token,...) creates a (LaTeXML::Core::)Tokens object containing the list of Tokens.

  • Tokenize($string) converts the string to a Tokens, using TeX’s standard catcode assignments.

  • TokenizeInternal($string) is like Tokenize, but treats @ as a letter.

  • Explode($string) converts the string to a Tokens in which letter characters are given catcode CC_OTHER.

  • Expand($tokens) expands $tokens (a Tokens), returning a Tokens; there should be no expandable tokens in the result.

  • Invocation($cstoken,$arg,...) returns a Tokens representing the sequence needed to invoke $cstoken on the given arguments (each a Tokens, or undef for an unsupplied optional argument).
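A few of the functions listed above can be tried out in a small binding file. This is only a sketch: all macro names are hypothetical, and it assumes that a Tokens object is acceptable as the value returned from a macro body, as in the Invocation example earlier in this section.

```perl
package LaTeXML::Package::Pool;
use strict;
use warnings;
use LaTeXML::Package;

# Hypothetical \pkgversion: Explode gives every character of the string,
# including the letter v, catcode CC_OTHER.
DefMacro('\pkgversion', sub { Explode('v1.0'); });

# Hypothetical \abc: three letter tokens assembled by hand into a Tokens.
DefMacro('\abc', sub {
    Tokens(T_LETTER('a'), T_LETTER('b'), T_LETTER('c')); });

# Hypothetical internal helper, plus a public macro that calls it.
# TokenizeInternal treats @ as a letter, so \my@helper is read as a
# single control sequence rather than \my followed by other tokens.
DefMacro('\my@helper', 'helper text');
DefMacro('\usehelper', sub { TokenizeInternal('\my@helper'); });

1;
```

Using Tokenize instead of TokenizeInternal in \usehelper would, under the standard catcodes, split the name at the @, which is the usual reason for reaching for the internal variant.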