Creating Class Attributes with XSLT

On Writing Extensible Stylesheets

Gerrit Imsieke [ˈɡɛʁɪt.ˈɪmziːkə] or [ɪmˈziːkə] (@gimsieke),
le-tex publishing services [ɛl.ˈeː.tɛç] (@letexml)

https://subversion.le-tex.de/common/presentations/2020-07_Balisage_XSLT_Class-Attributes/slides/index.html

What is special about class attributes?

  • HTML rendering of XML: Customizable by CSS or XSLT
  • CSS customization easier, but XPath/XSLT more powerful for selecting & transforming content
  • Letting XSLT generate additional class attributes/tokens can increase CSS’s mileage

How can you write extensible (= customizable) stylesheets?

  • How do you as a stylesheet author provide customization hooks so that people who import your stylesheet can tweak the class attribute generation, without replicating too much of your code?
  • Where else can these fine-grained customization hooks be useful?
  • How well do popular XML vocabularies & their default HTML renderers perform in this regard (DocBook XSLT 2.0, TEI XSL Stylesheets, JATS Preview Stylesheets)? ⇒ Paper
  • What about DocBook xslTNG?

Note: “Customization” in this presentation is understood as the idiomatic xsl:import/override approach, not so much as “using predefined customization parameters”

Simple DocBook Example

plus ad-hoc XSLT

Typical “modified identity template” approach

<para role="foo">Para</para>
<xsl:template match="db:para | db:simpara">
  <p>
    <xsl:apply-templates select="@* | node()"/>
  </p>
</xsl:template>
<xsl:template match="@role">
  <xsl:attribute name="class" select="."/>
</xsl:template>

<p class="foo">Para</p>

Requirement 1: Add the source element name to the @class attribute tokens

What if we want the source element name as a class token?

1. <para role="foo">Para</para>
2. <para>Para</para>
<xsl:template match="@role">
  <xsl:attribute name="class" separator=" "
                 select="local-name(..), ." />
</xsl:template>

1. <p class="para foo">Para</p>
2. <p>Para</p> ⚡

Possible Solution

If you always want the local-name() to appear, hard-wire xsl:attribute:

<xsl:template match="db:para | db:simpara">
  <p>
    <xsl:attribute name="class" separator=" ">
      <xsl:sequence select="local-name()"/>
      <xsl:apply-templates select="@role"/>
    </xsl:attribute>
    <xsl:apply-templates select="@* except @role | node()"/>
  </p>
</xsl:template>

1. <p class="para foo">Para</p>
2. <p class="para">Para</p>

Requirement 2: Consider the @condition attribute

Process @condition like @role, merge the resulting classes

<xsl:template match="db:para | db:simpara">
  <p>
    <xsl:variable name="transformed-atts" as="attribute(class)*">
      <xsl:apply-templates select="@role, @condition"/>
    </xsl:variable>
    <xsl:if test="exists($transformed-atts)">
      <xsl:attribute name="class" separator=" ">
        <xsl:sequence select="$transformed-atts"/>
      </xsl:attribute>
    </xsl:if>
    <xsl:apply-templates 
      select="@* except (@role | @condition) | node()"/>
  </p>
</xsl:template>
<xsl:template match="@role | @condition">
  <xsl:attribute name="class" select="." separator=" "/>
</xsl:template>

Results

1. <para role="foo">Para</para>
2. <para>Para</para>
3. <para role="foo" condition="web hidden">Para</para>
4. <para condition="web">Para</para>

1. <p class="foo">Para</p>
2. <p>Para</p>
3. <p class="foo web hidden">Para</p>
4. <p class="web">Para</p>

A bit inefficient, since the individually created class attributes are atomized to strings before they become part of a compound class attribute again.

Also, duplicate tokens are possible.

Requirement 3: Add local name, but only for simpara;
consider @condition, but filter out the token 'web'

Like this:

1. <para role="foo">Para</para>
2. <para>Para</para>
3. <para role="foo" condition="web hidden">Para</para>
4. <para condition="web">Para</para>
5. <simpara>Simpara</simpara>

1. <p class="foo">Para</p>
2. <p>Para</p>
3. <p class="foo hidden">Para</p>
4. <p>Para</p>
5. <p class="simpara">Simpara</p>
<xsl:template match="@role" as="xs:string*">
  <xsl:sequence select="tokenize(.)"/>
</xsl:template>
<xsl:template match="@condition" as="xs:string*">
  <xsl:sequence select="tokenize(.)[not(. = 'web')]"/>
</xsl:template>
<xsl:template match="db:para | db:simpara">
  <p>
    <xsl:variable name="transformed-atts" as="xs:string*">
      <xsl:apply-templates select="@role, @condition"/>
      <xsl:sequence select="self::db:simpara/local-name()"/>
    </xsl:variable>
    <xsl:if test="exists($transformed-atts)">
      <xsl:attribute name="class" separator=" ">
        <xsl:sequence select="$transformed-atts"/>
      </xsl:attribute>
    </xsl:if>
    <xsl:apply-templates 
      select="@* except (@role | @condition) | node()"/>
  </p>
</xsl:template>

It’s getting complex!

How the DocBook, TEI, and JATS renderers typically address this complexity

  • Encapsulate the attribute generation in functions (DocBook, TEI).
    • Apart from the context element that is the first argument, the functions have arguments that let you add or remove class tokens.
  • Generate defaults for certain contexts:
    • <gloss><span class="gloss">… for TEI
    • <simpara><p class="simpara">… for DocBook xslTNG
    • <p class="first">… for JATS p and license-p elements without predecessor

However: If you use the function arguments for adding classes (DocBook, TEI) or transform p/@content-type to a @class attribute (JATS), you will overwrite these defaults. ⚡

How to manipulate class attributes in the popular renderers

  • Redefine the functions.
    • You may end up with many context-dependent <xsl:when>s.
    • Plus, you need to replicate the original function body.
  • Add the same change (for example, add the element name) to potentially many matching templates in default mode.
  • Fork shared templates so that they match only <simpara>, for example.
    • Don’t forget there may also be significant complexity in the node processing, not just in the attribute processing
    • You need to replicate much of this node processing logic, too,
    • unless you use <xsl:next-match> to invoke a common node-processing template.

Q: Can’t we use <xsl:next-match> for the attributes, too, at least for the class attribute?
A: Yes, if we transform the element in a different mode that is for attribute creation only.

Introducing the class-att mode

Refactoring: Introduce a new mode, class-att

<xsl:template match="db:para | db:simpara">
  <p>
    <xsl:apply-templates select="." mode="class-att"/>
    <xsl:apply-templates select="@* | node()"/>
  </p>
</xsl:template>

<xsl:template match="@role | @condition"/>

<xsl:template match="*" mode="class-att" as="attribute(class)?">
  <xsl:call-template name="make-class">
    <xsl:with-param name="tokens" as="xs:string*">
      <xsl:apply-templates select="@role, @condition" mode="#current"/>
    </xsl:with-param>
  </xsl:call-template>
</xsl:template>

class-att mode will make class attributes from elements and tokens from attributes (default: tokenize them)

<xsl:template match="@*" mode="class-att" as="xs:string*">
  <xsl:sequence select="tokenize(.)"/>
</xsl:template>

Auxiliary named template make-class:

<xsl:template name="make-class" as="attribute(class)?">
  <xsl:param name="tokens" as="xs:string*"/>
  <xsl:if test="exists($tokens[normalize-space()])">
    <xsl:attribute name="class" separator=" "
      select="distinct-values($tokens[normalize-space()])"/>
  </xsl:if>
</xsl:template>
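
For reference, the fragments above can be assembled into one runnable stylesheet. This is only a minimal sketch under stated assumptions: the file name base.xsl is made up, the input is assumed to be in the DocBook namespace (the samples on these slides omit the namespace for brevity), and the attribute template is typed xs:string* so that an empty attribute value simply yields no tokens.

```xml
<!-- base.xsl (hypothetical file name): the class-att fragments in one place -->
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:db="http://docbook.org/ns/docbook"
  exclude-result-prefixes="xs db">

  <!-- Element template: the class attribute comes from class-att mode only -->
  <xsl:template match="db:para | db:simpara">
    <p>
      <xsl:apply-templates select="." mode="class-att"/>
      <xsl:apply-templates select="@* | node()"/>
    </p>
  </xsl:template>

  <!-- In default mode, @role and @condition produce nothing -->
  <xsl:template match="@role | @condition"/>

  <!-- Elements in class-att mode: collect tokens, emit at most one class attribute -->
  <xsl:template match="*" mode="class-att" as="attribute(class)?">
    <xsl:call-template name="make-class">
      <xsl:with-param name="tokens" as="xs:string*">
        <xsl:apply-templates select="@role, @condition" mode="#current"/>
      </xsl:with-param>
    </xsl:call-template>
  </xsl:template>

  <!-- Attributes in class-att mode: whitespace-separated tokens -->
  <xsl:template match="@*" mode="class-att" as="xs:string*">
    <xsl:sequence select="tokenize(.)"/>
  </xsl:template>

  <!-- Deduplicate the tokens; create no attribute if none are left -->
  <xsl:template name="make-class" as="attribute(class)?">
    <xsl:param name="tokens" as="xs:string*"/>
    <xsl:if test="exists($tokens[normalize-space()])">
      <xsl:attribute name="class" separator=" "
        select="distinct-values($tokens[normalize-space()])"/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Attributes other than @role and @condition would still fall through to the built-in rules in default mode; a real stylesheet needs templates for them.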

Fine-grained, specific customizing in importing stylesheets

Add the source element name only for simpara:

<xsl:template match="db:simpara" mode="class-att" as="attribute(class)">
  <xsl:attribute name="class" separator=" ">
    <xsl:sequence select="local-name()"/>
    <xsl:next-match/>
  </xsl:attribute>
</xsl:template>

Remove 'web' for @condition:

<xsl:template match="@condition" mode="class-att" as="xs:string*">
  <xsl:variable name="next-match" as="xs:string*">
    <xsl:next-match/>
  </xsl:variable>
  <xsl:sequence select="$next-match[not(. = 'web')]"/>
</xsl:template>

Apart from the mode name and the expected output types, no knowledge about the inner workings of the imported stylesheet required

Results

1. <para role="foo">Para</para>
2. <para>Para</para>
3. <para role="foo" condition="web hidden">Para</para>
4. <para condition="web">Para</para>
5. <simpara>Simpara</simpara>

1. <p class="foo">Para</p>
2. <p>Para</p>
3. <p class="foo hidden">Para</p>
4. <p>Para</p>
5. <p class="simpara">Simpara</p>

Same as the Requirement 3 results, as it should be.

Possible Pitfall

  • Scenario: The basic stylesheet uses a class-att approach
  • Someone naively uses a classic modified identity template on the first customization level
  • On the next customization level, you want to eliminate a class token for a certain context
  • But it won’t disappear because it will be created by transforming @role to @class in default mode

⇒ Document the chosen approach in the basic stylesheet, and warn people never to create class attributes in default mode.
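
The pitfall can be reproduced with two small customization layers. This is a sketch with assumed names: base.xsl stands for a file that holds the class-att templates shown earlier, and level1.xsl and level2.xsl are hypothetical file names for the two layers.

```xml
<!-- level1.xsl (hypothetical): the naive first customization level -->
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:import href="base.xsl"/>

  <!-- Classic modified identity approach: creates @class in default mode,
       bypassing class-att mode -->
  <xsl:template match="@role">
    <xsl:attribute name="class" select="."/>
  </xsl:template>

</xsl:stylesheet>
```

```xml
<!-- level2.xsl (hypothetical): tries to drop the token 'foo' for para -->
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:db="http://docbook.org/ns/docbook"
  exclude-result-prefixes="xs db">

  <xsl:import href="level1.xsl"/>

  <!-- Works as intended inside class-att mode ... -->
  <xsl:template match="db:para/@role" mode="class-att" as="xs:string*">
    <xsl:variable name="next-match" as="xs:string*">
      <xsl:next-match/>
    </xsl:variable>
    <xsl:sequence select="$next-match[not(. = 'foo')]"/>
  </xsl:template>
  <!-- ... but level1.xsl still emits class="foo" in default mode afterwards,
       so the token survives in the output -->

</xsl:stylesheet>
```

Because a later attribute with the same name replaces an earlier one during element construction, the default-mode class attribute from level1.xsl also discards whatever class-att mode produced, including any @condition tokens.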

Other fine-grained hooks

We need not stop at class attributes

  • In the same breakout mode, you can attach IDs to elements that don’t have IDs.
  • In the same mode, you can add other attributes, such as style attributes.
    • DocBook xslTNG uses m:attributes as a breakout mode in which to create all attributes
  • You can also use a distinct mode for any of these attributes.
  • You can refactor monolithic functions that contain many context-dependent case switches: Invoke an xsl:apply-templates that transforms the context item in a specific mode.
    • Example: refactor 140-line tei:isInline() function that contains 128 xsl:when statements
  • You can map element names using xsl:apply-templates in a specific mode
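
As an illustration of the first bullet, an importing stylesheet might attach generated IDs in the same breakout mode. This is a sketch under assumptions: base.xsl is a hypothetical file holding the class-att templates shown earlier, and restricting the rule to db:para is an arbitrary choice.

```xml
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:db="http://docbook.org/ns/docbook"
  exclude-result-prefixes="db">

  <!-- base.xsl: assumed to contain the class-att templates -->
  <xsl:import href="base.xsl"/>

  <!-- Paragraphs without xml:id: keep whatever the imported templates
       produce, then add a generated id attribute -->
  <xsl:template match="db:para[not(@xml:id)]" mode="class-att"
                as="attribute()*">
    <xsl:next-match/>
    <xsl:attribute name="id" select="generate-id()"/>
  </xsl:template>

</xsl:stylesheet>
```

Once the mode returns more than a class attribute, overrides that wrap xsl:next-match inside xsl:attribute need care, because every attribute they receive is atomized into the class value.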

Map element names in a distinct mode

People try to avoid repeating similar modified identity templates for each source element, like this:

<xsl:template match="*">
  <xsl:element name="{db:new-name(.)}">
    <xsl:apply-templates select="." mode="class-att"/>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:element>
</xsl:template>

with a mapping function like this:

<xsl:function name="db:new-name" as="xs:string">
  <xsl:param name="elt" as="element(*)"/>
  <xsl:choose>
    <xsl:when test="local-name($elt) = ('para', 'simpara')">
      <xsl:sequence select="'p'"/>
    </xsl:when>
    <xsl:when test="local-name($elt) = ('emphasis') and $elt/@role = 'bold'">
      <xsl:sequence select="'b'"/>
    </xsl:when>
    <xsl:when test="local-name($elt) = ('emphasis')">
      <xsl:sequence select="'i'"/>
    </xsl:when>
    <!-- … -->
    <xsl:otherwise>
      <xsl:sequence select="'div'"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

Problem: If you want to tweak the mapping, you’ll have to overwrite the whole function.

Solution: Refactor it, using a dedicated db:new-name mode:

<xsl:function name="db:new-name" as="xs:string">
  <xsl:param name="elt" as="element(*)"/>
  <xsl:apply-templates select="$elt" mode="db:new-name"/>
</xsl:function>

with these string-producing templates in db:new-name mode:

Mapping element names

<xsl:template match="*" mode="db:new-name" as="xs:string">
  <xsl:sequence select="'div'"/>
</xsl:template>

<xsl:template match="db:para | db:simpara" mode="db:new-name" as="xs:string">
  <xsl:sequence select="'p'"/>
</xsl:template>

<xsl:template match="db:emphasis[@role = 'bold']" mode="db:new-name" as="xs:string">
  <xsl:sequence select="'b'"/>
</xsl:template>

<xsl:template match="db:emphasis" mode="db:new-name" as="xs:string">
  <xsl:sequence select="'i'"/>
</xsl:template>

Customize the element name mapping in an importing stylesheet

<xsl:template match="db:emphasis[@role = 'bold']"
              mode="db:new-name" as="xs:string">
  <xsl:sequence select="'strong'"/>
</xsl:template>

<xsl:template match="db:emphasis" 
              mode="db:new-name" as="xs:string">
  <xsl:sequence select="'em'"/>
</xsl:template>

Result

1. <para role="foo">Para</para>
2. <para><emphasis role="bold">Para</emphasis></para>
3. <para role="foo" condition="web hidden">Para</para>
4. <para condition="web">Para</para>
5. <simpara><emphasis>Sim</emphasis>para</simpara>

1. <p class="foo">Para</p>
2. <p><strong class="bold">Para</strong></p>
3. <p class="foo hidden">Para</p>
4. <p>Para</p>
5. <p class="simpara"><em>Sim</em>para</p>

<strong class="bold">???

Suppress default class-att mapping for @role='bold'

<xsl:template match="db:emphasis/@role[. = 'bold']" 
              mode="class-att" as="xs:string?"/>

1. <p class="foo">Para</p>
2. <p><strong>Para</strong></p>
3. <p class="foo hidden">Para</p>
4. <p>Para</p>
5. <p class="simpara"><em>Sim</em>para</p>

DocBook xslTNG

Create all attributes at once in a distinct mode (by processing the element)

That’s what DocBook xslTNG does (in contrast to the other renderers in the paper)

<xsl:template match="db:para|db:simpara">
  <p>
    <xsl:apply-templates select="." mode="m:attributes"/>
    <xsl:apply-templates/>
  </p>
</xsl:template>
<xsl:template match="*" mode="m:attributes" as="attribute()*">
  <xsl:variable name="attr" as="attribute()*">
    <xsl:apply-templates select="@*"/>
    <xsl:sequence select="f:chunk(.)"/>
  </xsl:variable>
  <xsl:sequence select="f:attributes(., $attr)"/>
</xsl:template>

Not perfect though

  • DocBook xslTNG will create a 'simpara' token if no extra class is given, but this token disappears as soon as any other token is produced.

<xsl:function name="f:attributes" as="attribute()*">
  <xsl:param name="node" as="element()"/>
  <xsl:param name="attributes" as="attribute()*"/>
  <xsl:sequence select="f:attributes($node, $attributes, local-name($node), ())"/>
</xsl:function>
<xsl:function name="f:attributes" as="attribute()*">
  <xsl:param name="node" as="element()"/>
  <xsl:param name="attributes" as="attribute()*"/>
  <xsl:param name="extra-classes" as="xs:string*"/>
  <xsl:param name="exclude-classes" as="xs:string*"/>
  <!-- … -->
</xsl:function>

Problem: Not all kool-aid was consumed. For some contexts, f:attributes() creates defaults, for others it won’t. Overriding this may result in the known redundancies, unless…

A small auxiliary template makes it “next-match friendly”

In principle, you can just transform @role, @condition, …, in default mode, and the resulting class tokens will be merged. Side effect: 'simpara' token disappears.

Solution: A db:modify-class template or function

<xsl:template match="db:para | db:simpara" mode="m:attributes">
  <xsl:call-template name="db:modify-class">
    <xsl:with-param name="atts" as="attribute()*">
      <xsl:next-match/>
    </xsl:with-param>
    <xsl:with-param name="add-tokens" select="tokenize(@condition)"/>
    <xsl:with-param name="remove-tokens" select="'web'"/>
  </xsl:call-template>
</xsl:template>

This minimally invasive, black-box approach isn’t possible in DocBook XSLT 2.0 or TEI XSL.

db:modify-class template

<xsl:template name="db:modify-class" as="attribute()*">
  <xsl:param name="atts" as="attribute()*"/>
  <xsl:param name="add-tokens" as="xs:string*"/>
  <xsl:param name="remove-tokens" as="xs:string*"/>
  <xsl:variable name="existing-tokens" as="xs:string*"
    select="$atts[name() = 'class'] ! tokenize(.)" />
  <xsl:variable name="new-tokens" as="xs:string*" 
    select="distinct-values(($existing-tokens, $add-tokens)
                              [not(. = $remove-tokens)])"/>
  <xsl:sequence select="$atts[not(name() = 'class')]"/>
  <xsl:if test="exists($new-tokens)">
    <xsl:attribute name="class" select="$new-tokens" separator=" "/>
  </xsl:if>
</xsl:template>

Alternative: f:modify-class() function

<xsl:function name="f:modify-class" as="attribute()*">
  <xsl:param name="atts" as="attribute()*"/>
  <xsl:param name="add-tokens" as="xs:string*"/>
  <xsl:param name="remove-tokens" as="xs:string*"/>
  <xsl:variable name="existing-tokens" as="xs:string*"
    select="$atts[name() = 'class'] ! tokenize(.)" />
  <xsl:variable name="new-tokens" as="xs:string*" 
    select="distinct-values(($existing-tokens, $add-tokens)
                              [not(. = $remove-tokens)])"/>
  <xsl:sequence select="$atts[not(name() = 'class')]"/>
  <xsl:if test="exists($new-tokens)">
    <xsl:attribute name="class" select="$new-tokens" separator=" "/>
  </xsl:if>
</xsl:function>

Unless meant to be used in XPath expressions, a named template should be preferred because it can accept tunnel parameters.
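
To illustrate the tunnel parameter argument, here is a sketch in which a distant ancestor template influences the class tokens without any intermediate template passing the value along. The parameter name suppress-tokens and the db:footnote context are assumptions made for this example; they are not part of DocBook xslTNG.

```xml
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:db="http://docbook.org/ns/docbook"
  exclude-result-prefixes="xs db">

  <!-- Hypothetical rule: inside footnotes, the token 'web' is unwanted -->
  <xsl:template match="db:footnote">
    <div class="footnote">
      <xsl:apply-templates>
        <xsl:with-param name="suppress-tokens" select="'web'" tunnel="yes"/>
      </xsl:apply-templates>
    </div>
  </xsl:template>

  <!-- Knows nothing about suppress-tokens; the tunnel parameter passes
       through xsl:call-template automatically -->
  <xsl:template match="db:para | db:simpara">
    <p>
      <xsl:call-template name="db:modify-class">
        <xsl:with-param name="add-tokens"
          select="tokenize(@role), tokenize(@condition)"/>
      </xsl:call-template>
      <xsl:apply-templates/>
    </p>
  </xsl:template>

  <xsl:template name="db:modify-class" as="attribute()*">
    <xsl:param name="atts" as="attribute()*"/>
    <xsl:param name="add-tokens" as="xs:string*"/>
    <xsl:param name="remove-tokens" as="xs:string*"/>
    <!-- Assumed extra parameter; empty unless an ancestor template set it -->
    <xsl:param name="suppress-tokens" as="xs:string*" tunnel="yes"/>
    <xsl:variable name="existing-tokens" as="xs:string*"
      select="$atts[name() = 'class'] ! tokenize(.)"/>
    <xsl:variable name="new-tokens" as="xs:string*"
      select="distinct-values(($existing-tokens, $add-tokens)
                [not(. = ($remove-tokens, $suppress-tokens))])"/>
    <xsl:sequence select="$atts[not(name() = 'class')]"/>
    <xsl:if test="exists($new-tokens)">
      <xsl:attribute name="class" select="$new-tokens" separator=" "/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

A stylesheet function cannot take part in this: tunnel parameters are not available inside a function body, so the value would have to be handed over as an explicit argument at every call site.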

A name for this software design pattern?

mhhm

<xsl:function name="db:new-name" as="xs:string">
  <xsl:param name="elt" as="element(*)"/>
  <xsl:apply-templates select="$elt" mode="db:new-name"/>
</xsl:function>

The best of both worlds:

  • A function that can be called in XPath expressions
  • whose functionality can be overridden in a fine-grained way using matching templates

I call the xsl:apply-templates above a “mode hook” and mode="db:new-name" a “hook mode.”

Taken together, it’s the mhhm approach.

Thank you!

Questions?

DocBook XSLT 2.0, DocBook xslTNG, TEI XSL, JATS Preview Stylesheets – How extensible are they?

  • All these renderers turn out to be not perfectly extensible by default
  • but can be tweaked to offer more fine-grained customization hooks.
  • For example, tei:isInline(), a function that accepts an element as its argument and decides in 128 xsl:when branches whether that element is inline, can be refactored so that people can more easily configure in which contexts a TEI note is inline.
  • DocBook xslTNG and the JATS Preview Stylesheets are more override-friendly, and thereby more maintenance-friendly, than DocBook XSLT 2.0 and the TEI XSL Stylesheets.
  • Read the paper for examples and suggested tweaks.
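
A sketch of what the tei:isInline() refactoring could look like with a hook mode. Everything here is illustrative: the function signature, the mode name tei:isInline, and the three rules are assumptions, not the actual logic of the TEI XSL Stylesheets.

```xml
<xsl:stylesheet version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="xs tei">

  <!-- The function stays callable from XPath; the decision moves to templates -->
  <xsl:function name="tei:isInline" as="xs:boolean">
    <xsl:param name="element" as="element()"/>
    <xsl:apply-templates select="$element" mode="tei:isInline"/>
  </xsl:function>

  <!-- Illustrative fallback: unknown elements count as block-level -->
  <xsl:template match="*" mode="tei:isInline" as="xs:boolean">
    <xsl:sequence select="false()"/>
  </xsl:template>

  <!-- Illustrative rule: phrase-level highlighting is inline -->
  <xsl:template match="tei:hi | tei:emph" mode="tei:isInline" as="xs:boolean">
    <xsl:sequence select="true()"/>
  </xsl:template>

  <!-- Illustrative rule: a note is inline unless it contains paragraphs -->
  <xsl:template match="tei:note" mode="tei:isInline" as="xs:boolean">
    <xsl:sequence select="empty(tei:p)"/>
  </xsl:template>

</xsl:stylesheet>
```

An importing stylesheet could then add a single template in this mode, for example one matching tei:note[@place = 'margin'], instead of overriding the whole function.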