Bidirectional XML catalog resolution with Saxon CE

This page translates the URL of a file in our svn repository into the abstract, immutable URL that you may use for accessing the file from a transpect pipeline (for example, when importing an XProc or an XSLT file). This abstract URL is called the file’s canonical URL.

In the other direction, it will serve as a catalog resolver that uses the transpect module catalog to give you the location of a file that is imported by its canonical URL.

One use case: if a transpect module A depends on another module B, this dependency is declared in a <p:pipeinfo> element within A’s pipelines by referring to B’s canonical base URL. The catalog resolver then translates the canonical URL into the svn repository URL that may be used for specifying the svn:external location. (In addition, it creates a <nextCatalog> entry in the project’s central catalog file. This entry points to the locally checked-out catalog of module B.)

The motivations for these indirections are:

The Form

Repository URL:


Canonical URL:

(try for example http://transpect.le-tex.de/xslt-util/xslt-based-catalog-resolver/resolve-uri-by-catalog.xsl)

Please note that if a URL cannot be resolved (or, in the other direction, cannot be reverse-resolved), the original URL is returned unchanged.
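The fallback behavior can be illustrated with a minimal Python sketch. It is not the XSLT used on this page; the prefix pairs and function names are hypothetical, and it only assumes that the mapping is a set of prefix rewrites where the longest matching prefix wins (as the XML Catalogs standard specifies for rewriteURI).

```python
# Hypothetical prefix pairs: (canonical prefix, repository prefix).
# These URLs are placeholders, not real transpect locations.
REWRITES = [
    ("http://example.org/canonical/xslt-util/",
     "https://svn.example.org/repo/xslt-util/trunk/"),
    ("http://example.org/canonical/",
     "https://svn.example.org/repo/"),
]


def _rewrite(url, pairs):
    """Replace the longest matching source prefix; fall back to the input URL."""
    best = None
    for src, dst in pairs:
        if url.startswith(src) and (best is None or len(src) > len(best[0])):
            best = (src, dst)
    if best is None:
        return url  # nothing matched: the original URL is used
    return best[1] + url[len(best[0]):]


def resolve(canonical_url):
    """Canonical URL -> repository URL."""
    return _rewrite(canonical_url, REWRITES)


def reverse_resolve(repository_url):
    """Repository URL -> canonical URL, using the same pairs swapped."""
    return _rewrite(repository_url, [(dst, src) for src, dst in REWRITES])
```

Calling resolve() and then reverse_resolve() on the result returns the starting URL whenever a prefix matched; a URL that matches no prefix passes through both functions untouched.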

You may specify a different catalog location URL in the query string.

Please note that this page, the invoked XSLT, and the catalogs must be served from the same host, because the browser’s same-origin policy applies. Serving the catalogs with suitable CORS headers would probably lift this restriction.

The Code

Apart from a thin Saxon CE adaptation layer, it is the same XSLT stylesheet that is used for catalog resolution in transpect pipelines.


The XSLT-based catalog resolver was originally developed to overcome a limitation of Saxon’s default behavior: we tried to exploit the recursive wildcard search of collection() URLs, but discovered that it only works for file: URLs, while the URLs in our code were abstract http: URLs. This would be fine if Saxon passed the URLs to the catalog resolver before deciding whether recursive wildcard search is feasible. (After resolution it would be feasible, because the URLs are file: URLs by then.)

We might have written our own URI resolver in Java, but this would make deployment more difficult, and it would probably require that we run commercial versions of Saxon everywhere, which would be too high a hurdle for the adoption of transpect as an open-source, (almost) ready-to-run framework.

So we wrote this XSLT-based resolver. It doesn’t implement the whole catalog standard, only the elements that are most important to our pipelines, namely uri, rewriteURI, and nextCatalog.
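To make the roles of these three elements concrete, here is a rough Python sketch of how a resolver restricted to uri, rewriteURI, and nextCatalog might proceed. It is an illustration under simplifying assumptions, not a port of the XSLT: the catalogs are hypothetical in-memory strings keyed by a made-up location, and relative URI resolution (xml:base) is ignored.

```python
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalogs, in ElementTree's {uri}local notation.
NS = "{urn:oasis:names:tc:entity:xmlns:xml:catalog}"

# Hypothetical catalogs; a real setup would load them from files or URLs.
CATALOGS = {
    "project/catalog.xml": """
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://example.org/canonical/fonts/list.xml"
       uri="file:/work/fonts/list.xml"/>
  <nextCatalog catalog="module-b/catalog.xml"/>
</catalog>""",
    "module-b/catalog.xml": """
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <rewriteURI uriStartString="http://example.org/canonical/module-b/"
              rewritePrefix="file:/work/module-b/"/>
</catalog>""",
}


def resolve_by_catalog(url, catalog_location, seen=None):
    """Return the resolved URL, or None if no entry in the catalog chain matches."""
    seen = set() if seen is None else seen
    if catalog_location in seen or catalog_location not in CATALOGS:
        return None  # guard against nextCatalog cycles and missing catalogs
    seen.add(catalog_location)
    root = ET.fromstring(CATALOGS[catalog_location])

    # 1. uri: exact match on the name attribute.
    for entry in root.iter(NS + "uri"):
        if entry.get("name") == url:
            return entry.get("uri")

    # 2. rewriteURI: the longest matching uriStartString wins.
    matches = [e for e in root.iter(NS + "rewriteURI")
               if url.startswith(e.get("uriStartString"))]
    if matches:
        best = max(matches, key=lambda e: len(e.get("uriStartString")))
        return best.get("rewritePrefix") + url[len(best.get("uriStartString")):]

    # 3. nextCatalog: delegate to the referenced catalogs in document order.
    for nxt in root.iter(NS + "nextCatalog"):
        result = resolve_by_catalog(url, nxt.get("catalog"), seen)
        if result is not None:
            return result
    return None
```

A caller that wants the behavior described for the form above would fall back to the original URL when the function returns None.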

We usually rely on the standard catalog resolvers (Apache XML Commons Resolver or Norman Walsh’s). We use the XSLT-based resolver only for collections with wildcard http: URLs, for reverse resolution, and when we just need the local name for a resource without actually retrieving the resource, for example when we prepare the packing list for an EPUB: fonts in our font library are referenced by canonical URL but must be included from a working copy of their repository location.