This documentation describes a Docker container that provides a service for converting OMML and MathML formulas to images. The image formats currently supported are EPS, PDF, PNG, and JPEG. If needed, support for SVG and TIFF can be added.
Inside the container, BaseX provides an HTTP API that accepts OMML input, transforms it to LaTeX using Saxon, and then uses system calls to invoke latex, dvips, and possibly other commands in order to generate the required image formats.
The code is pulled from several publicly accessible repositories.
https://subversion.le-tex.de/common/math-renderer
You need to check it out with an svn client in order to get the externals automatically. The project’s svn externals are:
https://subversion.le-tex.de/common/basex (public, contains BaseX 9.4.6 and Saxon 10.3)
XSLT transformations that run in BaseX, using Saxon, are able to use this project’s XML Catalog and the transpect libraries that are attached as externals.
https://github.com/transpect/latex-math-images/, forked from https://gitlab.le-tex.de/lupino/formelbildgen/ (limited access, le-tex VPN necessary)
A directory with customer-specific styles may be copied into the latex-math-renderer/styles directory by a modified Dockerfile. See below how the name of this directory corresponds to a path component of the endpoints.
https://pkgs.alpinelinux.org/packages
For a list of Alpine Linux packages used, see the Dockerfile
From the main directory of the svn repo’s working copy:
docker build --tag=math-renderer:latest .
On Apple silicon (M1 or M2), you need to add --platform linux/amd64:
docker build --platform linux/amd64 --tag=math-renderer:latest .
We might eventually provide an Arm64-native container that should run much faster on Apple silicon.
Mostly due to the LaTeX-related packages, the image is approx. 1.2 GB in size.
If you receive a pre-built image backup as .tar.gz, you need to load it using docker load.
The image will eventually be made available from a container registry at le-tex.
docker run -d -p 127.0.0.1:9080:8984/tcp --name math-renderer -it math-renderer:latest
On Apple silicon:
docker run --platform linux/amd64 -d -p 127.0.0.1:9080:8984/tcp --name math-renderer -it math-renderer:latest
This will map the default BaseX HTTP port (8984) inside the container to 9080 on localhost, where Docker runs.
The container does not run with root privileges.
Before building the container afresh, it needs to be stopped and removed:
docker stop math-renderer
docker rm math-renderer
The API currently provides three endpoints, /eqimg/[customization]/render-omml, /eqimg/[customization]/render-mml, and /eqimg/[customization]/retrieve/[image].
[customization] is a placeholder for an arbitrary string, for example, 'default'. It must match a directory name below latex-math-images/styles. Without any customization, this directory only contains a directory called 'default'.
/eqimg/[customization]/render-omml accepts a query parameter, format. Its values may be eps (the default), png, or jpg. Further options to support SVG, PDF, or TIFF may be included. It is also conceivable to request multiple formats at once. The JSON results below will then contain multiple entries. The individual results will then be created on the fly when calling /eqimg/[customization]/retrieve/[image].
Additional parameters are downscale (an integer, defaults to 2), which tells Ghostscript how much to downscale the 1200 dpi pixel output for the PNG and JPEG formats; tex=true|false, whether the LaTeX code should be included in the JSON result, with " quoted as \"; and mml=true|false, whether the generated MathML should be inserted into the EPS as a comment (this option has no effect for formats other than eps).
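The effect of downscale on the PNG and JPEG output can be estimated with a little arithmetic. The sketch below assumes that the factor divides the 1200 dpi resolution linearly and that the *-pt values in the JSON results are TeX points (72.27 per inch). Both are assumptions, and the function names are hypothetical, so treat the pixel figures as rough estimates only.

```python
BASE_DPI = 1200          # resolution of the pixel output before downscaling
TEX_PT_PER_INCH = 72.27  # assumption: the *-pt values are TeX points


def effective_dpi(downscale: int = 2) -> float:
    """Resolution that remains after downscaling the 1200 dpi output."""
    if downscale < 1:
        raise ValueError("downscale must be a positive integer")
    return BASE_DPI / downscale


def estimated_pixels(length_pt: float, downscale: int = 2) -> int:
    """Rough pixel length of a dimension given in points (estimate only)."""
    return round(length_pt / TEX_PT_PER_INCH * effective_dpi(downscale))


if __name__ == "__main__":
    print(effective_dpi())                 # default factor of 2
    print(effective_dpi(4))                # factor used in the curl examples
    print(estimated_pixels(71.50331, 4))   # a width in points, purely illustrative
```

Under these assumptions, the default factor of 2 corresponds to 600 dpi and a factor of 4 to 300 dpi.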
/eqimg/[customization]/render-mml accepts MathML as input. The query parameters are the same, except that mml is called include=true|false.
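For scripted use, the endpoint paths and query parameters described above can be assembled by a small helper. This is a minimal sketch: the function name render_url, its argument names, and the base URL (taken from the docker run port mapping) are illustrative assumptions and not part of the service.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://localhost:9080"  # assumption: port mapping from the docker run example


def render_url(source, customization="default", fmt="eps",
               downscale=None, tex=None, embed_mathml=None, base_url=BASE_URL):
    """Build a render URL; source is 'omml' or 'mml'."""
    if source not in ("omml", "mml"):
        raise ValueError("source must be 'omml' or 'mml'")
    params = {"format": fmt}
    if downscale is not None:
        params["downscale"] = str(int(downscale))
    if tex is not None:
        params["tex"] = "true" if tex else "false"
    if embed_mathml is not None:
        # The same switch is named 'mml' on render-omml and 'include' on render-mml.
        name = "mml" if source == "omml" else "include"
        params[name] = "true" if embed_mathml else "false"
    path = f"/eqimg/{quote(customization, safe='')}/render-{source}"
    return f"{base_url}{path}?{urlencode(params)}"


if __name__ == "__main__":
    print(render_url("omml", fmt="png", downscale=4))
    print(render_url("mml", embed_mathml=True))
```

Parameters left at None are omitted from the URL, so the service applies its own defaults.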
For the OMML endpoint, the distinction between display and inline formulas will be made according to the top-level element, m:oMathPara or m:oMath, respectively. For the MML endpoint, the distinction will be made according to the top-level mml:math’s display=block|inline attribute.
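To check before submission how a file will be classified, the rule above can be mirrored on the client side. The following sketch is not the service's implementation. The function name formula_mode is hypothetical, and treating a missing display attribute as inline follows the MathML default, which is an assumption about how the service behaves.

```python
import xml.etree.ElementTree as ET

OMML_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def formula_mode(xml_text: str) -> str:
    """Return 'display' or 'inline' for an OMML or MathML document."""
    root = ET.fromstring(xml_text)
    if root.tag == f"{{{OMML_NS}}}oMathPara":
        return "display"
    if root.tag == f"{{{OMML_NS}}}oMath":
        return "inline"
    if root.tag == f"{{{MATHML_NS}}}math":
        # Assumption: a missing display attribute counts as inline (MathML default).
        return "display" if root.get("display") == "block" else "inline"
    raise ValueError(f"unexpected top-level element: {root.tag}")


if __name__ == "__main__":
    sample = f'<math xmlns="{MATHML_NS}" display="block"><mi>x</mi></math>'
    print(formula_mode(sample))
```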
Make sure to submit well-formed OMML. In particular, it needs to contain all necessary namespace declarations (xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" and xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" should suffice in most cases) that would be missing if it were merely cut and pasted from a docx document.xml file.
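A fragment pasted from document.xml can be repaired by declaring the two prefixes on its root element. The sketch below does this with plain string handling and then parses the result to confirm that it is well-formed. The function name add_namespaces is hypothetical, and fragments that use other prefixes would need further declarations.

```python
import re
import xml.etree.ElementTree as ET

REQUIRED_NS = {
    "m": "http://schemas.openxmlformats.org/officeDocument/2006/math",
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
}


def add_namespaces(fragment: str) -> str:
    """Declare the m: and w: prefixes on the root element if they are missing."""
    # Name of the first start tag; skips an XML declaration or leading comments.
    match = re.search(r"<[A-Za-z_][^\s/>]*", fragment)
    if match is None:
        raise ValueError("no element found in fragment")
    tag_end = fragment.index(">", match.end())
    start_tag = fragment[match.start():tag_end]
    missing = "".join(
        f' xmlns:{prefix}="{uri}"'
        for prefix, uri in REQUIRED_NS.items()
        if f"xmlns:{prefix}=" not in start_tag
    )
    patched = fragment[:match.end()] + missing + fragment[match.end():]
    ET.fromstring(patched)  # raises ParseError if the result is still not well-formed
    return patched


if __name__ == "__main__":
    pasted = "<m:oMath><m:r><m:t>x</m:t></m:r></m:oMath>"
    print(add_namespaces(pasted))
```

Running the function on input that already carries both declarations returns it unchanged.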
The outputs will be automatically deleted in the container after 60 minutes.
curl -X POST -H "Content-Type: application/xml" -d @sample/oMathPara.xml 'http://localhost:9080/eqimg/default/render-omml?format=png&downscale=4'
The sample data is contained in the svn repo. Don’t forget to put the URL in single quotes on Linux if it contains an ampersand.
Output:
{ "image-depth-pt": "9.90988pt", "tex": "/eqimg/default/retrieve/eqimg-3236359421008546033.tex", "png": "/eqimg/default/retrieve/eqimg-3236359421008546033.png", "status": "success", "image-height-pt": "25.83641pt", "image-ratio": "0.38356", "texlog": "/eqimg/default/retrieve/eqimg-3236359421008546033.log", "image-width-pt": "71.50331pt" }
The output needs to be retrieved in a second request:
curl -o sample/oMathPara.png http://localhost:9080/eqimg/default/retrieve/eqimg-3236359421008546033.png
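The same two-step exchange can be scripted with the Python standard library. This is a sketch under assumptions: the function names are hypothetical, the retrieval path is resolved against the URL of the render request, and any status other than "success" is treated as a failure, which the service documentation does not spell out.

```python
import json
import urllib.request
from urllib.parse import urljoin


def retrieval_url(result: dict, fmt: str, request_url: str) -> str:
    """Resolve the path listed under the format key against the request URL."""
    if result.get("status") != "success":
        # Assumption: every other status value signals a failed rendering.
        raise RuntimeError(f"rendering failed: {result}")
    return urljoin(request_url, result[fmt])


def render_and_fetch(xml_path: str, request_url: str, fmt: str, out_path: str) -> dict:
    """POST the formula, then download the rendered image in a second request."""
    with open(xml_path, "rb") as source:
        request = urllib.request.Request(
            request_url, data=source.read(),
            headers={"Content-Type": "application/xml"}, method="POST")
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    image_url = retrieval_url(result, fmt, request_url)
    with urllib.request.urlopen(image_url) as image, open(out_path, "wb") as target:
        target.write(image.read())
    return result
```

With the container running, a call such as render_and_fetch("sample/oMathPara.xml", "http://localhost:9080/eqimg/default/render-omml?format=png&downscale=4", "png", "sample/oMathPara.png") would correspond to the two curl commands above.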
Likewise for the same formula as inline (see the Word source for the difference – Word considers a formula inline if there is other content in its paragraph):
$ curl -X POST -H "Content-Type: application/xml" -d @sample/oMath.xml http://localhost:9080/eqimg/default/render-omml?format=png
{ "image-depth-pt": "3.776pt", "tex": "/eqimg/default/retrieve/eqimg5263416324052372189.tex", "png": "/eqimg/default/retrieve/eqimg5263416324052372189.png", "status": "success", "image-height-pt": "13.37885pt", "image-ratio": "0.28224", "texlog": "/eqimg/default/retrieve/eqimg5263416324052372189.texlog", "image-width-pt": "79.7272pt" }
curl -o sample/oMath.png http://localhost:9080/eqimg/default/retrieve/eqimg5263416324052372189.png
The baseline shift is given in image-depth-pt. It can be applied using margin-bottom: -3.776pt in the given example.
MathML input is rendered in the same way, here to EPS:
$ curl -X POST -H "Content-Type: application/xml" -d @sample/mml-block01.xml http://localhost:9080/eqimg/default/render-mml?format=eps
{ "image-depth-pt": "17.6173pt", "tex": "/eqimg/default/retrieve/eqimg-2599745313414269954.tex", "eps": "/eqimg/default/retrieve/eqimg-2599745313414269954.eps", "status": "success", "image-height-pt": "40.6518pt", "image-ratio": "0.43336", "texlog": "/eqimg/default/retrieve/eqimg-2599745313414269954.log", "image-width-pt": "122.40714pt" }
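The baseline shift reported in image-depth-pt can be turned into inline CSS when a rendered PNG or JPEG is embedded in HTML. The helper below is only a sketch. The function name img_tag is hypothetical, and using image-width-pt and image-height-pt as the display size of the image is an assumption.

```python
from html import escape


def img_tag(result: dict, src: str, alt: str = "") -> str:
    """Build an <img> element whose bottom margin applies the baseline shift."""
    depth = result["image-depth-pt"]  # a length with unit, for example "3.776pt"
    style = (
        f"width: {result['image-width-pt']}; "      # assumption: display width
        f"height: {result['image-height-pt']}; "    # assumption: display height
        f"margin-bottom: -{depth}"
    )
    return f'<img src="{escape(src)}" alt="{escape(alt)}" style="{escape(style)}">'


if __name__ == "__main__":
    inline_result = {
        "image-depth-pt": "3.776pt",
        "image-height-pt": "13.37885pt",
        "image-width-pt": "79.7272pt",
    }
    print(img_tag(inline_result, "oMath.png"))
```

The values in the usage example are taken from the JSON result of the inline OMML sample.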
You can convert all OMML equations in a docx file:
$ curl -X POST -F docx=@sample/omml.docx 'http://localhost:9080/eqimg/default/extract-formula?format=png&downscale=4'
The downloadable result will be a Zip file with all the images and, in one large JSON structure, their metadata such as the baseline shift. We will change this so that there is a single JSON file for each image, with the same base name.
If the service in the container is running on port 9080 on localhost, you can go to http://localhost:9080/eqimg/default/select to upload files interactively, retrieve the results as Zip files, and see a list of all conversions you ran in the last 3 hours.
This feature is currently available only for docx upload, not for OMML or MathML. Currently two alternative configurations are available, Noto Serif (http://localhost:9080/eqimg/notoserif/select) and Noto Sans (http://localhost:9080/eqimg/notosans/select). The path components correspond to the subdirectories of the styles directory in the latex-math-images repo.
Caution: When the container is running during hibernation, it seems to clog the whole Vmmem virtualization system on Windows. This phenomenon needs to be examined more closely. As a precaution, stop the math-renderer container before hibernating Windows. If the container is unresponsive after hibernation and is then stopped and restarted, all Docker processes seem to be slowed down, as seen in longer conversion times for the same input files.
mfenced instead of stretchy mos
While we are aware that mfenced won’t be part of MathML 4 any more and that we need to support the new way, which is stretchy mos, the renderer does not yet handle this correctly.
So instead of using the markup of sample/mml-block01-nofence.xml, please use the markup of sample/mml-block01.xml for the time being. We might eventually apply heuristic preprocessing to it.