Gerrit Imsieke (@gimsieke), le-tex publishing services (@letexml)

Also:
DITA OT Day 2018 video (~2:30–4:25), xsl-list post

“breaking up XML on page break element”
(Geert’s 2014-07-04 XSL Mailing list post)
An “overlapping markup” problem (page division vs. document hierarchy)
Martin Luther’s translation of the New Testament into German (1522), TEI P5 XML from Deutsches Textarchiv

452 pb milestone elements at varying depths
<pbs>
<pb path="/TEI/text/body/div/div/p/pb" count="238"/>
<pb path="/TEI/text/body/div/div/pb" count="91"/>
<pb path="/TEI/text/body/pb" count="52"/>
<pb path="/TEI/text/body/div/pb" count="47"/>
<pb path="/TEI/text/front/pb" count="11"/>
<pb path="/TEI/text/body/div/p/pb" count="10"/>
<pb path="/TEI/text/front/div/p/pb" count="3"/>
</pbs>
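
A listing like the one above can be produced with a small grouping stylesheet. The following is a minimal sketch, assuming XSLT 2.0 and that the input is in the TEI P5 namespace; the output element names simply mirror the listing.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <pbs>
      <!-- Group all pb elements by the path of local names leading to them -->
      <xsl:for-each-group select="//pb"
        group-by="string-join(('', ancestor-or-self::*/local-name()), '/')">
        <!-- Most frequent path first -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <pb path="{current-grouping-key()}" count="{count(current-group())}"/>
      </xsl:for-each-group>
    </pbs>
  </xsl:template>

</xsl:stylesheet>
```

The seven counts add up to the 452 pb elements stated above.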

/TEI in split mode with a tunneled $restricted-to parameter

(omitting xsl:result-document here for brevity)
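
A minimal sketch of how such a split could look, assuming XSLT 2.0 and TEI-namespaced input. The leaves of /TEI are grouped starting with each pb, and for every group /TEI is processed again in split mode, with the group's leaves and their ancestors tunneled as $restricted-to. The chunks and chunk wrapper elements are hypothetical stand-ins for the omitted xsl:result-document.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <xsl:template match="/">
    <xsl:variable name="tei" select="TEI"/>
    <chunks><!-- hypothetical wrapper -->
      <!-- Leaves: text nodes, empty elements (such as pb), comments, PIs -->
      <xsl:for-each-group select="$tei/descendant::node()[empty(node())]"
                          group-starting-with="pb">
        <chunk n="{position()}"><!-- hypothetical wrapper -->
          <xsl:apply-templates select="$tei" mode="split">
            <!-- Only these nodes may be copied into the current chunk -->
            <xsl:with-param name="restricted-to" tunnel="yes"
              select="current-group()/ancestor-or-self::node()"/>
          </xsl:apply-templates>
        </chunk>
      </xsl:for-each-group>
    </chunks>
  </xsl:template>

  <!-- Conditional identity template: copy a node only if it belongs
       to the current chunk or is an ancestor of a node that does -->
  <xsl:template match="node()" mode="split">
    <xsl:param name="restricted-to" as="node()*" tunnel="yes"/>
    <xsl:if test="exists(. intersect $restricted-to)">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="#current"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Every chunk gets its own copy of the ancestor hierarchy, so each one is a well-formed tree rooted at TEI.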

teiHeader would be missing … if it weren’t for this template:
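
A minimal sketch of what such a template could look like, assuming the conditional copying runs in a mode named split and the input is in the TEI namespace. All leaves of teiHeader precede the first pb, so without a dedicated rule the header would survive only in the first chunk.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Copy the complete header into every chunk, regardless of
       whether its nodes are among the chunk's permitted nodes.
       The name test outranks a generic node() rule in the same mode. -->
  <xsl:template match="teiHeader" mode="split">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```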

FO block splitting
Nested grouping (group-starting-with for <two-col-start>,
group-ending-with for <two-col-end>)
Split at line breaks
Avoid splitting at line breaks in embedded list items or footnotes by
switching to #default mode when processing
list item / footnote content

Luther’s 1522 New Testament translation:
pb elements: 452

Hypothesis:
Surprise: 1st 10 pages: milliseconds, 1st 375 pages: minutes
⇒ Need to measure dependence on chunk size at constant doc length
Repeatedly removing every other pb results in fewer chunks
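
One way to thin out the page breaks is an identity transformation that drops every second pb. This is a minimal sketch, assuming TEI-namespaced input. Applied repeatedly, it leaves 452, 226, 113, 57, … pb elements while the document length stays practically constant.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <!-- Identity template: copy everything by default -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop the 2nd, 4th, 6th, ... pb in document order.
       pb elements are empty and never nested, so preceding::pb
       counts all earlier page breaks. -->
  <xsl:template match="pb[count(preceding::pb) mod 2 = 1]"/>

</xsl:stylesheet>
```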
Culprit: Conditional Identity Template
When chunk length grows…
The number of Conditional Identity Template invocations cannot be reduced
Pass generated IDs instead of nodes, compare
generate-id() = $restricted-to instead of exists(. intersect $restricted-to)
⇒ 20-fold acceleration for large chunks
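
A minimal sketch of the ID-based variant, under the same assumptions as before (XSLT 2.0, TEI namespace, hypothetical chunks/chunk wrappers in place of xsl:result-document). The tunneled parameter now carries strings, and the conditional identity template does a general comparison instead of a node intersection.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xpath-default-namespace="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="xs">

  <xsl:template match="/">
    <xsl:variable name="tei" select="TEI"/>
    <chunks><!-- hypothetical wrapper -->
      <xsl:for-each-group select="$tei/descendant::node()[empty(node())]"
                          group-starting-with="pb">
        <chunk n="{position()}"><!-- hypothetical wrapper -->
          <xsl:apply-templates select="$tei" mode="split">
            <!-- Pass the generated IDs of the permitted nodes, not the nodes -->
            <xsl:with-param name="restricted-to" tunnel="yes"
              select="for $n in current-group()/ancestor-or-self::node()
                      return generate-id($n)"/>
          </xsl:apply-templates>
        </chunk>
      </xsl:for-each-group>
    </chunks>
  </xsl:template>

  <!-- Conditional identity template, comparing strings -->
  <xsl:template match="node()" mode="split">
    <xsl:param name="restricted-to" as="xs:string*" tunnel="yes"/>
    <xsl:if test="generate-id() = $restricted-to">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="#current"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The number of template invocations is unchanged; only the cost of each membership test differs.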
No. Michael Kay wrote in 2014:
“... the real problem is that the logic is going down to descendants, then up to their ancestors, and then down again, and that's intrinsically not processing nodes in document order, which is a precondition for streaming.”
Even if it were feasible, the scaling with chunk size would be detrimental.
Can we have a configurable splitter for JATS, DocBook, TEI, HTML etc.?
final entry template
and private internal templates 
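
A minimal sketch of how such a splitter might be packaged, assuming XSLT 3.0 packages. The package name, the sp namespace, the template and mode names, the split-at parameter and the chunk wrapper are all hypothetical. The entry template is final, so using packages can call it but not override it; the copying rules live in a private mode.

```xml
<xsl:package name="http://example.org/xslt/splitter"
  package-version="1.0" version="3.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:sp="http://example.org/xslt/splitter"
  exclude-result-prefixes="xs sp">

  <!-- Private mode: its template rules are internal to the package -->
  <xsl:mode name="sp:chunk" visibility="private"/>

  <!-- Final entry template: callable from outside, not overridable -->
  <xsl:template name="sp:split-document" visibility="final">
    <xsl:param name="root" as="element()" required="yes"/>
    <!-- Local name of the milestone element, e.g. 'pb' -->
    <xsl:param name="split-at" as="xs:string" required="yes"/>
    <xsl:for-each-group select="$root/descendant::node()[empty(node())]"
        group-starting-with="*[local-name() = $split-at]">
      <chunk n="{position()}"><!-- hypothetical wrapper -->
        <xsl:apply-templates select="$root" mode="sp:chunk">
          <xsl:with-param name="restricted-to" tunnel="yes"
            select="for $n in current-group()/ancestor-or-self::node()
                    return generate-id($n)"/>
        </xsl:apply-templates>
      </chunk>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Internal conditional identity template -->
  <xsl:template match="node()" mode="sp:chunk">
    <xsl:param name="restricted-to" as="xs:string*" tunnel="yes"/>
    <xsl:if test="generate-id() = $restricted-to">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="#current"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

</xsl:package>
```

A using stylesheet would pull this in with xsl:use-package and call sp:split-document with the vocabulary's root element and the local name of its milestone element.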