XProc next CG
 

XProc 3.0 — An overview

As of 2018-02-10

Short reminder on XProc

  • XProc is an XML pipeline language
  • Create XML centric workflows like
    • Expand document with XInclude
    • Validate document with RELAX NG, Schematron or …
    • Transform to HTML or PDF with XSLT or XSL-FO
  • W3C recommendation in 2010 - Working group closed in 2016
  • W3C community group took over in September 2016: Meetings in Amsterdam, Prague, London and Aachen (2017) and Prague (2018)
  • About 15 active members
  • Four editors of the specs

What is new in XProc 3.0?

Overall Goal: Improve usability of XProc

  • New document model for XML and non-XML documents
  • Text value templates and attribute value templates
  • Typed variables and options using XDM 3.1
  • Import functions from XSLT or XQuery
  • Parameter input ports replaced by maps
  • Variables everywhere
  • Enhanced try-catch
  • Lots of useful shortcuts

New document model

In XProc 1.0, the only documents that could be processed were well-formed XML documents.

The new model: A document is a tuple of a representation and its properties

  • Properties are metadata of the document, e.g. content-type, base-uri, user defined properties
  • The representation is the data structure the processor uses to refer to the content.

New document model: Document types

XProc 3.0 currently knows four document types

  • XML documents (XDM document model)
  • Text documents (A document node with a text node as a child)
  • Binary documents (implementation defined for efficiency reasons)
  • JSON documents (adapting XPath 3.1’s concept here!)

Text and attribute value templates

  • Text value templates (as known from XSLT):
    <p:inline><doc>{$a}</doc></p:inline>
    will put the content of $a as children into <doc/>
  • Attribute value templates: Instead of
    <p:add-attribute attribute-name="foo">
      <p:with-option name="attribute-value"
       select="concat($a,$b)" />
    </p:add-attribute>
    You can now simply write:
    <p:add-attribute attribute-name="foo"
      attribute-value="{concat($a,$b)}" />
  • p:template is not needed any more
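
To see both kinds of value template together, here is a minimal sketch of a complete pipeline. It assumes the syntax of the XProc 3.0 specification as later published, which may differ in detail from the draft described on these slides. The option names a and b and their default values are made up for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.0">
  <p:output port="result"/>

  <!-- Hypothetical options, only here to feed the templates -->
  <p:option name="a" select="'Hello'"/>
  <p:option name="b" select="'World'"/>

  <!-- Attribute value template on the option shortcut attribute -->
  <p:add-attribute match="/doc" attribute-name="foo"
                   attribute-value="{concat($a, ' ', $b)}">
    <p:with-input>
      <!-- Text value template inside an implicit inline document -->
      <doc>{$a}</doc>
    </p:with-input>
  </p:add-attribute>
</p:declare-step>
```

With the defaults above, the result port is expected to carry a doc element with the text "Hello" and an attribute foo="Hello World".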

Typed variables and options

  • Variables and options in XProc 1.0 were untyped atomics or strings
  • XProc 3.0 uses sequences as defined in XDM 3.1 as basis, including maps and arrays
  • Sequence type syntax can be used to declare and check expected types
  • Variables now can hold nodes or even documents
    <p:variable name="theDoc" select="doc('my-doc')"/>
    <p:identity>
       <p:with-input>
         <wrapper>{$theDoc}</wrapper>
       </p:with-input>
    </p:identity>
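
The sequence type declarations mentioned above could look like the following sketch. It assumes the "as" attribute of the XProc 3.0 specification as later published; the names limit and settings and their values are hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="3.0">
  <p:output port="result"/>

  <!-- Typed option: a value that cannot be cast to xs:integer is an error -->
  <p:option name="limit" as="xs:integer" select="10"/>

  <!-- Typed variable holding an XDM 3.1 map -->
  <p:variable name="settings" as="map(xs:string, xs:integer)"
              select="map{'limit': $limit, 'offset': 0}"/>

  <p:identity>
    <p:with-input>
      <!-- Map entries are read with the XPath 3.1 lookup operator -->
      <settings limit="{$settings?limit}" offset="{$settings?offset}"/>
    </p:with-input>
  </p:identity>
</p:declare-step>
```

Declaring the type up front moves errors such as passing limit="ten" to the point where the option is bound, instead of to some later XPath expression.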

Import functions

  • In XProc 1.0 you could only use built-in functions from XPath and XProc
  • This was inconvenient when the same XPath expression had to be written over and over again
  • New in XProc 3.0:
    <p:import-functions href="my-lib" 
      content-type="application/xslt+xml" 
      namespace="http://example.com/foo"/>
    will import all functions (and variables?) within the given namespace from the URI

Further global enhancements

  • Variable declarations everywhere. Variables shadowed by name
  • No more parameter ports. Maps are used instead to provide values to XSLT, XQuery etc.
  • Maps are also used as bags for serialization parameters
  • New compound step: <p:if />. No need to declare the same output ports on every branch of try/catch or <p:choose/> any more
  • Enhanced try/catch: Multiple catch clauses instead of just one; p:finally
  • Renamed port binding to p:with-input to mark the difference between declaring and binding ports
  • Ever wondered what <p:namespaces/> is for? Don't worry, it is gone!
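
Two of these changes, maps as parameters and multiple catch clauses, can be sketched in one pipeline. This is an illustration only, written against the XProc 3.0 specification as later published. The stylesheet to-html.xsl, the parameter name mode and the fallback elements are hypothetical, and the error code is just one plausible candidate for a missing stylesheet.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:err="http://www.w3.org/ns/xproc-error"
                version="3.0">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <p:xslt>
      <!-- Hypothetical stylesheet, bound with the href shortcut -->
      <p:with-input port="stylesheet" href="to-html.xsl"/>
      <!-- Stylesheet parameters are passed as a map, not on a parameter port -->
      <p:with-option name="parameters"
                     select="map{QName('', 'mode'): 'draft'}"/>
    </p:xslt>

    <!-- Catch clause for one specific error code -->
    <p:catch code="err:XD0011">
      <p:identity>
        <p:with-input><missing-resource/></p:with-input>
      </p:identity>
    </p:catch>

    <!-- Catch clause without a code handles every other error -->
    <p:catch>
      <p:identity>
        <p:with-input><failed/></p:with-input>
      </p:identity>
    </p:catch>
  </p:try>
</p:declare-step>
```

A p:finally clause could follow the last p:catch; it is left out here to keep the sketch short.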

Plenty of syntactic sugar

XProc 1.0:

<p:input port="source">
 <p:inline>
  <doc />
 </p:inline>
</p:input>

XProc 3.0:

<p:with-input><doc /></p:with-input>

Plenty of syntactic sugar

XProc 1.0:

<p:input port="source">
  <p:document href="some-thing" />
</p:input>

XProc 3.0:

<p:with-input href="some-thing" />

Plenty of syntactic sugar

XProc 1.0:

<p:input port="source">
  <p:pipe step="some" port="thing" />
</p:input>

XProc 3.0:

<p:with-input pipe="thing@some" />

or

<p:with-input pipe="@some" />

or

<p:with-input pipe="thing" />

or

<p:with-input pipe="port1@step1 port2@step2 port3@step3" />

Roadmap:

  • Specs split into three documents: core, (required) step library and (optional) step library
  • Last call draft (of the core spec) “this spring”
  • Core spec and step spec “candidate releases” ready by June
  • XML Calabash 2 and MorganaXProc 2 implementations commensurate with CR by June
  • Erik Siegel’s Programmer Reference for 3.0 also planned for June

What we ask from you

  • Feedback on the drafts
  • Input, use-cases, proposals for steps dealing with non-XML documents
  • Questions, objections, cheers, …

Resources