xproc-util

Renames file references in an XML document and the files they point to. The fileref attributes are matched by a regex pattern (regex-match option). Each match is replaced with the value of the regex-replace option.

Example: rename file extensions


<tr:batch-rename-files>
  <p:with-option name="attribute-name" select="'fileref'"/>
  <p:with-option name="regex-match" select="'^(.+)\.tif$'"/>
  <p:with-option name="regex-replace" select="'$1.jpg'"/>
</tr:batch-rename-files>

Example: replace whitespace


<tr:batch-rename-files>
  <p:with-option name="attribute-name" select="'fileref'"/>
  <p:with-option name="regex-match" select="'\s'"/>
  <p:with-option name="regex-replace" select="''"/>
</tr:batch-rename-files>

Import

<p:import href="http://transpect.io/xproc-util/batch-rename-files/xpl/batch-rename-files.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:batch-rename-files xmlns:tr="http://transpect.io">
  <p:input port="source"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:output port="report" primary="false"/>
  <p:option name="attribute-name" select="'fileref'"/>
  <p:option name="regex-match" required="true"/>
  <p:option name="regex-replace" required="true"/>
</tr:batch-rename-files>

The behavior can be partly overridden by using @hub:target-fileref attributes on the same element as the @fileref attribute (or whatever attribute $fileref-attribute-name-regex matches). If @hub:target-fileref is a relative URI, it will be resolved against $target-dir-uri. If it is an absolute URI, it takes precedence over $target-dir-uri. If @hub:target-fileref is present, the original @fileref attribute will not be changed. Libraries such as hub2docx should prefer @hub:target-fileref in order to determine the location.

Import

<p:import href="http://transpect.io/xproc-util/copy-files/xpl/copy-files.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:copy-files xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="false" primary="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="retain-subpaths" required="false" select="'false'"/>
  <p:option name="target-dir-uri" required="true"/>
  <p:option name="change-uri" required="false" select="'yes'"/>
  <p:option name="change-uri-new-subpath" required="false" select="'media'"/>
  <p:option name="fileref-attribute-name-regex" required="false" select="'^fileref$'"/>
  <p:option name="fileref-hosting-element-name-regex" required="false" select="'^(audiodata|imagedata|textdata|videodata)$'"/>
  <p:option name="fileref-attribute-value-regex" required="false" select="'^.+$'"/>
  <p:option name="fail-on-error" required="false" select="'false'"/>
  <p:option name="debug" select="'yes'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="status-dir-uri" required="false" select="'debug/status?enabled=false'"/>
</tr:copy-files>
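A minimal sketch of how the step might be called, based only on the options listed in the synopsis. The wrapper pipeline, its name and its target-dir-uri option are illustrative assumptions, and the import is assumed to be resolvable through an XML catalog.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="copy-media-demo">

  <!-- document whose fileref attributes point to the files in question -->
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true" sequence="true"/>

  <!-- hypothetical pipeline option, passed through to tr:copy-files -->
  <p:option name="target-dir-uri" required="true"/>

  <p:import href="http://transpect.io/xproc-util/copy-files/xpl/copy-files.xpl"/>

  <tr:copy-files>
    <p:with-option name="target-dir-uri" select="$target-dir-uri"/>
    <!-- narrow the default (audiodata|imagedata|textdata|videodata) to images only -->
    <p:with-option name="fileref-hosting-element-name-regex" select="'^imagedata$'"/>
  </tr:copy-files>

</p:declare-step>
```

All other options keep the defaults shown in the synopsis, for example change-uri-new-subpath='media'.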

The purpose of this XProc pipeline is to create a Saxon call for the entire xsltmode-conversion that can be run in a shell, with all params and, optionally, a Saxon configuration plus collection file.

All output documents are written to disk. The executable file and the invocation call are printed to the console.

Import

<p:import href="http://transpect.io/xproc-util/debugging/xpl/xsltmode-as-saxon-command.xpl"/>

Synopsis

<tr:xsltmode-as-saxon-command xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:input port="stylesheet" sequence="false"/>
  <p:input port="xslt-params" sequence="true"/>
  <p:output port="result" primary="true"/>
  <p:option name="mode" required="true"/>
  <p:option name="saxon-call-base-uri" required="true"/>
  <p:option name="saxon-executable" required="false" select="'saxon'"/>
  <p:option name="run-immediately" required="false" select="'no'"/>
</tr:xsltmode-as-saxon-command>

This step converts MathML (mml) to TeX via mml2tex (https://github.com/transpect/mml2tex). The mml2tex module must be available at the URI http://transpect.io/mml2tex, regardless of the value of the 'type' option.

Import

<p:import href="http://transpect.io/xproc-util/evolve-mml/xpl/evolve-mml.xpl"/>

Dependencies

Synopsis

<tr:evolve-mml xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:input port="conf" primary="false"/>
  <p:input port="stylesheet"/>
  <p:input port="paths" sequence="true" primary="true"/>
  <p:output port="result" primary="true"/>
  <p:option name="output-dir" required="true"/>
  <p:option name="type" required="false" select="'mml'"/>
  <p:option name="outfile-prefix" required="false" select="'ltx-created-eq-'"/>
  <p:option name="extension" required="false" select="'gif'"/>
  <p:option name="fail-on-error" required="false" select="'no'"/>
  <p:option name="preprocessing" required="false" select="'no'"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="texmap" select="'http://transpect.io/mml2tex/texmap/texmap.xml'"/>
  <p:option name="texmap-upgreek" select="'http://transpect.io/mml2tex/texmap/texmap-upgreek.xml'"/>
  <p:option name="context" required="false" select="false()"/>
  <p:option name="display-equation-table-role" required="false" select="'equation-table'"/>
  <p:option name="store-plain-tex" select="'false'"/>
  <p:option name="pad-position" select="'false'"/>
  <p:option name="pad" select="'3'"/>
  <p:option name="set-math-style" select="'no'"/>
</tr:evolve-mml>

Import

<p:import href="http://transpect.io/xproc-util/extract-cssa-rules/xpl/extract-cssa-rules.xpl"/>

Synopsis

<tr:extract-cssa-rules xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:input port="previous" sequence="true"/>
  <p:output port="result" primary="true"/>
  <p:option name="only-layout-type-and-name-attributes" select="'no'"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="debug-dir-uri"/>
</tr:extract-cssa-rules>

Extends the pxp:unzip step to extract files from a jar file.

Import

<p:import href="http://transpect.io/xproc-util/extract-from-jar/xpl/extract-from-jar.xpl"/>

Synopsis

<tr:extract-from-jar xmlns:tr="http://transpect.io">
  <p:output port="result"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="href" required="true"/>
  <p:option name="dest-dir" required="true"/>
</tr:extract-from-jar>

Import

<p:import href="http://transpect.io/xproc-util/file-uri/xpl/escape-for-uri.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:escape-for-uri xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="path"/>
</tr:escape-for-uri>

This step accepts either a file system path or a URL in its 'filename' option. It will normalize them so that both a file system path and a file: URL are available. If filename starts with http: or https:, the file will be retrieved and stored locally. Please note that this retrieval will not work for remote directories.

Its primary uses are

giving users the liberty to either specify a URL or an OS-specific path for input file parameters;
making XML catalog resolution available to any URI, not just when accessing resources through catalog-enabled methods such as doc();
if, after optional catalog resolution, the 'filename' URI is still http:/https:, p:http-request will be used to store the file locally.

Examples for 'filename' values

C:/temp/file.docx,
c:\temp\file.docx,
file:/C:/temp/file.docx,
file:///C:/temp/file.docx,
/tmp/file.docx,
subdir/file.docx,
https://github.com/me/myrepo/blob/master/file.docx?raw=true

Relative Paths

Relative paths will be resolved against the current working directory, which is better than the static base uri most of the time but which might not always be what the user wants. It is a good idea to absolutize paths, as in $(readlink -f subdir/file.docx) or $(cygpath -ma subdir/file.docx).

XML Catalogs

If a catalog is provided on the catalog port and an XSLT stylesheet for catalog resolution is supplied on the resolver port, http:/https: URIs will be catalog-resolved first, see below.

Storage Location for HTTP Downloads

It is possible to specify a temporary directory in the 'tmpdir' option. By default, it will be the subdir 'tmp' of the user’s home directory. The 'tmpdir' option accepts both a file: URL and an OS path, thanks to this normalization step.

Please note that temporary files will not be deleted by this step.

Unique File Names for HTTP Downloads

If the option 'make-unique' is true (which it is by default), the files that are fetched by p:http-request will get a random string like _0fa8d348 appended to their base name.

Output format

The output is a c:result element with the following attributes:

os-path: OS-specific path. This is always present unless error-status is set
local-href: file: URI. This is always present unless error-status is set
error-status: This is only present if the 'filename' was an HTTP URI and there was an error retrieving the resource
href: The post catalog-resolution URI of the resource (if it is an HTTP URI)
orig-href: The pre catalog-resolution URI of the resource (if different from post catalog)
lastpath: For ordinary files, the non-directory part including suffix. For directories, the last path component without trailing slash. lastpath is URL-escaped, that is, it is taken from local-href.
lastpath-os: The same as lastpath, but without URL escaping.

Import

<p:import href="http://transpect.io/xproc-util/file-uri/xpl/file-uri.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:file-uri xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:input port="catalog"/>
  <p:input port="resolver"/>
  <p:output port="result" primary="true"/>
  <p:option name="filename" required="true"/>
  <p:option name="make-unique" required="false" select="'true'"/>
  <p:option name="fetch-http" required="false" select="'true'"/>
  <p:option name="check-http" required="false" select="'true'"/>
  <p:option name="tmpdir" required="false" select="''"/>
  <p:option name="use-filename-from-http-response" required="false" select="'no'"/>
</tr:file-uri>
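The following sketch shows one way the step might be invoked so that the resulting c:result document (with os-path, local-href and the other attributes described above) appears on the pipeline's output. The wrapper pipeline and its file option are hypothetical. The sketch also assumes that the step's source, catalog and resolver ports have usable default connections; if they do not, they need to be connected explicitly.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="file-uri-demo">

  <!-- receives the c:result element produced by tr:file-uri -->
  <p:output port="result" primary="true"/>

  <!-- hypothetical pipeline option: an OS path, a file: URL or an http(s): URL -->
  <p:option name="file" required="true"/>

  <p:import href="http://transpect.io/xproc-util/file-uri/xpl/file-uri.xpl"/>

  <tr:file-uri name="file-uri">
    <p:with-option name="filename" select="$file"/>
    <!-- keep the default: downloaded files get a random suffix -->
    <p:with-option name="make-unique" select="'true'"/>
  </tr:file-uri>

</p:declare-step>
```

A later step could then read the normalized location with an XPath such as /c:result/@local-href on the output of the step named file-uri.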

Import

<p:import href="http://transpect.io/xproc-util/file-uri/xpl/unescape-for-os-path.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:unescape-uri xmlns:tr="http://transpect.io">
  <p:input port="source"/>
  <p:output port="result" primary="true"/>
  <p:option name="uri" select="''"/>
  <p:option name="attribute-names" select="''"/>
</tr:unescape-uri>

This is an XProc wrapper to convert PostScript to PDF utilizing GhostScript, which needs to be installed on your system.

Import

<p:import href="http://transpect.io/xproc-util/ghostscript/xpl/ghostscript.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:ghostscript xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:output port="report" sequence="true" primary="false"/>
  <p:option name="href" required="true"/>
  <p:option name="outdir" select="'converted'"/>
  <p:option name="format" select="'pdf'"/>
  <p:option name="options" select="'-sDEVICE=pdfwrite -dEPSCrop'"/>
  <p:option name="install-path" select="''"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="fail-on-error" select="'true'"/>
</tr:ghostscript>
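A hedged usage sketch, using only options from the synopsis. The wrapper pipeline and its ps-file option are assumptions made for illustration, and GhostScript is assumed to be installed and discoverable (otherwise install-path would have to be set).

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="ghostscript-demo">

  <p:output port="result" primary="true"/>

  <!-- hypothetical pipeline option: location of the PostScript/EPS input file -->
  <p:option name="ps-file" required="true"/>

  <p:import href="http://transpect.io/xproc-util/ghostscript/xpl/ghostscript.xpl"/>

  <tr:ghostscript>
    <p:with-option name="href" select="$ps-file"/>
    <!-- both values below repeat the defaults from the synopsis -->
    <p:with-option name="outdir" select="'converted'"/>
    <p:with-option name="format" select="'pdf'"/>
    <!-- do not abort the whole pipeline if the conversion fails -->
    <p:with-option name="fail-on-error" select="'false'"/>
  </tr:ghostscript>

</p:declare-step>
```

The non-primary report port is left unconnected here, so whatever appears on it is discarded.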

This step tries to embed external resources such as images, CSS and JavaScript as data URI, as XML or as plain text into the HTML document.

Consider the example below.

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title/>
  </head>
  <body>
    <div>
      <img alt="a blue square" src="image.png" />
    </div>
  </body>
</html>

After processing the HTML, the image is embedded as a data URI.

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title/>
  </head>
  <body>
    <div>
      <img alt="a blue square" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAMAAAC6sdbXAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJ
bWFnZVJlYWR5ccllPAAAAyJpVFh0WE1MOmNvbS5hZG9iZS54bXAAAAAAADw/eHBhY2tldCBiZWdp
bj0i77u/IiBpZD0iVzVNME1wQ2VoaUh6cmVTek5UY3prYzlkIj8+IDx4OnhtcG1ldGEgeG1sbnM6
eD0iYWRvYmU6bnM6bWV0YS8iIHg6eG1wdGs9IkFkb2JlIFhNUCBDb3JlIDUuMy1jMDExIDY2LjE0
NTY2MSwgMjAxMi8wMi8wNi0xNDo1NjoyNyAgICAgICAgIj4gPHJkZjpSREYgeG1sbnM6cmRmPSJo
dHRwOi8vd3d3LnczLm9yZy8xOTk5LzAyLzIyLXJkZi1zeW50YXgtbnMjIj4gPHJkZjpEZXNjcmlw
dGlvbiByZGY6YWJvdXQ9IiIgeG1sbnM6eG1wPSJodHRwOi8vbnMuYWRvYmUuY29tL3hhcC8xLjAv
IiB4bWxuczp4bXBNTT0iaHR0cDovL25zLmFkb2JlLmNvbS94YXAvMS4wL21tLyIgeG1sbnM6c3RS
ZWY9Imh0dHA6Ly9ucy5hZG9iZS5jb20veGFwLzEuMC9zVHlwZS9SZXNvdXJjZVJlZiMiIHhtcDpD
cmVhdG9yVG9vbD0iQWRvYmUgUGhvdG9zaG9wIENTNiAoV2luZG93cykiIHhtcE1NOkluc3RhbmNl
SUQ9InhtcC5paWQ6NjExNUU3Q0RFNkQ1MTFFNUE4MThFMjY3QjgwODYwQ0UiIHhtcE1NOkRvY3Vt
ZW50SUQ9InhtcC5kaWQ6NjExNUU3Q0VFNkQ1MTFFNUE4MThFMjY3QjgwODYwQ0UiPiA8eG1wTU06
RGVyaXZlZEZyb20gc3RSZWY6aW5zdGFuY2VJRD0ieG1wLmlpZDo2MTE1RTdDQkU2RDUxMUU1QTgx
OEUyNjdCODA4NjBDRSIgc3RSZWY6ZG9jdW1lbnRJRD0ieG1wLmRpZDo2MTE1RTdDQ0U2RDUxMUU1
QTgxOEUyNjdCODA4NjBDRSIvPiA8L3JkZjpEZXNjcmlwdGlvbj4gPC9yZGY6UkRGPiA8L3g6eG1w
bWV0YT4gPD94cGFja2V0IGVuZD0iciI/PjJf70IAAAAGUExURQCe4AAAAB0uYYYAAAAOSURBVHja
YmDABwACDAAAHgABzCCyiwAAAABJRU5ErkJggg==
" />
    </div>
  </body>
</html>

Import

<p:import href="http://transpect.io/xproc-util/html-embed-resources/xpl/html-embed-resources.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:html-embed-resources xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:input port="catalog"/>
  <p:output port="result" primary="true"/>
  <p:option name="exclude" select="''"/>
  <p:option name="include-class-only" select="''"/>
  <p:option name="exclude-by-fileext" select="''"/>
  <p:option name="max-base64-encoded-size-kb" select="1000"/>
  <p:option name="unavailable-resource-message" select="'no'"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="fail-on-error" select="'true'"/>
</tr:html-embed-resources>

This step performs a simple p:http-request and checks whether the result exceeds the limit of the base64 encoded size. If this check fails, the original fileref markup is reproduced.

Import

<p:import href="http://transpect.io/xproc-util/html-embed-resources/xpl/html-embed-resources.xpl"/>

Synopsis

<tr:get-data-uri xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:input port="fileref" primary="false"/>
  <p:input port="file-uri" primary="false"/>
  <p:output port="result"/>
  <p:option name="max-base64-encoded-size-kb" select="'1000'"/>
</tr:get-data-uri>

Uses validator.nu that is bundled with Calabash to parse HTML5 (both HTML and XML serializations) files. The files must have a single top-level element. They don’t need to have html as their top-level element though. body, section etc. are also acceptable.

Import

<p:import href="http://transpect.io/xproc-util/html5/xpl/load-html5.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:load-html5 xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="file" required="true"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
</tr:load-html5>

This is an XProc wrapper for ImageMagick. The ImageMagick executable needs to be installed on your system.

Import

<p:import href="http://transpect.io/xproc-util/imagemagick/xpl/imagemagick.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:imagemagick xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:output port="report" sequence="true" primary="false"/>
  <p:option name="href" required="true"/>
  <p:option name="outdir" select="'converted'"/>
  <p:option name="format" select="'jpg'"/>
  <p:option name="imagemagick-options" select="''"/>
  <p:option name="imagemagick-path" select="''"/>
  <p:option name="prefix" select="''"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="fail-on-error" select="'false'"/>
</tr:imagemagick>

This step inserts the XPath location of each element as an attribute.

Consider this example:

<root>
  <element>Text</element>
</root>

After applying the step, each element includes a srcpath attribute containing its XPath location.

<root srcpath="/root">
  <element srcpath="/root/element">Text</element>
</root>

Import

<p:import href="http://transpect.io/xproc-util/insert-srcpaths/xpl/insert-srcpaths.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:insert-srcpaths xmlns:tr="http://transpect.io">
  <p:input port="source"/>
  <p:output port="result"/>
  <p:option name="schematron-like-paths" select="'no'"/>
  <p:option name="exclude-elements" select="''"/>
  <p:option name="exclude-descendants" select="'yes'"/>
  <p:option name="prepend" select="''"/>
</tr:insert-srcpaths>
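As a sketch, the example document from above could be fed to the step inline. The wrapper pipeline is hypothetical, and the import is assumed to be resolvable through an XML catalog.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="insert-srcpaths-demo">

  <p:output port="result" primary="true"/>

  <p:import href="http://transpect.io/xproc-util/insert-srcpaths/xpl/insert-srcpaths.xpl"/>

  <tr:insert-srcpaths>
    <p:input port="source">
      <!-- exclude the pipeline's own prefixes from the inline document -->
      <p:inline exclude-inline-prefixes="p tr">
        <root>
          <element>Text</element>
        </root>
      </p:inline>
    </p:input>
  </tr:insert-srcpaths>

</p:declare-step>
```

All options are left at the defaults from the synopsis.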

This step loads a file via p:http-request.

Import

<p:import href="http://transpect.io/xproc-util/load/xpl/load-data.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:load-data xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="href" required="true"/>
  <p:option name="content-type-override" select="''"/>
  <p:option name="encoding" select="'base64'"/>
  <p:option name="fail-on-error" select="'false'"/>
</tr:load-data>

This step uses TagSoup to load HTML files even if they are not well-formed. To use TagSoup with Calabash, you must include the TagSoup JAR file in your Java classpath and use a Calabash configuration file.

      
<cc:xproc-config xmlns:cc="http://xmlcalabash.com/ns/configuration" xmlns:tr="http://transpect.io">
  <cc:html-parser value="tagsoup"/>
</cc:xproc-config>

Import

<p:import href="http://transpect.io/xproc-util/load/xpl/load-html.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:load-html xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="href" required="true"/>
  <p:option name="fail-on-error" select="'false'"/>
</tr:load-html>

Documents identified by their base URIs should be selected from the documents on the source port. The selected documents are then passed to the result port. If there is no matching document on the source port for a given URI, the document should instead be loaded from the location specified by the base URI.

Import

<p:import href="http://transpect.io/xproc-util/load/xpl/load-sources.xpl"/>

Synopsis

<tr:load-sources xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:output port="result" sequence="true"/>
  <p:option name="uris"/>
  <p:option name="add-xml-base" select="'false'"/>
</tr:load-sources>

Replacement for p:load. Uses the file-uri util to load any file without using resolve-uri or other inconvenient methods. A relative file (option href) will be loaded relative to the current working directory. Please note that there is no input port. The document is loaded via the href option.

Import

<p:import href="http://transpect.io/xproc-util/load/xpl/load.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:load xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="href" required="true"/>
  <p:option name="dtd-validate" select="'false'"/>
  <p:option name="fail-on-error" select="'yes'"/>
</tr:load>
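A minimal usage sketch. The wrapper pipeline and its href option are hypothetical, and the import is assumed to be resolvable through an XML catalog.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="load-demo">

  <p:output port="result" primary="true"/>

  <!-- hypothetical pipeline option; a relative value is resolved
       against the current working directory, as described above -->
  <p:option name="href" required="true"/>

  <p:import href="http://transpect.io/xproc-util/load/xpl/load.xpl"/>

  <tr:load>
    <p:with-option name="href" select="$href"/>
    <!-- 'yes' repeats the default from the synopsis -->
    <p:with-option name="fail-on-error" select="'yes'"/>
  </tr:load>

</p:declare-step>
```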

This step expects a sequence of Hub files and merges them into a single file.

Import

<p:import href="http://transpect.io/xproc-util/merge-hub/xpl/merge-hub.xpl"/>

Dependencies

Synopsis

<tr:merge-hub xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:output port="result" primary="true"/>
  <p:output port="report"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="move-dir-components-to-srcpath" required="false" select="0"/>
  <p:option name="space-separated-docVar-merge" required="false" select="''"/>
</tr:merge-hub>

Converts a <c:param-set> into text (CSV or whitespace separated).

Import

<p:import href="http://transpect.io/xproc-util/params2text/xpl/params2text.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:params2text xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>
  <p:option name="mode" select="'one-line-as-comment'"/>
  <p:option name="include" select="'*'"/>
  <p:option name="exclude" select="'-'"/>
  <p:option name="separator" select="' '"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
</tr:params2text>

An XProc wrapper for Poppler's pdfinfo. This step needs Poppler to be installed on the system.

Import

<p:import href="http://transpect.io/xproc-util/pdf-info/xpl/pdf-info.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:pdf-info xmlns:tr="http://transpect.io">
  <p:output port="result" sequence="true"/>
  <p:option name="file" required="true"/>
  <p:option name="debug" select="'yes'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="status-dir-uri" select="concat($debug-dir-uri, '/status')"/>
</tr:pdf-info>

Removal/normalization of generated IDs and accidental file system URIs for generating reference output (and for generating input for diff in order to compare it to the reference output). This will work for many debugging outputs of transpect pipelines for different output formats such as EPUB, IDML, docx or several XML dialects (JATS/BITS/STS, DocBook/Hub, …). You can extend it to make it more configurable or to support more output formats, or you can just copy it to a9s/common/xpl and modify it according to your project’s needs.

Import

<p:import href="http://transpect.io/xproc-util/prepare-diff/xpl/prepare-diff.xpl"/>

Synopsis

<tr:prepare-diff xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="false" primary="true"/>
  <p:input port="stylesheet"/>
  <p:option name="out-uri-prefix" required="false"/>
  <p:option name="strip-generated" required="false" select="'all'"/>
</tr:prepare-diff>

This should be the penultimate step before writing back the result document.

Import

<p:import href="http://transpect.io/xproc-util/re-attach-out-of-doc-PIs/xpl/re-attach-out-of-doc-PIs.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:re-attach-out-of-doc-PIs xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="file-uri"/>
  <p:option name="separator" select="'&#xa;'"/>
</tr:re-attach-out-of-doc-PIs>

Copied from http://xproc.org/library/recursive-directory-list.xpl

Copyright situation unclear.

Changed the namespace prefix from l to tr (and the namespaces accordingly).

Prepended a cxf:info step because the step would fail sometimes with Calabash 1.1.4 even if there was a try/catch around it.

In addition, in order to deal with a similar error, replaced p:value-available() with default values (empty strings) for include-filter and exclude-filter.

Import

<p:import href="http://transpect.io/xproc-util/recursive-directory-list/xpl/recursive-directory-list.xpl"/>

Synopsis

<tr:recursive-directory-list xmlns:tr="http://transpect.io">
  <p:output port="result"/>
  <p:option name="path" required="true"/>
  <p:option name="include-filter" select="''"/>
  <p:option name="exclude-filter" select="''"/>
  <p:option name="depth" select="-1"/>
</tr:recursive-directory-list>

The purpose of this identity transformation is to remove all namespace declarations. The step prevents XProc from writing all prefixes declared in the pipeline into the output.

Import

<p:import href="http://transpect.io/xproc-util/remove-ns-decl-and-xml-base/xpl/remove-ns-decl-and-xml-base.xpl"/>

Synopsis

<tr:remove-ns-decl-and-xml-base xmlns:tr="http://transpect.io">
  <p:input port="source"/>
  <p:output port="result"/>
</tr:remove-ns-decl-and-xml-base>

This step takes a c:param-set document as input. Parameter references that follow the syntax {$name} are resolved with matching parameters from this document. For example, the reference {$isbn} will be replaced with the @value of the c:param element whose @name attribute is 'isbn'.

Given this input document:

<c:param-set xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:param name="isbn" value="(97[89]){1}\d{9}"/>
  <c:param name="epub-filename" value="{$isbn}\.epub"/>
</c:param-set>

This step would resolve the isbn parameter in c:param[@name eq 'epub-filename'] and generate this output:

<c:param-set xmlns:c="http://www.w3.org/ns/xproc-step">
  <c:param name="isbn" value="(97[89]){1}\d{9}"/>
  <c:param name="epub-filename" value="(97[89]){1}\d{9}\.epub"/>
</c:param-set>

Import

<p:import href="http://transpect.io/xproc-util/resolve-params/xpl/resolve-params.xpl"/>

Dependencies

xslt-util

Synopsis

<tr:resolve-params xmlns:tr="http://transpect.io">
  <p:input port="source"/>
  <p:output port="result"/>
</tr:resolve-params>
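The following sketch feeds the example parameter set to the step inline. The wrapper pipeline is hypothetical, and the children are written as c:param elements, as the XProc vocabulary for c:param-set expects. XProc 1.0 is assumed, where the braces inside p:inline are passed through literally.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:c="http://www.w3.org/ns/xproc-step"
  xmlns:tr="http://transpect.io"
  version="1.0" name="resolve-params-demo">

  <p:output port="result" primary="true"/>

  <p:import href="http://transpect.io/xproc-util/resolve-params/xpl/resolve-params.xpl"/>

  <tr:resolve-params>
    <p:input port="source">
      <!-- the c prefix stays in scope because the inline document uses it -->
      <p:inline exclude-inline-prefixes="p tr">
        <c:param-set>
          <c:param name="isbn" value="(97[89]){1}\d{9}"/>
          <!-- refers to the isbn parameter defined above -->
          <c:param name="epub-filename" value="{$isbn}\.epub"/>
        </c:param-set>
      </p:inline>
    </p:input>
  </tr:resolve-params>

</p:declare-step>
```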

Creates XML messages as expected by cx:send-mail step. There are two possibilities to include attachments: 1. Use the input port="attachments" for XML files. 2. Provide a whitespace separated list of file URIs with option/@name=attachments. These files will be treated as binary data.

Import

<p:import href="http://transpect.io/xproc-util/send-mail/xpl/send-mail.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:mailing xmlns:tr="http://transpect.io">
  <p:input port="attachments" sequence="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="from" required="true"/>
  <p:option name="from-name" select="''"/>
  <p:option name="to" required="true"/>
  <p:option name="content" select="''"/>
  <p:option name="subject" select="'no subject'"/>
  <p:option name="attachments" select="''"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
</tr:mailing>

This step stores status messages as plain text files and prints them to the standard output (GI 2018-09-13: The latter doesn’t seem to be true). The step can be used everywhere in your pipeline. The input will be simply forwarded to the output without any transformations.

The input port named "msgs" expects a c:messages XML document. The status messages must be wrapped in c:message elements.

For localized messages, you can use multiple c:message elements, each with an xml:lang attribute. The attribute value must be a language code according to ISO 639-1.

<tr:simple-progress-msg file="trdemo-paths.txt">
  <p:input port="msgs">
    <p:inline>
      <c:messages>
        <c:message xml:lang="en">Generating File Paths</c:message>
        <c:message xml:lang="de">Generiere Dateisystempfade</c:message>
      </c:messages>
    </p:inline>
  </p:input>
  <p:with-option name="status-dir-uri" select="$status-dir-uri"/>
</tr:simple-progress-msg>

Sometimes you might want to switch off storing of messages altogether. You can do this by appending '?enabled=false' to the URI.

Import

<p:import href="http://transpect.io/xproc-util/simple-progress-msg/xpl/simple-progress-msg.xpl"/>

Synopsis

<tr:simple-progress-msg xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:input port="msgs"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="file"/>
  <p:option name="status-dir-uri" select="'status?enabled=false'"/>
</tr:simple-progress-msg>

This step redirects an error to a status text file and prints a cx:message. If option fail-on-error is set to true, the error is reproduced with an attached error code.

Import

<p:import href="http://transpect.io/xproc-util/simple-progress-msg/xpl/simple-progress-msg.xpl"/>

Synopsis

<tr:propagate-caught-error xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:output port="result" primary="true"/>
  <p:option name="fail-on-error" required="false" select="'false'"/>
  <p:option name="rule-family" required="false" select="'internal'"/>
  <p:option name="code" required="false" select="'tr:UNSP01'"/>
  <p:option name="step-type" required="false"/>
  <p:option name="severity" required="false" select="'fatal-error'"/>
  <p:option name="msg-file" required="false" select="'unspecified-error.txt'"/>
  <p:option name="status-dir-uri" required="false" select="'debug/status?enabled=false'"/>
</tr:propagate-caught-error>
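A sketch of how the step might be used inside p:try/p:catch, where the error document of the p:catch is passed to the source port. The wrapper pipeline, its options and the msg-file value are assumptions made for illustration. The import URI is the one listed above.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="guarded-load-demo">

  <p:output port="result" primary="true"/>

  <!-- hypothetical pipeline options -->
  <p:option name="href" required="true"/>
  <p:option name="status-dir-uri" select="'debug/status?enabled=false'"/>

  <p:import href="http://transpect.io/xproc-util/simple-progress-msg/xpl/simple-progress-msg.xpl"/>

  <p:try name="try">
    <p:group>
      <!-- fails if the document cannot be loaded -->
      <p:load>
        <p:with-option name="href" select="$href"/>
      </p:load>
    </p:group>
    <p:catch name="catch">
      <tr:propagate-caught-error fail-on-error="false" msg-file="load-failed.txt">
        <!-- the error port of p:catch carries the c:errors document -->
        <p:input port="source">
          <p:pipe port="error" step="catch"/>
        </p:input>
        <p:with-option name="status-dir-uri" select="$status-dir-uri"/>
      </tr:propagate-caught-error>
    </p:catch>
  </p:try>

</p:declare-step>
```

With fail-on-error set to 'false', the pipeline continues with whatever the step emits on its result port. Setting it to 'true' would re-raise the error, as described above.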

Import

<p:import href="http://transpect.io/xproc-util/store-debug/xpl/store-debug.xpl"/>

Synopsis

<tr:store-debug xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true"/>
  <p:output port="result" sequence="true"/>
  <p:option name="active" required="false" select="'no'"/>
  <p:option name="pipeline-step" required="true"/>
  <p:option name="default-uri" required="false" select="resolve-uri('debug')"/>
  <p:option name="base-uri" required="false" select="''"/>
  <p:option name="extension" required="false" select="''"/>
  <p:option name="indent" required="false" select="'true'"/>
</tr:store-debug>
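A usage sketch based on the option names in the synopsis. The wrapper pipeline, its debug and debug-dir-uri options, and the pipeline-step label 'demo/after-identity' are hypothetical. The sketch assumes that the label determines where the debug output is stored below base-uri.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
  xmlns:tr="http://transpect.io"
  version="1.0" name="store-debug-demo">

  <p:input port="source" primary="true" sequence="true"/>
  <p:output port="result" primary="true" sequence="true"/>

  <!-- hypothetical pipeline options, mirroring the debug options of other steps -->
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>

  <p:import href="http://transpect.io/xproc-util/store-debug/xpl/store-debug.xpl"/>

  <!-- stands in for any step whose output should be inspected -->
  <p:identity name="some-step"/>

  <tr:store-debug pipeline-step="demo/after-identity">
    <p:with-option name="active" select="$debug"/>
    <p:with-option name="base-uri" select="$debug-dir-uri"/>
  </tr:store-debug>

</p:declare-step>
```

Since the step has a result port, it can sit in the middle of a pipeline and the following step can consume its output.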

Import

<p:import href="http://transpect.io/xproc-util/store-zip/xpl/store-zip.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:store-zip xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:output port="result" sequence="false" primary="true"/>
  <p:option name="target-zip-uri" required="true"/>
  <p:option name="default-compression-method" required="false" select="'deflated'"/>
  <p:option name="default-compression-level" required="false" select="'default'"/>
  <p:option name="default-command" required="false" select="'update'"/>
  <p:option name="additional-file-uris-to-zip-root" required="false" select="''"/>
  <p:option name="debug" select="'yes'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
</tr:store-zip>

Converts text-based configuration files (yaml, in particular) to <c:param-set>s.

It does not yet cope with hashes or arrays as found in yaml files. This will be postponed until params can be expressed as maps.

Import

<p:import href="http://transpect.io/xproc-util/text2params/xpl/text2params.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:text2params xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:option name="file" required="true"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
</tr:text2params>

Send local files to Virus Total and get a Schematron report.

This XProc step uses the Virus Total API to upload and scan local files. Scan reports are retrieved with p:http-request and validated with Schematron.

Note: requires a Virus Total account, please see option api-key.

$ sh calabash/calabash.sh -Xtransparent-json -Xjson-flavor=marklogic \
      -o report=svrl.xml virustotal.xpl href=test.txt api-key=myRandomKey

Get more information on Virus Total at https://www.virustotal.com

Import

<p:import href="http://transpect.io/xproc-util/virustotal/xpl/virustotal.xpl"/>

Synopsis

<tr:virustotal xmlns:tr="http://transpect.io">
  <p:output port="result" primary="true"/>
  <p:output port="report" sequence="true" primary="false"/>
  <p:option name="scan-url" select="'https://www.virustotal.com/vtapi/v2/file/scan'"/>
  <p:option name="report-url" select="'https://www.virustotal.com/vtapi/v2/file/report'"/>
  <p:option name="href"/>
  <p:option name="api-key" select="''"/>
</tr:virustotal>

Import

<p:import href="http://transpect.io/xproc-util/xml-model/xpl/prepend-hub-xml-model.xpl"/>

Synopsis

<tr:prepend-hub-xml-model xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="hub-version" required="true"/>
</tr:prepend-hub-xml-model>

Import

<p:import href="http://transpect.io/xproc-util/xml-model/xpl/prepend-xml-model.xpl"/>

Synopsis

<tr:prepend-xml-model xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:input port="models" sequence="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:option name="hub-version" required="false" select="''"/>
</tr:prepend-xml-model>

Import

<p:import href="http://transpect.io/xproc-util/xslt-mode/xpl/xslt-mode.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:xslt-mode xmlns:tr="http://transpect.io">
  <p:input port="source" sequence="true" primary="true"/>
  <p:input port="stylesheet"/>
  <p:input port="models" sequence="true"/>
  <p:input port="parameters" sequence="true" primary="true"/>
  <p:output port="result" sequence="true" primary="true"/>
  <p:output port="secondary" sequence="true" primary="false"/>
  <p:output port="report" sequence="true" primary="false"/>
  <p:option name="mode" required="true"/>
  <p:option name="prefix" required="false" select="'default'"/>
  <p:option name="msg" required="false" select="'no'"/>
  <p:option name="debug" required="false" select="'no'"/>
  <p:option name="indent" required="false" select="'true'"/>
  <p:option name="debug-dir-uri" required="true"/>
  <p:option name="debug-indent" required="false" select="'true'"/>
  <p:option name="status-dir-uri" select="concat(replace($debug-dir-uri, '^(.+)\?.*$', '$1'), '/status')"/>
  <p:option name="fail-on-error" select="'no'"/>
  <p:option name="store-secondary" select="'yes'"/>
  <p:option name="secondary-serialization-method" select="''"/>
  <p:option name="adjust-doc-base-uri" select="'yes'"/>
  <p:option name="hub-version" required="false" select="''"/>
</tr:xslt-mode>

Extends the pxp:zip step to check whether all items referenced in a manifest are available.

Import

<p:import href="http://transpect.io/xproc-util/zip/xpl/zip.xpl"/>

Dependencies

xproc-util transpect.github.io

Synopsis

<tr:zip xmlns:tr="http://transpect.io">
  <p:input port="source" primary="true"/>
  <p:output port="result"/>
  <p:output port="report" sequence="true"/>
  <p:option name="debug" select="'no'"/>
  <p:option name="debug-dir-uri" select="'debug'"/>
  <p:option name="command" required="false" select="'create'"/>
  <p:option name="compression-method" required="false" select="'deflated'"/>
  <p:option name="compression-level" required="false" select="'default'"/>
  <p:option name="href" required="true"/>
</tr:zip>

Git URL	`https://github.com/transpect/xproc-util.git`
SVN URL	`https://github.com/transpect/xproc-util`
Base URI	`http://transpect.io/xproc-util/`

xproc-util

XProc utilities for transpect
