epub:convert epub-convert
epubtools/xpl/epub-convert.xpl
Import URI: http://transpect.io/epubtools/xpl/epub-convert.xpl
This step takes an HTML file as input and converts it to an EPUB file. You need a configuration for the HTML splitting and for the OPF metadata. Examples can be found in the sample directory. Invoke this step on the command line with:
calabash/calabash.sh -i source=sample/b978-3-646-92351-3.xhtml -i conf=sample/hierarchy.xml -i meta=sample/epub-config.xml epub-convert.xpl
Note that it’s advisable to make all file inputs absolute URIs, by using cygpath on Cygwin or readlink -f on Unix-like systems. In bash on Cygwin, for example: source=file:/$(cygpath -ma sample/b978-3-646-92351-3.xhtml)
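The advice about absolute URIs can be sketched as a small shell helper. This is only an illustration: `abs_file_uri` and `epub_convert_dry_run` are hypothetical names that are not part of epubtools, and the sketch assumes that a `readlink` supporting `-f` is available on non-Cygwin systems.

```shell
#!/bin/sh
# Sketch only: abs_file_uri is a hypothetical helper, not part of epubtools.
# It prints an absolute file: URI for a (possibly relative) path.
abs_file_uri() {
  if command -v cygpath >/dev/null 2>&1; then
    # Cygwin: -m yields forward slashes, -a an absolute path such as
    # C:/work/in.xhtml, so a slash has to be inserted after "file:".
    printf 'file:/%s\n' "$(cygpath -ma "$1")"
  else
    # Unix-like systems: readlink -f prints a path that already starts with "/".
    printf 'file:%s\n' "$(readlink -f "$1")"
  fi
}

# Print (rather than run) the resulting invocation, so the sketch has no side effects.
epub_convert_dry_run() {
  printf '%s ' calabash/calabash.sh \
    -i "source=$(abs_file_uri sample/b978-3-646-92351-3.xhtml)" \
    -i "conf=$(abs_file_uri sample/hierarchy.xml)" \
    -i "meta=$(abs_file_uri sample/epub-config.xml)" \
    epub-convert.xpl
  printf '\n'
}
```

Calling `epub_convert_dry_run` from the epubtools directory prints the command line with resolved URIs. Paths that contain spaces or other reserved characters would additionally need percent-encoding, which this sketch does not attempt.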
Input Ports
Name | Documentation | Connections |
---|---|---|
sourceⓅ | An XHTML file (Version number irrelevant; will be output as 1.1), either loaded from a physical location on disk or, alternatively, with its base URI set in an /*/@xml:base attribute. It is important that the source document have a base URI because the locations of all referenced files (CSS, images) will be determined relative to this base URI. | |
confⓈ | /hierarchy, the config file for the HTML splitter (see sample/hierarchy.xml). It may instead be included in the document on the meta port as /epub-config/hierarchy, so that you don’t have to submit a separate document to this port. | |
meta | /epub-config – an EPUB file’s metadata and other configuration settings (see sample/epub-config.xml for an example). Please note that the name “meta” is misleading since the file contains more than just metadata. | |
schematron | | |
custom-schematronⓈ | Additional Schematron checks. See debug/epubtools/input-for-schematron.xml for an example of the input format (after running this once with debug=yes). The Schematron files should have a /*/@tr:rule-family attribute that identifies the schema’s rule set for the purpose of report generation. | |
Output Ports
Name | Documentation | Connections |
---|---|---|
resultⓅ | | |
chunks | | |
opf | | |
files | | |
reportⓈ | | |
html | | |
baseuri | | |
input-for-schematron | | |
Options
Name | Documentation | Default |
---|---|---|
target | | '' |
terminate-on-error | | 'yes' |
clean-target-dir | Whether to erase the target directory prior to splitting etc. Otherwise, files from previous conversions might be included in the resulting zip file. | 'no' |
debug | | 'no' |
use-svg | | '' |
debug-dir-uri | | 'debug' |
status-dir-uri | | 'status' |
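Options can be supplied on the command line as well. The following sketch assumes the usual XML Calabash convention of appending `name=value` pairs after the pipeline file name. The wrapper `epub_convert_cmdline` is a hypothetical helper that only prints the command. The values shown (`EPUB2`, `yes`) are taken from values that appear elsewhere in this description, not from an authoritative list of permitted values.

```shell
#!/bin/sh
# Sketch only: epub_convert_cmdline is a hypothetical wrapper, not part of epubtools.
# It prints the Calabash invocation, appending any name=value options passed to it.
# Assumption: XML Calabash accepts step options as name=value after the pipeline name.
epub_convert_cmdline() {
  printf '%s ' calabash/calabash.sh \
    -i source=sample/b978-3-646-92351-3.xhtml \
    -i conf=sample/hierarchy.xml \
    -i meta=sample/epub-config.xml \
    epub-convert.xpl "$@"
  printf '\n'
}

# Example: request EPUB2 output, keep debug files, and start from a clean target directory.
epub_convert_cmdline target=EPUB2 debug=yes clean-target-dir=yes
```

Options that are omitted keep the defaults listed in the table above. An empty target, for instance, leaves the choice of format to the meta document.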
Subpipeline
Step | Inputs | Outputs | Options |
---|---|---|---|
p:variable wrap-cover-in-svg | meta on epub-convert | | ($use-svg[not(. = '')], /epub-config/cover/@svg, 'true')[1] |
p:variable target-format | meta on epub-convert | | ($target[not(. = '')], /epub-config/@format, 'EPUB3')[1] |
tr:simple-progress-msg start-msg | | result | file = 'epub-convert-start.txt'; status-dir-uri = $status-dir-uri |
tr:file-uri base-uri (The output files are stored relative to the base URI of the document on the primary input port.) | | result | filename = (base-uri(/*), static-base-uri())[1] |
epub:create-ocf create-ocf | | result | base-uri = /c:result/@local-href; debug = $debug; debug-dir-uri = $debug-dir-uri |
p:sink d250e163 | | | |
p:label-elements srcpaths (For the epubtools Schematron checks, we need to add srcpaths on elements that don’t have them yet.) | | result | attribute = 'srcpath'; replace = 'false'; match = '*[local-name() = ( 'p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'div', 'nav', 'section', 'main', 'ol', 'ul', 'li', 'dd', 'dt', 'td', 'th', 'em', 'span', 'b', 'i', 'strong', 'code', 'pre', 'a', 'img')]' |
tr:store-debug d250e174 | | result | pipeline-step = 'epubtools/add-srcpaths-to-input'; extension = 'xhtml'; active = $debug; base-uri = $debug-dir-uri |
epub:create-ops create-ops | | result | base-uri = /c:result/@local-href; target = $target-format; use-svg = $wrap-cover-in-svg; terminate-on-error = $terminate-on-error; debug = $debug; debug-dir-uri = $debug-dir-uri |
epub:create-opf create-opf | | result | base-uri = /c:result/@local-href; target = $target-format; terminate-on-error = $terminate-on-error; use-svg = $wrap-cover-in-svg; debug = $debug; debug-dir-uri = $debug-dir-uri |
p:sink d250e243 | | | |
p:choose | | | |
$target-format = ('EPUB2', 'KF8') | | | |
nav.xhtml is only carried along for creating the guide element in EPUB2 | | | match = '/*/c:file[matches(@name, 'nav\.xhtml$')]' |
p:otherwise | | | |
p:identity d250e266 | | result | |
p:sink d250e275 | | | |
p:choose | | | |
$target-format = 'EPUB2' | | | |
nav.xhtml is only carried along for creating the guide element in EPUB2 | | | match = '/*/html:html[matches(@xml:base, 'nav\.xhtml$')]' |
p:otherwise | | | |
p:identity d250e298 | | result | |
p:sink d250e307 | | | |
epub:zip-package zip-package | | result | base-uri = /c:result/@local-href; debug = $debug; debug-dir-uri = $debug-dir-uri |
cxf:info zip-info | | | href = /c:zipfile/@href |
p:set-attributes insert-zip-info | | result | match = '/*' |
tr:file-uri output-file-name | | result | filename = /c:zipfile/@href |
p:sink d250e365 | | | |
p:wrap-sequence wrap-for-schematron | | result | wrapper = 'c:wrap' |
tr:store-debug d250e413 | | result | pipeline-step = 'epubtools/input-for-schematron'; active = $debug; base-uri = $debug-dir-uri |
p:sink d250e419 | | | |
p:for-each schematrons | | result | |
p:sink d250e451 | | | |
p:add-attribute d250e453 | | result | match = '/*'; attribute-name = 'tr:rule-family'; attribute-value = (/*/@tr:rule-family, 'epubtools-custom')[1] |
p:add-attribute sch | | result | match = '/*'; attribute-name = 'tr:step-name'; attribute-value = string-join( ( 'epubtools', ( /opf:package/opf:metadata/dc:identifier[@id = ../@unique-identifier], /opf:package/opf:metadata/dc:identifier, /opf:package/opf:metadata/dc:title )[1] ), ' ' ) |
p:sink d250e475 | | | |
tr:simple-progress-msg success-msg | | result | file = 'epub-convert-success.txt'; status-dir-uri = $status-dir-uri |
p:sink d250e495 | | | |
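The two p:variable rows in the Subpipeline table pick their value with the XPath idiom `(a, b, c)[1]`: a non-empty option wins, then the attribute in the meta document, then a hard-coded default. The shell sketch below mimics that precedence for target-format. The function names are hypothetical, and the sketch simplifies slightly by treating an empty config value as absent, whereas the XPath expression would select an attribute that is present but empty.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers that imitate the (a, b, c)[1] fallback chain.

# Print the first argument that is not the empty string; fail if there is none.
first_nonempty() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# $1 = value of the target option, $2 = /epub-config/@format from the meta document.
resolve_target_format() {
  first_nonempty "$1" "$2" EPUB3
}

resolve_target_format ''  ''      # neither given: falls back to EPUB3
resolve_target_format ''  EPUB2   # config value is used
resolve_target_format KF8 EPUB2   # explicit option overrides the config
```

The same pattern applies to wrap-cover-in-svg, with the use-svg option, /epub-config/cover/@svg, and the default 'true' as the three candidates.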