epub:convert epub-convert

epubtools/xpl/epub-convert.xpl

Import URI: http://transpect.io/epubtools/xpl/epub-convert.xpl

This step takes an HTML file as input and converts it to an EPUB file. It requires a configuration for the HTML splitting and for the OPF metadata; examples can be found in the sample directory. Invoke this step on the command line with:

calabash/calabash.sh -i source=sample/b978-3-646-92351-3.xhtml \
  -i conf=sample/hierarchy.xml -i meta=sample/epub-config.xml epub-convert.xpl

It is advisable to pass all file inputs as absolute URIs, using cygpath on Cygwin or readlink -f on Unix-like systems. In bash on Cygwin, for example: source=file:/$(cygpath -ma sample/b978-3-646-92351-3.xhtml)

Input Ports

Name | Documentation | Connections

source

An XHTML file (Version number irrelevant; will be output as 1.1), either loaded from a physical location on disk or, alternatively, with its base URI set in an /*/@xml:base attribute. It is important that the source document have a base URI because the locations of all referenced files (CSS, images) will be determined relative to this base URI.
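When the XHTML is produced in memory by a preceding step rather than loaded from disk, the base URI can be carried on the root element. The following is a minimal sketch; the directory and file names are purely illustrative and not taken from the sample directory.

```xml
<!-- Hypothetical location; relative references resolve against xml:base -->
<html xmlns="http://www.w3.org/1999/xhtml"
      xml:base="file:/home/user/book/book.xhtml">
  <head>
    <title>Sample</title>
    <!-- resolved as file:/home/user/book/css/styles.css -->
    <link rel="stylesheet" type="text/css" href="css/styles.css"/>
  </head>
  <body>
    <!-- resolved as file:/home/user/book/images/cover.jpg -->
    <p><img src="images/cover.jpg" alt="Cover"/></p>
  </body>
</html>
```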

conf

/hierarchy, config file for HTML splitter (see sample/hierarchy.xml).

The hierarchy may instead be included in the document on the meta port as /epub-config/hierarchy, in which case no separate document needs to be submitted to this port.

meta

/epub-config – an EPUB file’s metadata and other configuration settings (see sample/epub-config.xml for an example).

Please note that the name “meta” is misleading since the file contains more than just metadata.
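A skeleton of such a configuration document might look as follows. Only the format attribute, the cover/@svg attribute and the embedded hierarchy element are taken from the XPath expressions on this page; everything else a real configuration needs (see sample/epub-config.xml) is deliberately left out, and the attribute values shown are just examples.

```xml
<!-- Skeleton only: metadata and further settings are omitted -->
<epub-config format="EPUB3">
  <!-- cover/@svg is the fallback for the use-svg option -->
  <cover svg="true"/>
  <!-- Embedded splitter configuration, replacing a separate conf document -->
  <hierarchy>
    <!-- splitter rules as in sample/hierarchy.xml -->
  </hierarchy>
</epub-config>
```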

schematron

  • Default document: ../schematron/epub.sch.xml

custom-schematron

Additional Schematron checks. See debug/epubtools/input-for-schematron.xml for an example of the input format (after running this once with debug=yes). The Schematron files should have a /*/@tr:rule-family attribute that identifies the schema’s rule set for the purpose of report generation.
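A custom schema for this port could be sketched as below. This assumes ISO Schematron with an XSLT 2 query binding and assumes that the tr prefix is bound to http://transpect.io; the rule family name, the ids and the rule itself are hypothetical. The rule context is written as a relative pattern so that it does not depend on where exactly the OPF document sits inside the wrapped input.

```xml
<schema xmlns="http://purl.oclc.org/dsdl/schematron"
        xmlns:tr="http://transpect.io"
        queryBinding="xslt2"
        tr:rule-family="my-publisher-epub">
  <!-- tr:rule-family names this rule set in the generated report -->
  <ns prefix="opf" uri="http://www.idpf.org/2007/opf"/>
  <ns prefix="dc" uri="http://purl.org/dc/elements/1.1/"/>
  <pattern id="opf-metadata">
    <rule context="opf:package/opf:metadata">
      <!-- Hypothetical check: at least one non-empty title -->
      <assert test="dc:title[normalize-space()]" id="opf-title-present"
              role="error">The OPF metadata should contain a non-empty dc:title.</assert>
    </rule>
  </pattern>
</schema>
```

If the attribute is missing, the pipeline falls back to the rule family 'epubtools-custom', as the attribute-value expression in the subpipeline shows.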

Output Ports

Name | Documentation | Connections

result

chunks

opf

files

report

html

baseuri

input-for-schematron

Options

Name | Documentation | Default

target

''

terminate-on-error

'yes'

clean-target-dir

Whether to erase the target directory before splitting and the subsequent steps. If it is not erased, files from previous conversions might be included in the resulting zip file.

'no'

debug

'no'

use-svg

''

debug-dir-uri

'debug'

status-dir-uri

'status'
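Instead of calling the step from the command line, it can be imported into another pipeline and the options set as attributes. The sketch below rests on several assumptions that this page does not confirm: that the epub prefix is bound to http://transpect.io/epubtools, that the import URI is resolved through an XML catalog, that result is the step's primary output, and that the conf and custom-schematron ports accept an empty sequence. The wrapper name my-epub-pipeline is hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:epub="http://transpect.io/epubtools"
                version="1.0" name="my-epub-pipeline">

  <p:input port="source" primary="true"/>
  <p:input port="meta"/>
  <p:output port="result" primary="true">
    <p:pipe port="result" step="convert"/>
  </p:output>

  <!-- Resolving this URI requires a matching XML catalog entry -->
  <p:import href="http://transpect.io/epubtools/xpl/epub-convert.xpl"/>

  <!-- Options are passed with the attribute shortcut syntax -->
  <epub:convert name="convert" target="EPUB3"
                clean-target-dir="yes" debug="yes" debug-dir-uri="debug">
    <p:input port="source">
      <p:pipe port="source" step="my-epub-pipeline"/>
    </p:input>
    <p:input port="meta">
      <p:pipe port="meta" step="my-epub-pipeline"/>
    </p:input>
    <!-- Assumes /epub-config/hierarchy is embedded in the meta document -->
    <p:input port="conf">
      <p:empty/>
    </p:input>
    <!-- Assumes no additional Schematron checks are wanted -->
    <p:input port="custom-schematron">
      <p:empty/>
    </p:input>
  </epub:convert>

</p:declare-step>
```

The schematron port is left unconnected here so that its default document, ../schematron/epub.sch.xml, applies.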

Subpipeline

Step | Inputs | Outputs | Options

p:variable wrap-cover-in-svg

meta on epub-convert

($use-svg[not(. = '')], /epub-config/cover/@svg, 'true')[1]

p:variable target-format

meta on epub-convert

($target[not(. = '')], /epub-config/@format, 'EPUB3')[1]
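Both variables use the same fallback idiom: a sequence of candidates from which the first existing item is taken. Reconstructed from the table entries above, the second declaration would look roughly like this in XProc source (the actual source may differ in layout):

```xml
<!-- Precedence: non-empty target option, then /epub-config/@format,
     then the literal default 'EPUB3' -->
<p:variable name="target-format"
            select="($target[not(. = '')], /epub-config/@format, 'EPUB3')[1]">
  <!-- The XPath context is the document on the meta port -->
  <p:pipe port="meta" step="epub-convert"/>
</p:variable>
```

Because the target option defaults to the empty string, the predicate not(. = '') removes it from the sequence unless the caller has set it explicitly, so a format declared in the configuration only takes effect when no option is passed.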

tr:simple-progress-msg start-msg

source

source on epub-convert

msgs

 <c:messages>
   <c:message xml:lang="en">Starting EPUB generation</c:message>
   <c:message xml:lang="de">Beginne EPUB-Erzeugung</c:message>
 </c:messages>

result

file = 'epub-convert-start.txt'

status-dir-uri = $status-dir-uri

tr:file-uri base-uri

The output files are stored relative to the base URI of the document on the primary input port.

source

result on start-msg

result

filename = (base-uri(/*), static-base-uri())[1]

epub:create-ocf create-ocf

meta

meta on epub-convert

result

base-uri = /c:result/@local-href

debug = $debug

debug-dir-uri = $debug-dir-uri

p:sink d250e163

source

result on create-ocf

p:label-elements srcpaths

For the epubtools Schematron checks, we need to add srcpaths on elements that don’t have them yet.

source

source on epub-convert

result

attribute = 'srcpath'

replace = 'false'

match = '*[local-name() = ( 'p', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'div', 'nav', 'section', 'main', 'ol', 'ul', 'li', 'dd', 'dt', 'td', 'th', 'em', 'span', 'b', 'i', 'strong', 'code', 'pre', 'a', 'img')]'
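In XProc source, this step would look roughly like the sketch below, with the match pattern shortened to three element names for readability. No label option is listed above, so presumably the XProc default label expression, concat("_", $p:index-in-document), generates the attribute values.

```xml
<!-- replace="false" keeps srcpath attributes that already exist -->
<p:label-elements name="srcpaths" attribute="srcpath" replace="false"
                  match="*[local-name() = ('p', 'h1', 'img')]">
  <p:input port="source">
    <p:pipe port="source" step="epub-convert"/>
  </p:input>
</p:label-elements>
```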

tr:store-debug d250e174

source

result on srcpaths

result

pipeline-step = 'epubtools/add-srcpaths-to-input'

extension = 'xhtml'

active = $debug

base-uri = $debug-dir-uri

epub:create-ops create-ops

source

result on d250e174

conf

conf on epub-convert

meta

meta on epub-convert

result

base-uri = /c:result/@local-href

target = $target-format

use-svg = $wrap-cover-in-svg

terminate-on-error = $terminate-on-error

debug = $debug

debug-dir-uri = $debug-dir-uri

epub:create-opf create-opf

source

files on create-ops

result on create-ops

meta

meta on epub-convert

result

base-uri = /c:result/@local-href

target = $target-format

terminate-on-error = $terminate-on-error

use-svg = $wrap-cover-in-svg

debug = $debug

debug-dir-uri = $debug-dir-uri

p:sink d250e243

source

result on create-opf

p:choose conditionally-remove-nav-from-filelist-if-epub2

$target-format = ('EPUB2', 'KF8')

p:delete discard-epub2-nav-html

nav.xhtml is only carried along for creating the guide element in EPUB2.

source

files on create-ops

result

match = '/*/c:file[matches(@name, 'nav\.xhtml$')]'

p:otherwise

p:identity d250e266

source

files on create-ops

result
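Reassembled from the table entries, the conditional for the file list corresponds roughly to the following fragment. It assumes that the p and c prefixes are declared on an enclosing element; the real source may differ in detail.

```xml
<p:choose name="conditionally-remove-nav-from-filelist-if-epub2">
  <p:when test="$target-format = ('EPUB2', 'KF8')">
    <!-- Drop the nav.xhtml entry from the file list -->
    <p:delete name="discard-epub2-nav-html"
              match="/*/c:file[matches(@name, 'nav\.xhtml$')]">
      <p:input port="source">
        <p:pipe port="files" step="create-ops"/>
      </p:input>
    </p:delete>
  </p:when>
  <p:otherwise>
    <!-- Other target formats: pass the file list through unchanged -->
    <p:identity>
      <p:input port="source">
        <p:pipe port="files" step="create-ops"/>
      </p:input>
    </p:identity>
  </p:otherwise>
</p:choose>
```

The comparison against a sequence is true if the variable equals any of the listed values.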

p:sink d250e275

source

p:choose conditionally-remove-nav-from-chunks-if-epub2

$target-format = 'EPUB2'

p:delete discard-epub2-nav

nav.xhtml is only carried along for creating the guide element in EPUB2.

source

result on create-ops

result

match = '/*/html:html[matches(@xml:base, 'nav\.xhtml$')]'

p:otherwise

p:identity d250e298

source

result on create-ops

result

p:sink d250e307

source

epub:zip-package zip-package

ocf-filerefs

files on create-ocf

ops-filerefs

result on conditionally-remove-nav-from-filelist-if-epub2

opf-fileref

files on create-opf

meta

meta on epub-convert

result

base-uri = /c:result/@local-href

debug = $debug

debug-dir-uri = $debug-dir-uri

cxf:info zip-info

href = /c:zipfile/@href

p:set-attributes insert-zip-info

source

result on zip-package

attributes

result on zip-info

result

match = '/*'

tr:file-uri output-file-name

source

result on insert-zip-info

result

filename = /c:zipfile/@href

p:sink d250e365

source

result on output-file-name

p:wrap-sequence wrap-for-schematron

source

meta on epub-convert

result on create-opf

result on conditionally-remove-nav-from-filelist-if-epub2

html on create-ops

splitting-report on create-ops

result on conditionally-remove-nav-from-chunks-if-epub2

result on insert-zip-info

result

wrapper = 'c:wrap'

tr:store-debug d250e413

source

result on wrap-for-schematron

result

pipeline-step = 'epubtools/input-for-schematron'

active = $debug

base-uri = $debug-dir-uri

p:sink d250e419

source

result on d250e413

p:for-each schematrons

schematron on epub-convert

custom-schematron on epub-convert

tr:oxy-validate-with-schematron sch0

source

result on wrap-for-schematron

schema

current on schematrons

parameters

p:empty

result

p:sink d250e451

source

result on sch0

p:add-attribute d250e453

source

report on sch0

result

match = '/*'

attribute-name = 'tr:rule-family'

attribute-value = (/*/@tr:rule-family, 'epubtools-custom')[1]

p:add-attribute sch

source

result on d250e453

result

match = '/*'

attribute-name = 'tr:step-name'

attribute-value = string-join( ( 'epubtools', ( /opf:package/opf:metadata/dc:identifier[@id = ../@unique-identifier], /opf:package/opf:metadata/dc:identifier, /opf:package/opf:metadata/dc:title )[1] ), ' ' )

p:sink d250e475

source

tr:simple-progress-msg success-msg

source

msgs

 <c:messages>
   <c:message xml:lang="en">EPUB generation finished (see the HTML report though – errors will be reported there)</c:message>
   <c:message xml:lang="de">EPUB-Erzeugung abgeschlossen (bitte im HTML-Report nachsehen, ob fehlerfrei)</c:message>
 </c:messages>

result

file = 'epub-convert-success.txt'

status-dir-uri = $status-dir-uri

p:sink d250e495

source

result on success-msg