The Code4Lib Journal – EPUB as Publication Format in Open Access Journals: Tools and Workflow
2. Transform the result document from step 1 to HTML.
3. Post-process the HTML document from step 2 to XHTML.
In our case, we have to go through all these steps, since our main goal is to
produce EPUB files, and EPUB requires XHTML 1.1 for the main content file.
The journals use APA as citation style, so we chose the APA transformation as the
base XSLT style sheet for references in the pipeline.
The main XSLT style sheet is jats-html.xsl, the web preview style sheet.
There is also a CSS style sheet for the web presentation, jats-preview.css.
Neither of them fits all our needs, so we had to customize them, just as we had to
do with the APA pre-processing style sheet. We had to recreate the look-and-feel
of the journal on the web and in the EPUB version.
We used a trial-and-error approach in the project. We succeeded in the end,
but eventually realized that merging changes directly into the NCBI style sheets
is not the proper way to do it. Piez calls this the ‘monolithic’ customization
method. With this method, all custom changes are lost when the toolset is upgraded. In his
article, Piez recommends ‘vertical customization’, which means creating separate
XSLT and CSS style sheets with custom templates and styles and then importing
the NCBI master style sheets into them. The custom templates and styles will
then override the master when needed, otherwise falling back on the master style
sheet. In XSLT, the xsl:import instruction does the trick; in CSS the @import
rule achieves the same.
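As a rough illustration of the vertical approach, the sketch below embeds a minimal customization layer in a Python string and checks that it is well-formed and that its xsl:import precedes every other top-level element, as XSLT requires. The overriding template (its match pattern, the h1 markup and the class name) and the version attribute are hypothetical and are not taken from our actual style sheets; only the imported file name jats-html.xsl comes from the text above.

```python
import xml.etree.ElementTree as ET

# Hypothetical customization layer. Only the href is taken from the article;
# the override template is an invented example.
CUSTOM_LAYER = """<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Pull in the untouched NCBI master; imports must come first. -->
  <xsl:import href="jats-html.xsl"/>

  <!-- Hypothetical override: takes priority over a master template with
       the same match pattern and mode, because the importing style sheet
       has higher import precedence. -->
  <xsl:template match="article-title">
    <h1 class="journal-title"><xsl:apply-templates/></h1>
  </xsl:template>
</xsl:stylesheet>
"""


def top_level_names(xslt_text):
    """Return the local names of the style sheet's top-level elements, in order."""
    root = ET.fromstring(xslt_text.encode("utf-8"))
    # Tags look like "{namespace}local"; keep only the local part.
    return [child.tag.split("}", 1)[-1] for child in root]


def imports_come_first(xslt_text):
    """True if no xsl:import appears after another top-level element."""
    names = top_level_names(xslt_text)
    first_other = next(
        (i for i, name in enumerate(names) if name != "import"), len(names)
    )
    return "import" not in names[first_other:]


print(top_level_names(CUSTOM_LAYER))
print(imports_come_first(CUSTOM_LAYER))
```

The CSS counterpart follows the same pattern: a custom sheet that starts with an @import of jats-preview.css, followed by the overriding rules.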
Our custom style sheets (hioa-citations-prep-APAcit.xsl, hioa-xhtml.xsl
and hioa-web.css) are provided with the article. The particular
adjustments we have made will probably be of no use to others, but we would
like to stress the importance of creating a vertical customization from the start. It
will save you hours of work compared to separating your changes from the master style
sheets.
It is also worth mentioning that the reference pre-processing style sheets from
NCBI use XSLT 2.0, which XML Copy Editor does not support. To test
these transformations before the final XProc pipeline was in place, we had to
download an XSLT 2.0 processor. We chose Saxon-HE.
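A single test transformation with Saxon-HE can be run from the command line. The sketch below only builds the command, using Saxon's -s, -xsl and -o options. The jar path and the input and output file names are assumptions that depend on the local setup; only the style sheet name comes from this article.

```python
import subprocess


def saxon_command(saxon_jar, source, stylesheet, output):
    """Build a Saxon command line for one XSLT transformation.

    saxon_jar is the path to the downloaded Saxon-HE jar (assumed location).
    """
    return [
        "java", "-jar", str(saxon_jar),
        f"-s:{source}",        # source XML document
        f"-xsl:{stylesheet}",  # XSLT style sheet to apply
        f"-o:{output}",        # result document
    ]


def run_transform(saxon_jar, source, stylesheet, output):
    """Run the transformation; raises CalledProcessError if Saxon fails."""
    subprocess.run(saxon_command(saxon_jar, source, stylesheet, output), check=True)


# Hypothetical file names, for illustration only.
cmd = saxon_command(
    "saxon-he.jar", "article.xml",
    "hioa-citations-prep-APAcit.xsl", "article-apa.xml",
)
print(" ".join(cmd))
```

Calling run_transform with the same arguments would execute the step, provided Java and the jar are available.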
Packaging the EPUB file
The file structure of an EPUB file is described in the previous section titled
Output: EPUB. All the mandatory files in the file structure are either static files or
results of XSLT transformations of the JATS source file.
The XHTML version of the article is the output from the transformation described
in the previous section. We use our customized CSS style sheet for presentation.
These and any image files (GIF, JPEG, PNG) must be copied to the OEBPS
folder or subfolders. The mimetype and container.xml files are static. Only two
files, content.opf (the EPUB root file) and toc.ncx (the table of contents),
depend on the content of the particular EPUB file.
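Once all files are in place, the folder has to be zipped into the .epub container. The sketch below is one way to do this with Python's standard library and is not necessarily how our pipeline does it. It relies on a requirement from the EPUB container specification that is not spelled out above: the mimetype entry must be the first one in the archive and must be stored uncompressed. The function name and the folder layout passed to it are assumptions.

```python
import zipfile
from pathlib import Path


def package_epub(build_dir, epub_path):
    """Zip an unpacked EPUB folder (mimetype, META-INF/, OEBPS/) into epub_path."""
    build_dir = Path(build_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else is added with normal deflate compression.
        for path in sorted(build_dir.rglob("*")):
            arcname = path.relative_to(build_dir).as_posix()
            if path.is_file() and arcname != "mimetype":
                zf.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A call such as package_epub("build", "article.epub") would package a hypothetical build folder; the resulting file can then be checked with an EPUB validator.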
We can generate both of these files from the source JATS file using XSLT. We
provide two style sheets: epub-content.opf.xsl for the EPUB root file and
epub-toc.ncx.xsl for the table of contents file. They should be applicable to
any other journal using the process described here. The XSLT
style sheet fetches the variable file content (title, author, metadata etc.) from the