PDF Converter Services - User & Developer Guide - Version 8.0 - 07/12/15
© Copyright 2015, Muhimbi Ltd
- {dim} is the dimension: either empty (meaning inches) or "mm", "in",
"in.", "inch" or "inches".
o PageOrientation: The orientation used by the TOC. Portrait,
  Landscape or Default. The Default option uses the same orientation as
  the page following (or preceding) the TOC depending on the value
  specified in Location.
o PaperSize: A named paper size such as A4 or Letter (see MSDN) or a
  custom size in "{width}{dim}{sep}{height}{dim}" format where:
  - {width} and {height} are numerical values (please use a period '.' as
    the decimal separator).
  - {dim} is the dimension, which can be 'mm', 'in.' or 'inches'. (It
    defaults to inches when nothing is specified.)
  - {sep} separates the width and the height: either 'by', a comma (,) or
    the letter 'x'.
  Example: "8.5 in. by 6 in."
o Properties: Optional properties to pass to the XSL template for display
  or processing purposes. For details see below.
o Template: The XSL template (See 8.3) to use for formatting purposes.
  This can either be a string containing all the XSL, a path - local to the
  server running the conversion service - to the location of the XSL file, or
  a URL to the XSL file on a web (or SharePoint) server.
NameValuePair: A single value that can be passed into the XSL using
TOCSettings.Properties.
TOCLocation: Used by TOCSettings.Location to determine where the
TOC should go.
BookmarkGenerationOption: As explained in XML Source Data (8.2), the
TOC system is based on the content and structure of PDF Bookmarks. It is
therefore essential that during the conversion of the source documents
ConversionSettings.GenerateBookmarks is set to Automatic.
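The custom PaperSize format described above can be illustrated with a small parser. This is only a sketch of the documented "{width}{dim}{sep}{height}{dim}" syntax, not the conversion service's own parsing code. The PaperSizeSketch class and its ParseInches method are hypothetical names, and the tolerance for extra whitespace and mixed case is an assumption.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper: NOT part of the Muhimbi API. It only illustrates the
// documented "{width}{dim}{sep}{height}{dim}" custom paper size syntax.
public static class PaperSizeSketch
{
    // {dim} is optional ('mm', 'in.' or 'inches'); {sep} is 'by', ',' or 'x'.
    private static readonly Regex Pattern = new Regex(
        @"^\s*(?<w>\d+(\.\d+)?)\s*(?<wd>mm|in\.|inches)?\s*(by|,|x)\s*" +
        @"(?<h>\d+(\.\d+)?)\s*(?<hd>mm|in\.|inches)?\s*$",
        RegexOptions.IgnoreCase);

    // Returns { width, height } in inches, or throws a FormatException.
    public static double[] ParseInches(string size)
    {
        Match m = Pattern.Match(size);
        if (!m.Success)
            throw new FormatException("Unrecognised paper size: " + size);

        return new[]
        {
            ToInches(m.Groups["w"].Value, m.Groups["wd"].Value),
            ToInches(m.Groups["h"].Value, m.Groups["hd"].Value)
        };
    }

    private static double ToInches(string number, string dim)
    {
        // The invariant culture always uses '.' as the decimal separator.
        double value = double.Parse(number, CultureInfo.InvariantCulture);

        // An empty {dim} defaults to inches, as the guide specifies.
        return dim.Equals("mm", StringComparison.OrdinalIgnoreCase)
            ? value / 25.4
            : value;
    }
}

public static class Program
{
    public static void Main()
    {
        // The example string from the guide.
        double[] size = PaperSizeSketch.ParseInches("8.5 in. by 6 in.");
        Console.WriteLine("{0} x {1} inches", size[0], size[1]);
    }
}
```

Parsing with the invariant culture mirrors the guide's requirement that a period is used as the decimal separator, regardless of the regional settings of the machine running the code.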
Based on the previously described list of classes and properties, adding a TOC
may sound complex, but in practice it takes only a few lines of code. The
easiest way to get started is to take our sample code, add the following code
and then pass tocSettings into either ConversionSettings.TOCSettings or
MergeSettings.TOCSettings.
// ** Create any custom properties that need to be passed into the TOC.
NameValuePair[] properties = new NameValuePair[2];
properties[0] = new NameValuePair() { Name = "title", Value = "Development Guide" };
properties[1] = new NameValuePair() { Name = "status", Value = "Draft" };

// ** Specify the various TOCSettings
TOCSettings tocSettings = new TOCSettings
{
    MinimumEntries = 0,
    Bookmark = "Table Of Contents",
    Location = TOCLocation.Front,
    Properties = properties,
    Template = @"C:\templates\toc.xsl",
};

// ** Pass the TOC Settings into the conversion
conversionSettings.TOCSettings = tocSettings;
You are not limited to our sample code, but it is a good starting point. It is even
possible to pass the tocSettings to both ConversionSettings.TOCSettings AND
MergeSettings.TOCSettings to generate TOCs for each individual document in
a merge operation, and then add an overall TOC for the entire merged
document.
The big question is what to specify in the Template property. Read on for
details.
8.2 XML Source Data
To determine what entries to include in the TOC, the conversion service looks
at the Bookmarks present in the PDF file. If the source file is not already in PDF
format, it is converted to PDF and, where possible, PDF bookmarks are generated
based on the internal structure of the document. For example, when converting
an MS-Word file the various headings determine the structure of the PDF
Bookmarks.
Customers rarely need to know about the internals of the Muhimbi Conversion
Service, but in this particular case, by design, they do. Internally, an XML
document is generated that represents the content and structure of the PDF
Bookmarks. This XML document is then transformed into HTML using XSL. It is
this HTML, the language that underpins every website on the internet, that
determines the formatting of the TOC. Developers have full control over the
XSL, which provides a great deal of flexibility.
Let’s take our Administration Guide as an example. When it is converted to PDF
a set of nested PDF bookmarks is created, from which the following XML is
generated internally (truncated as it is several pages long).
<?xml version="1.0" encoding="utf-8"?>
<toc>
  <topics>
    <topic title="Administration Guide - TOC" target="[GUID]" level="0" page="1" />
    <topic title="1 Introduction" target="[GUID]" level="0" page="8">
      <topic title="1.1 Prerequisites" target="[GUID]" level="1" page="10" />
      <topic title="1.2 Solution architecture" target="[GUID]" level="1" page="11" />
    </topic>
    ....
    <topic title="Appendix - Licensing" target="[GUID]" level="0" page="69" />
  </topics>
  <properties>
    <property name="title">Some Document Title</property>
  </properties>
</toc>
The generated XML is fairly straightforward: a number of nested topic
elements make up the structure. Each element has a descriptive title attribute,
a level attribute (which matches the nesting level), a page attribute containing
the page number, and a target attribute which is used for internal processing
purposes (this example shows [GUID] as the actual value is not relevant).
Please note: All page numbers in the TOC reflect the physical page number of
that page in the generated PDF, including the addition of the TOC page itself. If
the source document(s) already display page numbers, then these may no
longer be the same as the page number listed in the TOC or their actual page
number in the generated PDF. If you wish to change the page numbers
displayed in the footer of a document then please use our watermarking
facilities (see chapter 3.5).
The list of topic elements is followed by a properties section. This section, and
its contents, consists of a number of optional values that may have been
passed into the request. This allows, for example, the addition of information to
the TOC to display the document's status, author, title or any other kind of
information. In this example we are passing in the title of the document.
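To make the structure concrete, the following sketch loads a shortened copy of the XML above and prints each topic indented by its level, followed by the properties. It uses the standard System.Xml.Linq classes. The TocXmlSketch class name is hypothetical, and the XML literal is an abbreviated, hand-written copy of the sample rather than real service output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for exploring the TOC XML; not part of the Muhimbi API.
public static class TocXmlSketch
{
    // Abbreviated, hand-written copy of the sample XML from this section.
    public const string SampleXml = @"<toc>
  <topics>
    <topic title=""1 Introduction"" target=""[GUID]"" level=""0"" page=""8"">
      <topic title=""1.1 Prerequisites"" target=""[GUID]"" level=""1"" page=""10"" />
    </topic>
  </topics>
  <properties>
    <property name=""title"">Some Document Title</property>
  </properties>
</toc>";

    // Returns one line per topic, indented two spaces per nesting level.
    public static List<string> Describe(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // Descendants() walks nested topic elements in document order.
        return doc.Descendants("topic")
            .Select(t => new string(' ', 2 * (int)t.Attribute("level"))
                         + (string)t.Attribute("title")
                         + " (page " + (string)t.Attribute("page") + ")")
            .ToList();
    }

    public static void Main()
    {
        foreach (string line in Describe(SampleXml))
            Console.WriteLine(line);

        // Properties are simple name/value pairs passed in with the request.
        foreach (XElement p in XDocument.Parse(SampleXml).Descendants("property"))
            Console.WriteLine("{0} = {1}", (string)p.Attribute("name"), p.Value);
    }
}
```

Because the level attribute matches the nesting depth, the indentation could equally be derived from the element hierarchy. Reading the attribute simply keeps the sketch short.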
8.3 XSL Transformation
Although the XML document’s content may differ between requests, the
structure is always the same. As a result we can use the XSL industry standard
to convert the XML into an attractive looking HTML document. Although XSL
may look daunting to the uninitiated, the following sample (download) is a good
starting point and can be amended to suit your particular needs (or used as is).
1   <?xml version="1.0" encoding="utf-8"?>
2   <xsl:stylesheet version="1.0"
3       xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
4       xmlns:msxsl="urn:schemas-microsoft-com:xslt"
5       exclude-result-prefixes="msxsl">
6
7     <xsl:output method="html" indent="yes"/>
8
9     <xsl:template match="/toc">
10      <html>
11        <head>
12          <style type="text/css">
13            ul.toc
14            {
15              margin: 0;
16              padding: 0;
17              list-style: none;
18            }
19            ol.toc
20            {
21              margin: 0;
22              padding: 0;
23              margin-left: 10px;
24              list-style: none;
25            }
26            ul.toc li
27            {
28              clear: both;
29              overflow: hidden;
30            }
31            ol.toc li
32            {
33              overflow: hidden;
34            }
35            span.title
36            {
37              float: left;
38              padding-right: 4px;
39            }
40            span.page
41            {
42              float: right;
43              padding-left: 4px;
44            }
45            span.dots
46            {
47              font-size: 0px;
48              width:100%;
49              border-bottom: 2px dotted black;
50            }
51            a.toc
52            {
53              text-decoration: none;
54              color: #000;
55            }
56          </style>
57        </head>
58        <body>
59          <h1>
60            <xsl:value-of select="properties/property[@name='title']"/>
61          </h1>
62          <br/>
63          <br/>
64          <xsl:apply-templates/>
65        </body>
66      </html>
67    </xsl:template>
68
69    <xsl:template match="topics">
70      <ul class="toc">
71        <xsl:apply-templates/>
72      </ul>
73    </xsl:template>
74
75    <!-- Empty template so properties are not appearing -->
76    <xsl:template match="properties"></xsl:template>
77
78    <xsl:template match="topic[@level='0']">
79      <li>
80        <xsl:element name="a">
81          <xsl:attribute name="href">
82            <xsl:value-of select="@target"/>
83          </xsl:attribute>
84          <xsl:attribute name="class">toc</xsl:attribute>
85          <span class="title" style="font-weight: 900;">
86            <xsl:value-of select="@title"/>
87          </span>
88          <span class="page">
89            <xsl:value-of select="@page"/>
90          </span>
91          <span class="dots"></span>
92        </xsl:element>
93      </li>
94      <ol class="toc">
95        <xsl:apply-templates/>
96      </ol>
97    </xsl:template>
98
99    <xsl:template match="topic">
100     <li>
101       <xsl:element name="a">
102         <xsl:attribute name="href">
103           <xsl:value-of select="@target"/>
104         </xsl:attribute>
105         <xsl:attribute name="class">toc</xsl:attribute>
106         <span class="title">
107           <xsl:value-of select="@title"/>
108         </span>
109         <span class="page">
110           <xsl:value-of select="@page"/>
111         </span>
112         <span class="dots"></span>
113       </xsl:element>
114     </li>
115     <ol class="toc">
116       <xsl:apply-templates/>
117     </ol>
118   </xsl:template>
119 </xsl:stylesheet>
Although this is a standard XSL file, the following sections are of particular
interest:
Lines 12-56: Standard CSS style sheet that controls the look of the
generated HTML.
Line 60: Inserts a custom property passed into the conversion request, in
our example the document’s title.
Line 76: An empty template for the properties element, which prevents this
information from being displayed as a plain list.
Lines 78-97: XSL template for generating the HTML associated with all Level 0
topics. If you wish to control the generated HTML for a specific level then
copy the topic[@level='0'] template and change the level number to match
the appropriate nesting level.
Lines 99-118: XSL template for all topic levels that do not have an explicit
template defined.
If your experience with XML and XSL is limited then we recommend using the
XSL sample provided above. As can be seen below, the results look very good.
8.4 Testing & Troubleshooting
The PDF Converter comes with a basic but handy Diagnostics Tool (including
full source code) that can be used to test the Table Of Contents facility.
Although it is a test tool rather than the official user interface for the TOC
facility, it is a quick way to try out various XSL template designs before
integrating them into your solution.
To test the XSL and TOC output, enable the Table of Contents as per the
screenshot above, modify the XSL template if needed, specify any optional
properties, select a file or folder in the WS Convert tab and choose either the
Convert or Merge button.
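Outside the Diagnostics Tool, a template can also be checked locally before it is sent to the conversion service. The sketch below applies a stylesheet to a hand-written copy of the TOC XML using the .NET XslCompiledTransform class and returns the resulting HTML. It is an assumption that the service's own transformation behaves the same way as XslCompiledTransform. The XslPreviewSketch class, the cut-down stylesheet and the sample XML are illustrative only.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical local preview helper; not part of the Muhimbi API.
public static class XslPreviewSketch
{
    // Hand-written stand-in for the XML the service generates internally.
    public const string SampleXml = @"<toc>
  <topics>
    <topic title=""1 Introduction"" target=""[GUID]"" level=""0"" page=""8"">
      <topic title=""1.1 Prerequisites"" target=""[GUID]"" level=""1"" page=""10"" />
    </topic>
  </topics>
  <properties>
    <property name=""title"">Some Document Title</property>
  </properties>
</toc>";

    // Cut-down template for illustration; substitute the XSL under test.
    public const string SampleXsl = @"<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""html""/>
  <xsl:template match=""/toc"">
    <html><body>
      <h1><xsl:value-of select=""properties/property[@name='title']""/></h1>
      <ul><xsl:apply-templates select=""topics/topic""/></ul>
    </body></html>
  </xsl:template>
  <xsl:template match=""topic"">
    <li><xsl:value-of select=""@title""/> - <xsl:value-of select=""@page""/></li>
    <xsl:apply-templates select=""topic""/>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the generated HTML.
    public static string Transform(string xsl, string xml)
    {
        var transform = new XslCompiledTransform();
        using (XmlReader xslReader = XmlReader.Create(new StringReader(xsl)))
        {
            transform.Load(xslReader);
        }

        using (XmlReader xmlReader = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            // Writing to a TextWriter lets the stylesheet's xsl:output
            // settings (method="html") control how the result is serialised.
            transform.Transform(xmlReader, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Transform(SampleXsl, SampleXml));
    }
}
```

The HTML written to the console can be saved to a file and opened in a browser for a rough impression of the layout. The final appearance in the PDF should still be verified with a real conversion.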
9 Troubleshooting
Although the MDCS is a robust and efficient solution, some questions may
arise during the day to day operation of the software. This section provides
some pointers to answer common questions.
If you still have questions after reading this chapter then please check out the
links in chapter 1 Introduction as well as our comprehensive Knowledge Base.
9.1 Problems parsing the WSDL
By default the Conversion Service uses the host name of the local system as
the base address. Most web service client libraries deal with this correctly.
However, if the service is exposed using a different machine name then you
may need to update the base address to the system’s IP address.
To change this, modify the baseAddress attribute in the configuration
file and restart the service. For details see this Knowledge Base article.
9.2 Converting documents takes a long time
In general the PDF Converter performs extremely well. However, depending on
the size and complexity of the documents that are being converted, the
conversion process may take some time to execute.
If conversion requests time out then please have a look at the Administration
Guide, section 2.4.4.
9.3 The PDF file does not look the same as the source file
Although the MDCS converts documents with very high fidelity and reliability,
there are some situations that may cause the converted documents to look
different from the source files. The main reasons for this are as follows:
1. One or more fonts used by the document are not installed on the Document
Conversion Server. Ask your Administrator to install the correct fonts.
2. The spacing of the characters in InfoPath documents doesn’t look correct.
Unfortunately InfoPath 2007 does not deal well with certain fonts, even
when these fonts have been installed on the server. Try using a different
font, creating a separate InfoPath Print View or switching to InfoPath 2010.
9.4 An evaluation message is displayed in each converted document
When an evaluation message is displayed in each converted document then
something may be wrong with your license or your license has not been
installed. Please see section 2.3 of the Administration Guide for more details
about installing the license.
9.5 InfoPath Forms fail to convert
When InfoPath documents fail to convert then please consult Appendix - Using
InfoPath with External Data Sources in the Administration Guide or visit
http://support.muhimbi.com/entries/21278696-my-infopath-form-fails-to-convert-how-can-i-troubleshoot-this
9.6 Converting non supported files
The PDF Converter supports a large number of source file formats. Support for
additional formats can be added by following the instructions in the
Administration Guide under Appendix - Creating Custom Converters.