PDF 32000-1:2008
© Adobe Systems Incorporated 2008 – All rights reserved
This filter shall only be applied to image XObjects, and not to inline images (see 8.9, "Images"). It is suitable
both for images that have a single colour component and for those that have multiple colour components. The
colour components in an image may have different numbers of bits per sample. Any value from 1 to 38 shall be
allowed.
NOTE 2
From a single JPEG2000 data stream, multiple versions of an image may be decoded. These different
versions form progressions along four degrees of freedom: sampling resolution, colour depth, band, and
location. For example, with a resolution progression, a thumbnail version of the image may be decoded from
the data, followed by a sequence of other versions of the image, each with approximately four times as many
samples (twice the width times twice the height) as the previous one. The last version is the full-resolution
image.
NOTE 3
Viewing and printing applications may gain performance benefits by using the resolution progression. If the
full-resolution image is densely sampled, the application may be able to select and decode only the data
making up a lower-resolution version, thereby spending less time decoding. Fewer bytes need be processed, a
particular benefit when viewing files over the Web. The tiling structure of the image may also provide benefits if
only certain areas of an image need to be displayed or printed.
NOTE 4
Information on these progressions is encoded in the data; no decode parameters are needed to describe them.
The decoder deals with any progressions it encounters to deliver the correct image data. Progressions that are
of no interest may simply have performance consequences.
The JPEG2000 specifications define two widely used formats, JP2 and JPX, for packaging the compressed
image data. JP2 is a subset of JPX. These packagings contain all the information needed to properly interpret
the image data, including the colour space, bits per component, and image dimensions. In other words, they
are complete descriptions of images (as opposed to image data that require outside parameters for correct
interpretation). The JPXDecode filter shall expect to read a full JPX file structure—either internal to the PDF file
or as an external file.
NOTE 5
To promote interoperability, the specifications define a subset of JPX called JPX baseline (of which JP2 is also
a subset). The complete details of the baseline set of JPX features are contained in ISO/IEC 15444-2,
Information Technology—JPEG 2000 Image Coding System: Extensions (see the Bibliography). See also
<http://www.jpeg.org/jpeg2000/>.
Data used in PDF image XObjects shall be limited to the JPX baseline set of features, except for enumerated
colour space 19 (CIEJab). In addition, enumerated colour space 12 (CMYK), which is part of JPX but not JPX
baseline, shall be supported in a PDF.
A JPX file describes a collection of channels that are present in the image data. A channel may have one of
three types:
• An ordinary channel contains values that, when decoded, shall become samples for a specified colour
component.
• An opacity channel provides samples that shall be interpreted as raw opacity information.
• A premultiplied opacity channel shall provide samples that have been multiplied into the colour samples of
those channels with which it is associated.
Opacity and premultiplied opacity channels shall be associated with specific colour channels. There shall not
be more than one opacity channel (of either type) associated with a given colour channel.
EXAMPLE
It is possible for one opacity channel to apply to the red samples and another to apply to the green and
blue colour channels of an RGB image.
NOTE 6
The method by which the opacity information is to be used is explicitly not specified, although one possible
method shows a normal blending mode.
In addition to using opacity channels for describing transparency, JPX files also have the ability to specify
chroma-key transparency. A single colour may be specified by giving an array of values, one value for each
colour channel. Any image location that matches this colour shall be considered to be completely transparent.
Images in JPX files may have one of the following colour spaces:
• A predefined colour space, chosen from a list of enumerated colour spaces. (Two of these are actually
families of spaces and parameters are included.)
• A restricted ICC profile. These are the only sorts of ICC profiles that are allowed in JP2 files.
• An input ICC profile of any sort defined by ICC-1.
• A vendor-defined colour space.
More than one colour space may be specified for an image, with each space being tagged with a precedence
and an approximation value that indicates how well it represents the preferred colour space. In addition, the
image’s colour space may serve as the foundation for a palette of colours that are selected using samples
coming from the image’s data channels: the equivalent of an Indexed colour space in PDF.
There are other features in the JPX format beyond describing a simple image. These include provisions for
describing layering and giving instructions on composition, specifying simple animation, and including generic
XML metadata (along with JPEG2000-specific schemas for such data). Relevant metadata should be
replicated in the image dictionary’s Metadata stream in XMP format (see 14.3.2, "Metadata Streams").
When using the JPXDecode filter with image XObjects, the following changes to and constraints on some
entries in the image dictionary shall apply (see 8.9.5, "Image Dictionaries" for details on these entries):
• Width and Height shall match the corresponding width and height values in the JPEG2000 data.
• ColorSpace shall be optional since JPEG2000 data contain colour space specifications. If present, it shall
determine how the image samples are interpreted, and the colour space specifications in the JPEG2000
data shall be ignored. The number of colour channels in the JPEG2000 data shall match the number of
components in the colour space; a conforming writer shall ensure that the samples are consistent with the
colour space used.
• Any colour space other than Pattern may be specified. If an Indexed colour space is used, it shall be
subject to the PDF limit of 256 colours. If the colour space does not match one of JPX’s enumerated colour
spaces (for example, if it has two colour components or more than four), it should be specified as a vendor
colour space in the JPX data.
• If ColorSpace is not present in the image dictionary, the colour space information in the JPEG2000 data
shall be used. A JPEG2000 image within a PDF shall have one of: the baseline JPX colorspaces; or
enumerated colorspace 19 (CIEJab) or enumerated colorspace 12 (CMYK); or at least one ICC profile that
is valid within PDF. Conforming PDF readers shall support the JPX baseline set of enumerated colour
spaces; they shall also be responsible for dealing with the interaction between the colour spaces and the
bit depth of samples.
• If multiple colour space specifications are given in the JPEG2000 data, a conforming reader should
attempt to use the one with the highest precedence and best approximation value. If the colour space is
given by an unsupported ICC profile, the next lower colour space, in terms of precedence and
approximation value, shall be used. If no supported colour space is found, the colour space used shall be
DeviceGray, DeviceRGB, or DeviceCMYK, depending on whether the number of channels in the
JPEG2000 data is 1, 3, or 4.
• SMaskInData specifies whether soft-mask information packaged with the image samples shall be used
(see 11.6.5.3, "Soft-Mask Images"); if it is, the SMask entry shall not be present. If SMaskInData is
nonzero, there shall be only one opacity channel in the JPEG2000 data and it shall apply to all colour
channels.
• Decode shall be ignored, except in the case where the image is treated as a mask; that is, when
ImageMask is true. In this case, the JPEG2000 data shall provide a single colour channel with 1-bit
samples.
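The colour space selection rule in the list above can be sketched as follows. This is only an illustration: the candidate records, their field names (`name`, `precedence`, `approx_quality`, `supported`) and the convention that larger numbers rank higher are assumptions of this sketch, not the encoding used inside JPX data.

```python
from typing import Optional, Sequence

# Fallback colour spaces by channel count, as listed in the text above.
_FALLBACK = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}


def choose_colour_space(candidates: Sequence[dict], channels: int) -> Optional[str]:
    """Pick a colour space when the image dictionary has no ColorSpace entry.

    Each candidate is a hypothetical dict with the keys "name",
    "precedence", "approx_quality" (higher is assumed to mean better
    in this sketch) and "supported".
    """
    ranked = sorted(
        candidates,
        key=lambda c: (c["precedence"], c["approx_quality"]),
        reverse=True,
    )
    for candidate in ranked:
        if candidate["supported"]:
            return candidate["name"]
    # No supported specification: fall back on the channel count.
    # Returns None for channel counts the text does not cover.
    return _FALLBACK.get(channels)
```

A caller would build the candidate list from whatever its JPX parser reports and would have to handle the `None` case itself.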
7.4.10 Crypt Filter
The Crypt filter (PDF 1.5) allows the document-level security handler (see 7.6, "Encryption") to determine
which algorithms should be used to decrypt the input data. The Name parameter in the decode parameters
dictionary for this filter (see Table 14) shall specify which of the named crypt filters in the document (see 7.6.5,
"Crypt Filters") shall be used. The Crypt filter shall be the first filter in the Filter array entry.
In addition, the decode parameters dictionary may include entries that are private to the security handler.
Security handlers may use information from both the crypt filter decode parameters dictionary and the crypt
filter dictionaries (see Table 25) when decrypting data or providing a key to decrypt data.
NOTE
When adding private data to the decode parameters dictionary, security handlers should name these entries in
conformance with the PDF name registry (see Annex E).
If a stream specifies a crypt filter, then the security handler does not apply "Algorithm 1: Encryption of data
using the RC4 or AES algorithms" in 7.6.2, "General Encryption Algorithm," to the key prior to decrypting the
stream. Instead, the security handler shall decrypt the stream using the key as is. Sub-clause 7.4, "Filters,"
explains how a stream specifies filters.
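As a rough illustration of the rules in this sub-clause, the sketch below resolves which named crypt filter applies to a stream. It assumes the Filter entry and its decode parameters have already been parsed into plain Python lists and dictionaries; the function name and that data model are hypothetical. The Identity default is taken from Table 14.

```python
def crypt_filter_name(filters, decode_parms=None):
    """Return the crypt filter name for a stream, or None if it has no Crypt filter.

    `filters` is the stream's Filter entry as a list of name strings.
    `decode_parms` is the parallel list of decode parameter dictionaries
    (individual items, or the whole list, may be None).
    """
    if "Crypt" not in filters:
        return None
    if filters[0] != "Crypt":
        raise ValueError("Crypt shall be the first filter in the Filter array")
    parms = decode_parms[0] if decode_parms and decode_parms[0] else {}
    # The Name parameter is optional; its default value is Identity.
    return parms.get("Name", "Identity")
```

For example, `crypt_filter_name(["Crypt", "FlateDecode"], [{"Name": "MyFilter"}, None])` yields `"MyFilter"`, where `MyFilter` is a made-up filter name.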
7.5 File Structure
7.5.1 General
This sub-clause describes how objects are organized in a PDF file for efficient random access and incremental
update. A basic conforming PDF file shall be constructed of the following four elements (see Figure 2):
• A one-line header identifying the version of the PDF specification to which the file conforms
• A body containing the objects that make up the document contained in the file
• A cross-reference table containing information about the indirect objects in the file
• A trailer giving the location of the cross-reference table and of certain special objects within the body of the
file
This initial structure may be modified by later updates, which append additional elements to the end of the file;
see 7.5.6, "Incremental Updates," for details.
Table 14 – Optional parameters for Crypt filters
Key | Type | Value
Type | name | (Optional) If present, shall be CryptFilterDecodeParms for a Crypt filter decode parameter dictionary.
Name | name | (Optional) The name of the crypt filter that shall be used to decrypt this stream. The name shall correspond to an entry in the CF entry of the encryption dictionary (see Table 20) or one of the standard crypt filters (see Table 26). Default value: Identity.
Figure 2 – Initial structure of a PDF file (from top to bottom: Header, Body, Cross-reference table, Trailer)
As a matter of convention, the tokens in a PDF file are arranged into lines; see 7.2, "Lexical Conventions."
Each line shall be terminated by an end-of-line (EOL) marker, which may be a CARRIAGE RETURN (0Dh), a
LINE FEED (0Ah), or both. PDF files with binary data may have arbitrarily long lines.
NOTE
To increase compatibility with compliant programs that process PDF files, lines that are not part of stream
object data are limited to no more than 255 characters, with one exception. Beginning with PDF 1.3, the
Contents string of a signature dictionary (see 12.8, "Digital Signatures") is not subject to the restriction on line
length.
The rules described here are sufficient to produce a basic conforming PDF file. However, additional rules apply
to organizing a PDF file to enable efficient incremental access to a document’s components in a network
environment. This form of organization, called Linearized PDF, is described in Annex F.
7.5.2 File Header
The first line of a PDF file shall be a header consisting of the 5 characters %PDF- followed by a version
number of the form 1.N, where N is a digit between 0 and 7.
A conforming reader shall accept files with any of the following headers:
%PDF-1.0
%PDF-1.1
%PDF-1.2
%PDF-1.3
%PDF-1.4
%PDF-1.5
%PDF-1.6
%PDF-1.7
Beginning with PDF 1.4, the Version entry in the document’s catalog dictionary (located via the Root entry in
the file’s trailer, as described in 7.5.5, "File Trailer"), if present, shall be used instead of the version specified in
the Header.
NOTE
This allows a conforming writer to update the version using an incremental update (see 7.5.6, "Incremental
Updates").
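The header and Version rules above might be checked along these lines. This is a minimal sketch: the function names are hypothetical, and the catalog's Version value is assumed to have been read already by a separate parser and passed in as a string such as "1.6".

```python
import re
from typing import Optional

# %PDF- followed by 1.N, where N is a digit between 0 and 7.
_HEADER = re.compile(rb"%PDF-1\.([0-7])")


def header_version(first_line: bytes) -> str:
    """Return the version named in the header line, for example "1.4"."""
    match = _HEADER.match(first_line)
    if match is None:
        raise ValueError("not a PDF 1.0 to 1.7 header")
    return "1." + match.group(1).decode("ascii")


def effective_version(first_line: bytes, catalog_version: Optional[str] = None) -> str:
    """Apply the rule stated above: a catalog Version entry, if present,
    is used instead of the version specified in the header."""
    header = header_version(first_line)
    return catalog_version if catalog_version is not None else header
```

The second function follows only the rule as worded in this sub-clause; a real reader would also apply whatever the description of the catalog's Version entry requires.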
Under some conditions, a conforming reader may be able to process PDF files conforming to a later version
than it was designed to accept. New PDF features are often introduced in such a way that they can safely be
ignored by a conforming reader that does not understand them (see I.2, "PDF Version Numbers").
This part of ISO 32000 defines the Extensions entry in the document’s catalog dictionary. If present, it shall
identify any developer-defined extensions that are contained in this PDF file. See 7.12, “Extensions Dictionary”.
If a PDF file contains binary data, as most do (see 7.2, "Lexical Conventions"), the header line shall be
immediately followed by a comment line containing at least four binary characters—that is, characters whose
codes are 128 or greater. This ensures proper behaviour of file transfer applications that inspect data near the
beginning of a file to determine whether to treat the file’s contents as text or as binary.
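A simple check for the binary comment line described above could look like the following. It is a sketch that assumes the start of the file is available as a bytes object; the function name is hypothetical.

```python
def has_binary_marker(data: bytes) -> bool:
    """Return True if the line after the header is a comment containing
    at least four bytes whose codes are 128 or greater."""
    lines = data.splitlines()  # splits on CR, LF, or CR LF
    if len(lines) < 2 or not lines[1].startswith(b"%"):
        return False
    return sum(1 for byte in lines[1] if byte >= 128) >= 4
```

Passing only the first few hundred bytes of the file is enough, since just the first two lines are inspected.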
7.5.3 File Body
The body of a PDF file shall consist of a sequence of indirect objects representing the contents of a document.
The objects, which are of the basic types described in 7.3, "Objects," represent components of the document
such as fonts, pages, and sampled images. Beginning with PDF 1.5, the body can also contain object streams,
each of which contains a sequence of indirect objects; see 7.5.7, "Object Streams."
7.5.4 Cross-Reference Table
The cross-reference table contains information that permits random access to indirect objects within the file so
that the entire file need not be read to locate any particular object. The table shall contain a one-line entry for
each indirect object, specifying the byte offset of that object within the body of the file. (Beginning with PDF 1.5,
some or all of the cross-reference information may alternatively be contained in cross-reference streams; see
7.5.8, "Cross-Reference Streams.")
NOTE 1
The cross-reference table is the only part of a PDF file with a fixed format, which permits entries in the table to
be accessed randomly.
The table comprises one or more cross-reference sections. Initially, the entire table consists of a single section
(or two sections if the file is linearized; see Annex F). One additional section shall be added each time the file is
incrementally updated (see 7.5.6, "Incremental Updates").
Each cross-reference section shall begin with a line containing the keyword xref. Following this line shall be
one or more cross-reference subsections, which may appear in any order. For a file that has never been
incrementally updated, the cross-reference section shall contain only one subsection, whose object numbering
begins at 0.
NOTE 2
The subsection structure is useful for incremental updates, since it allows a new cross-reference section to be
added to the PDF file, containing entries only for objects that have been added or deleted.
Each cross-reference subsection shall contain entries for a contiguous range of object numbers. The
subsection shall begin with a line containing two numbers separated by a SPACE (20h), denoting the object
number of the first object in this subsection and the number of entries in the subsection.
EXAMPLE 1
The following line introduces a subsection containing five objects numbered consecutively from 28 to 32.
28 5
A given object number shall not have an entry in more than one subsection within a single section.
Following this line are the cross-reference entries themselves, one per line. Each entry shall be exactly 20
bytes long, including the end-of-line marker. There are two kinds of cross-reference entries: one for objects that
are in use and another for objects that have been deleted and therefore are free. Both types of entries have
similar basic formats, distinguished by the keyword n (for an in-use entry) or f (for a free entry). The format of
an in-use entry shall be:
nnnnnnnnnn ggggg n eol
where:
nnnnnnnnnn shall be a 10-digit byte offset in the decoded stream
ggggg shall be a 5-digit generation number
n shall be a keyword identifying this as an in-use entry
eol shall be a 2-character end-of-line sequence
The byte offset in the decoded stream shall be a 10-digit number, padded with leading zeros if necessary,
giving the number of bytes from the beginning of the file to the beginning of the object. It shall be separated
from the generation number by a single SPACE. The generation number shall be a 5-digit number, also padded
with leading zeros if necessary. Following the generation number shall be a single SPACE, the keyword n, and
a 2-character end-of-line sequence consisting of one of the following: SP CR, SP LF, or CR LF. Thus, the
overall length of the entry shall always be exactly 20 bytes.
The cross-reference entry for a free object has essentially the same format, except that the keyword shall be f
instead of n and the interpretation of the first item is different:
nnnnnnnnnn ggggg f eol
where:
nnnnnnnnnn shall be the 10-digit object number of the next free object
ggggg shall be a 5-digit generation number
f shall be a keyword identifying this as a free entry
eol shall be a 2-character end-of-line sequence
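The fixed 20-byte layout described above lends itself to a strict parser. The following is one possible sketch; the `XrefEntry` type and the function name are illustrative, not part of the specification.

```python
from typing import NamedTuple


class XrefEntry(NamedTuple):
    value: int        # byte offset (in-use) or next free object number (free)
    generation: int
    in_use: bool      # True for keyword n, False for keyword f


# The three permitted 2-character end-of-line sequences: SP CR, SP LF, CR LF.
_EOLS = (b" \r", b" \n", b"\r\n")


def parse_xref_entry(raw: bytes) -> XrefEntry:
    """Parse one cross-reference entry of exactly 20 bytes."""
    if len(raw) != 20:
        raise ValueError("entry must be exactly 20 bytes")
    value, gen, keyword, eol = raw[0:10], raw[11:16], raw[17:18], raw[18:20]
    if raw[10:11] != b" " or raw[16:17] != b" " or eol not in _EOLS:
        raise ValueError("bad separator or end-of-line sequence")
    if not (value.isdigit() and gen.isdigit()) or keyword not in (b"n", b"f"):
        raise ValueError("malformed entry")
    return XrefEntry(int(value), int(gen), keyword == b"n")
```

Because every entry has the same length, entry k of a subsection can be read directly at offset 20 × k from the first entry of that subsection.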
There are two ways an entry may be a member of the free entries list. Using the basic mechanism, the free
entries in the cross-reference table may form a linked list, with each free entry containing the object number of
the next. The first entry in the table (object number 0) shall always be free and shall have a generation number
of 65,535; it shall be the head of the linked list of free objects. The last free entry (the tail of the linked list)
links back to object number 0. Using the second mechanism, the table may contain other free entries that link
back to object number 0 and have a generation number of 65,535, even though these entries are not in the
linked list itself.
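Traversing the linked list of free entries might be sketched as below. The table representation, a dict mapping each object number to a `(value, generation, in_use)` tuple, is an assumption of this example.

```python
def free_list(entries):
    """Follow the linked list of free entries, starting from object 0.

    Returns the free object numbers in list order, excluding the head
    (object 0). Raises ValueError if the chain loops or points at an
    entry that is missing or in use.
    """
    chain = []
    seen = {0}
    current = entries[0][0]  # object 0 is the head of the list
    while current != 0:      # the tail links back to object 0
        if current in seen or current not in entries or entries[current][2]:
            raise ValueError("corrupt free list")
        seen.add(current)
        chain.append(current)
        current = entries[current][0]
    return chain
```

Free entries of the second kind (those that link straight back to object 0 with generation 65,535 without being linked from anywhere) are never reached by this walk, which matches the statement that they are not in the linked list itself.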
Except for object number 0, all objects in the cross-reference table shall initially have generation numbers of 0.
When an indirect object is deleted, its cross-reference entry shall be marked free and it shall be added to the
linked list of free entries. The entry’s generation number shall be incremented by 1 to indicate the generation
number to be used the next time an object with that object number is created. Thus, each time the entry is
reused, it is given a new generation number. The maximum generation number is 65,535; when a cross-
reference entry reaches this value, it shall never be reused.
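The bookkeeping for deleting an object can be illustrated as follows, using a dict that maps object numbers to `(value, generation, in_use)` tuples. The text does not say where in the linked list a newly freed entry is placed; inserting it directly after the head (object 0) is a choice made for this sketch only, and the function names are hypothetical.

```python
MAX_GENERATION = 65535


def mark_free(entries, obj_num):
    """Mark an in-use object as free and increment its generation number."""
    _, generation, in_use = entries[obj_num]
    if obj_num == 0 or not in_use:
        raise ValueError("only an in-use object other than 0 can be deleted")
    head_next, head_generation, _ = entries[0]
    # Assumption of this sketch: the freed entry goes right after the head.
    entries[obj_num] = (head_next, min(generation + 1, MAX_GENERATION), False)
    entries[0] = (obj_num, head_generation, False)


def can_reuse(entries, obj_num):
    """A free entry may be reused unless it has reached generation 65,535."""
    _, generation, in_use = entries[obj_num]
    return not in_use and generation < MAX_GENERATION
```

After `mark_free(entries, 2)` on a table where object 2 had generation 0, the entry for object 2 is free with generation 1, which is the generation number the next object created with that number would receive.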
The cross-reference table (comprising the original cross-reference section and all update sections) shall
contain one entry for each object number from 0 to the maximum object number defined in the file, even if one
or more of the object numbers in this range do not actually occur in the file.
EXAMPLE 2
The following shows a cross-reference section consisting of a single subsection with six entries: four that
are in use (objects number 1, 2, 4, and 5) and two that are free (objects number 0 and 3). Object number
3 has been deleted, and the next object created with that object number is given a generation number of 7.
xref
0 6
0000000003 65535 f
0000000017 00000 n
0000000081 00000 n
0000000000 00007 f
0000000331 00000 n
0000000409 00000 n
EXAMPLE 3
The following shows a cross-reference section with four subsections, containing a total of five entries. The
first subsection contains one entry, for object number 0, which is free. The second subsection contains
one entry, for object number 3, which is in use. The third subsection contains two entries, for objects
number 23 and 24, both of which are in use. Object number 23 has been reused, as can be seen from the
fact that it has a generation number of 2. The fourth subsection contains one entry, for object number 30,
which is in use.
xref
0 1
0000000000 65535 f
3 1
0000025325 00000 n
23 2
0000025518 00002 n
0000025635 00000 n
30 1
0000025777 00000 n
See H.7, "Updating Example", for a more extensive example of the structure of a PDF file that has been
updated several times.
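The section and subsection structure shown in Examples 2 and 3 can be read with a short line-oriented parser. This sketch is deliberately lenient: it works on text lines rather than enforcing the 20-byte entry length, and it expects to be handed only the cross-reference section itself. The function name and the returned data shape are illustrative.

```python
def parse_xref_section(text: str) -> dict:
    """Parse one cross-reference section.

    Returns {object_number: (value, generation, keyword)}, where keyword
    is "n" or "f".
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or lines[0] != "xref":
        raise ValueError("section must begin with the keyword xref")
    table = {}
    i = 1
    while i < len(lines):
        # Subsection header: first object number and number of entries.
        first, count = (int(field) for field in lines[i].split())
        i += 1
        for obj_num in range(first, first + count):
            value, generation, keyword = lines[i].split()
            if obj_num in table:
                raise ValueError(f"object {obj_num} has more than one entry")
            table[obj_num] = (int(value), int(generation), keyword)
            i += 1
    return table
```

Applied to the section in Example 3, this produces entries for object numbers 0, 3, 23, 24 and 30, with object 23 carrying generation number 2.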
7.5.5 File Trailer
The trailer of a PDF file enables a conforming reader to quickly find the cross-reference table and certain
special objects. Conforming readers should read a PDF file from its end. The last line of the file shall contain
only the end-of-file marker, %%EOF. The two preceding lines shall contain, one per line and in order, the
keyword startxref and the byte offset in the decoded stream from the beginning of the file to the beginning of
the xref keyword in the last cross-reference section. The startxref line shall be preceded by the trailer
dictionary, consisting of the keyword trailer followed by a series of key-value pairs enclosed in double angle
brackets (<< … >>) (using LESS-THAN SIGNs (3Ch) and GREATER-THAN SIGNs (3Eh)). Thus, the trailer has
the following overall structure:
trailer
<< key1 value1
key2 value2
…
keyn valuen
>>
startxref
Byte_offset_of_last_cross-reference_section
%%EOF
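Reading a file from its end, as recommended above, usually starts by locating the startxref value. A minimal sketch follows; the function name and the 1024-byte search window are arbitrary choices for this example, not requirements of the text.

```python
import re

# startxref, the byte offset, then %%EOF as the last line of the file.
_TAIL = re.compile(rb"startxref\s+(\d+)\s+%%EOF\s*$")


def find_startxref(data: bytes, window: int = 1024) -> int:
    """Return the byte offset that follows the startxref keyword."""
    match = _TAIL.search(data[-window:])
    if match is None:
        raise ValueError("startxref and %%EOF not found at end of file")
    return int(match.group(1))
```

The returned offset is where a reader would expect to find the xref keyword of the last cross-reference section; the trailer dictionary itself sits just before the startxref line.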