1. The document type must be defined using one of the implementation methods defined by
the W3C. Currently this is limited to XML DTDs, but XML Schema will be available soon.
The rest of this section refers to "DTDs" although other implementations are possible.
2. The DTD which defines the document type must have a unique identifier as defined in
Naming Rules [p.18] that uses the string "XHTML", but not in the first token of the public text
description.
3. The DTD which defines the document type must include, at a minimum, the Hypertext, Text,
and List modules defined in this specification.
4. For each of the W3C-defined modules that are included, all of the elements, attributes,
types of attributes (including any required enumerated lists), and any required minimal
content models must be included (and optionally extended) in the document type's content
model. When content models are extended, all of the elements and attributes (along with
their types or any required enumerated value lists) required in the original content model
must continue to be required.
5. The DTD which defines the document type may define additional elements and attributes.
However, these must be in their own XML namespace [XMLNAMES] [p.170] .
3.3. XHTML Family Module Conformance
This specification defines a method for defining XHTML-conforming modules. A module
conforms to this specification when it meets all of the following criteria:
1. The module must be defined using one of the implementation methods defined by
the W3C. Currently this is limited to XML DTDs, but XML Schema will be available soon.
The rest of this section refers to "DTDs" although other implementations are possible.
2. The DTD which defines the module must have a unique identifier as defined in Naming
Rules [p.18] .
3. When the module is defined using an XML DTD, the module must insulate its parameter
entity names through the use of unique prefixes or other, similar methods.
4. The module definition must have a prose definition that describes the syntactic and
semantic requirements of the elements, attributes, and/or content models that it declares.
5. The module definition must not reuse any element names that are defined in other
W3C-defined modules, except when the content model and semantics of those elements
are either identical to the original or an extension of the original, or when the reused
element names are within their own namespace (see below).
6. The module definition's elements and attributes must be part of an XML namespace
[XMLNAMES] [p.170] . If the module is defined by an organization other than the W3C, this
namespace must NOT be the same as the namespace in which other W3C modules are
defined.
3.4. XHTML Family Document Conformance
A conforming XHTML family document is a valid instance of an XHTML Host Language
Conforming Document Type.
Modularization of XHTML
3.5. XHTML Family User Agent Conformance
A conforming user agent must meet all of the following criteria (as defined in [XHTML1] [p.170] ):
1. In order to be consistent with the XML 1.0 Recommendation [XML] [p.170] , the user agent
must parse and evaluate an XHTML document for well-formedness. If the user agent claims
to be a validating user agent, it must also validate documents against their referenced DTDs
according to [XML] [p.170] .
2. When the user agent claims to support facilities defined within this specification or required
by this specification through normative reference, it must do so in ways consistent with the
facilities' definition.
3. When a user agent processes an XHTML document as generic [XML] [p.170] , it shall only
recognize attributes of type ID (e.g., the id attribute on most XHTML elements) as
fragment identifiers.
4. If a user agent encounters an element it does not recognize, it must continue to process the
children of that element. If the content is text, the text must be presented to the user.
5. If a user agent encounters an attribute it does not recognize, it must ignore the entire
attribute specification (i.e., the attribute and its value).
6. If a user agent encounters an attribute value it doesn't recognize, it must use the default
attribute value.
7. If a user agent encounters an entity reference (other than one of the predefined entities) for
which it has processed no declaration (which could happen if the declaration is in the
external subset which the user agent hasn't read), the entity reference should be rendered
as the characters (starting with the ampersand and ending with the semi-colon) that make
up the entity reference.
8. When rendering content, user agents that encounter characters or character entity
references that are recognized but not renderable should display the document in such a
way that it is obvious to the user that normal rendering has not taken place.
9. White space is handled according to the following rules. The following characters are
defined in [XML] [p.170] as white space characters:
SPACE (&#x0020;)
HORIZONTAL TABULATION (&#x0009;)
CARRIAGE RETURN (&#x000D;)
LINE FEED (&#x000A;)
The XML processor normalizes different systems' line end codes into one single LINE
FEED character, that is passed up to the application.
The user agent must process white space characters in the data received from the XML
processor as follows:
All white space surrounding block elements should be removed.
Comments are removed entirely and do not affect white space handling. One white
space character on either side of a comment is treated as two white space characters.
When the 'xml:space' attribute is set to 'preserve', white space characters must be
preserved and consequently LINE FEED characters within a block must not be
converted.
When the 'xml:space' attribute is not set to 'preserve', then:
Leading and trailing white space inside a block element must be removed.
LINE FEED characters must be converted into one of the following characters: a
SPACE character, a ZERO WIDTH SPACE character (&#x200B;), or no character
(i.e. removed). The choice of the resulting character is user agent dependent and
is conditioned by the script property of the characters preceding and following the
LINE FEED character.
A sequence of white space characters without any LINE FEED characters must be
reduced to a single SPACE character.
A sequence of white space characters with one or more LINE FEED characters
must be reduced in the same way as a single LINE FEED character.
White space in attribute values is processed according to [XML] [p.170] .
Note (informative): In determining how to convert a LINE FEED character a user agent
should consider the following cases, whereby the script of characters on either side of the
LINE FEED determines the choice of the replacement. Characters of COMMON script (such
as punctuation) are treated as the same as the script on the other side:
1. If the characters preceding and following the LINE FEED character belong to a script in
which the SPACE character is used as a word separator, the LINE FEED character
should be converted into a SPACE character. Examples of such scripts are Latin,
Greek, and Cyrillic.
2. If the characters preceding and following the LINE FEED character belong to an
ideographic-based script or writing system in which there is no word separator, the
LINE FEED should be converted into no character. Examples of such scripts or writing
systems are Chinese and Japanese.
3. If the characters preceding and following the LINE FEED character belong to a non
ideographic-based script in which there is no word separator, the LINE FEED should be
converted into a ZERO WIDTH SPACE character (&#x200B;) or no character.
Examples of such scripts are Thai and Khmer.
4. If none of the conditions in (1) through (3) are true, the LINE FEED character should be
converted into a SPACE character.
The Unicode [UNICODE] [p.170] technical report TR#24 (Script Names) provides an
assignment of script names to all characters.
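
The white space rules above (for content where 'xml:space' is not set to 'preserve') can be
sketched roughly as follows. This is an illustrative sketch only, not a conforming
implementation: the function names are hypothetical, and the script classification uses a few
hard-coded code point ranges as a stand-in for the Unicode script property, which the Python
standard library does not expose.

```python
import re

ZWSP = "\u200b"                      # ZERO WIDTH SPACE
WS_RUN = re.compile(r"[ \t\r\n]+")   # the four XML white space characters


def script_class(ch):
    """Coarse script classification by code point range.

    Assumption for illustration: a real user agent would consult the
    Unicode script property instead of these hand-picked ranges.
    """
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF or 0x3040 <= cp <= 0x30FF:
        return "ideographic"         # CJK ideographs, Hiragana, Katakana
    if 0x0E00 <= cp <= 0x0E7F or 0x1780 <= cp <= 0x17FF:
        return "no_separator"        # Thai, Khmer
    if ch.isalpha():
        return "other"               # Latin, Greek, Cyrillic, and the rest
    return "common"                  # punctuation, digits, symbols


def replace_line_feed(before, after):
    """Pick the replacement for a LINE FEED from its neighbouring characters."""
    a, b = script_class(before), script_class(after)
    # COMMON characters take on the script of the other side.
    if a == "common":
        a = b
    if b == "common":
        b = a
    if a == b == "ideographic":
        return ""                    # case 2: no character
    if a == b == "no_separator":
        return ZWSP                  # case 3: zero width space
    return " "                       # cases 1 and 4: SPACE


def normalize_block(text):
    """Normalize white space inside one block element."""
    text = text.strip(" \t\r\n")     # drop leading and trailing white space

    def repl(match):
        run = match.group(0)
        if "\n" not in run:
            return " "               # run without LINE FEED: single SPACE
        # Run with LINE FEED: treated like a single LINE FEED.
        return replace_line_feed(text[match.start() - 1], text[match.end()])

    return WS_RUN.sub(repl, text)
```

Because the text is stripped before substitution, every remaining white space run has a
character on both sides, so the neighbour lookups are always in range. The choice of ZERO
WIDTH SPACE over "no character" in case 3 is one of the two options the note permits.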
3.6. Naming Rules
XHTML Host Language document types must adhere to strict naming conventions so that it is
possible for software and users to readily determine the relationship of document types to
XHTML. The names for document types implemented as XML document type definitions are
defined through Formal Public Identifiers (FPIs). Within FPIs, fields are separated by double
slash character sequences (//). The various fields must be composed as follows:
1. The leading field must be "-" to indicate a privately defined resource.
2. The second field must contain the name of the organization responsible for maintaining the
named item. There is no formal registry for these organization names. Each organization
should define a name that is unique. The name used by the W3C is, for example, W3C.
3. The third field contains two constructs: the public text class followed by the public text
description. The first token in the third field is the public text class which should adhere to
ISO 8879 Clause 10.2.2.1 Public Text Class. Only XHTML Host Language conforming
documents should begin the public text description with the token XHTML. The public text
description should contain the string XHTML if the document type is Integration Set
conforming. The field must also contain an organization-defined unique identifier (e.g.,
MyML 1.0). This identifier should be composed of a unique name and a version identifier
that can be updated as the document type evolves.
4. The fourth field defines the language in which the item is developed (e.g., EN).
Using these rules, the name for an XHTML Host Language conforming document type might be
-//MyCompany//DTD XHTML MyML 1.0//EN. The name for an XHTML family conforming
module might be -//MyCompany//ELEMENTS XHTML MyElements 1.0//EN. The name for
an XHTML Integration Set conforming document type might be -//MyCompany//DTD
Special Markup with XHTML//EN.
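
As a rough illustration of the field layout described above, the following sketch splits a
Formal Public Identifier into its fields and reports where the token XHTML appears in the
public text description. The names FPI, parse_fpi, and xhtml_position are hypothetical
helpers introduced here, and the checks cover only the rules stated in this section, not
the full ISO 8879 syntax.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    leading: str       # "-" for a privately defined resource
    owner: str         # organization responsible for the item
    text_class: str    # public text class, e.g. DTD or ELEMENTS
    description: str   # public text description
    language: str      # language of the item, e.g. EN


def parse_fpi(fpi: str) -> FPI:
    """Split an FPI on '//' and separate the class from the description."""
    fields = fpi.split("//")
    if len(fields) != 4:
        raise ValueError("expected four fields separated by '//'")
    leading, owner, text, language = fields
    if leading != "-":
        raise ValueError("leading field must be '-'")
    # The first token of the third field is the public text class.
    text_class, _, description = text.partition(" ")
    return FPI(leading, owner, text_class, description, language)


def xhtml_position(fpi: str) -> str:
    """Report whether XHTML is the first token, a later token, or absent."""
    tokens = parse_fpi(fpi).description.split()
    if tokens and tokens[0] == "XHTML":
        return "first-token"
    if "XHTML" in tokens:
        return "later-token"
    return "absent"
```

With the example names from this section, -//MyCompany//DTD XHTML MyML 1.0//EN yields
"first-token" (the Host Language pattern), while -//MyCompany//DTD Special Markup with
XHTML//EN yields "later-token" (the Integration Set pattern).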
3.7. XHTML Module Evolution
Each module defined in this specification is given a unique identifier that adheres to the naming
rules in the previous section. Over time, a module may evolve. A logical ramification of such
evolution may be that some aspects of the module are no longer compatible with its previous
definition. To help ensure that document types defined against modules defined in this
specification continue to operate, the identifiers associated with a module that changes will be
updated. Specifically, the Formal Public Identifier and System Identifier of the module will be
changed by modifying the version identifier included in each. Document types that wish to
incorporate the updated functionality will need to be similarly updated.
In addition, the earlier version(s) of the module will continue to be available via its earlier, unique
identifier(s). In this way, document types developed using XHTML modules will continue to
function seamlessly using their original definitions even as the collection expands and evolves.
Similarly, document instances written against such document types will continue to validate
using the earlier module definitions.
Other XHTML Family Module and Document Type authors are encouraged to adopt a similar
strategy to ensure the continued functioning of document types based upon those modules and
document instances based upon those document types.