Major Changes in XML 1.1

XML Tutorials - Herong's Tutorial Examples

This section describes the major changes introduced in XML 1.1 compared with XML 1.0.

What are the major changes in XML 1.1? According to the Extensible Markup Language (XML) 1.1 specification (Second Edition, published in August 2006 and edited in place in September 2006), here is my understanding of the major changes introduced in XML 1.1 over XML 1.0:

1. Definition of Name Characters - This change is described in the XML 1.1 specification as: "The overall philosophy of names has changed since XML 1.0. Whereas XML 1.0 provided a rigid definition of names, wherein everything that was not permitted was forbidden, XML 1.1 names are designed so that everything that is not forbidden (for a specific reason) is permitted. Since Unicode will continue to grow past version 4.0, further changes to XML can be avoided by allowing almost any character, including those not yet assigned, in names."

It sounds to me that XML 1.0 defines a fixed set of valid characters for creating names, and XML 1.1 defines an open set of valid characters. I need to find an example of an element name with a character that is valid in XML 1.1 but not valid in XML 1.0.
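One way to look for such an example is to test candidate names against the XML 1.1 name rules directly. The sketch below is my own transcription of the NameStartChar and NameChar ranges from the XML 1.1 specification into Python; the function name is_xml11_name and the table names are hypothetical, and it only checks the 1.1 rules, not the XML 1.0 ones.

```python
# Code point ranges transcribed from the XML 1.1 NameStartChar production
# (my reading of the specification; treat as an assumption).
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]

# Extra code points that NameChar allows after the first character.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),        # "-" and "."
    (0x30, 0x39),        # 0-9
    (0xB7, 0xB7),        # middle dot
    (0x300, 0x36F),      # combining diacritical marks
    (0x203F, 0x2040),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside any (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def is_xml11_name(name):
    """Check a string against the XML 1.1 Name production (sketch only)."""
    if not name:
        return False
    if not _in_ranges(ord(name[0]), NAME_START_RANGES):
        return False
    return all(
        _in_ranges(ord(ch), NAME_START_RANGES)
        or _in_ranges(ord(ch), NAME_EXTRA_RANGES)
        for ch in name[1:]
    )


# Candidate: a name made of Ethiopic letters (U+1200, U+1201).
print(is_xml11_name("\u1200\u1201"))
```

The Ethiopic block entered Unicode in version 3.0, after the Unicode 2.0 tables that the original XML 1.0 name rules were based on, so I would expect a name like this to be rejected by XML 1.0 parsers that follow editions before the Fifth. I have not confirmed that against an actual parser.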

2. End-of-Line Handling - The end-of-line normalization rule has been expanded to include the Next Line (NEL) character, #x85, used on IBM mainframe computers, and the Unicode line separator character, #x2028. The end-of-line normalization can be summarized as:

#x0D#x0A   --> #x0A
#x0D#x85   --> #x0A
#x85       --> #x0A
#x2028     --> #x0A
#x0D (not followed by #x0A or #x85)   --> #x0A
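The mapping above can be sketched as a single regular-expression substitution. This is only an illustration of the rule, not how any particular parser implements it; the function name normalize_eol is hypothetical.

```python
import re

# Two-character sequences are listed first so that "\r\n" and "\r\x85"
# are each replaced by a single "\n" rather than two.
_EOL_PATTERN = re.compile("\r\n|\r\x85|\x85|\u2028|\r")


def normalize_eol(text):
    """Apply the XML 1.1 end-of-line normalization summarized above."""
    return _EOL_PATTERN.sub("\n", text)


sample = "a\r\nb\r\x85c\x85d\u2028e\rf"
print(repr(normalize_eol(sample)))  # 'a\nb\nc\nd\ne\nf'
```

Ordering the alternatives from longest to shortest matters here: a lone #x0D is only matched when it was not already consumed as part of a two-character sequence.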

3. Control Characters - Control characters from #x01 to #x1F are now allowed in XML 1.1, but only as character references (#x09, #x0A, and #x0D can still appear directly, as in XML 1.0). Control characters from #x7F to #x9F, other than #x85, are no longer allowed directly in XML 1.1, but are still allowed as character references.
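To make the control character rules concrete, here is a small sketch that classifies a code point under XML 1.1. The ranges are my transcription of the Char and RestrictedChar productions from the specification, and the function name xml11_char_status and its three return labels are hypothetical.

```python
# RestrictedChar ranges as I read them from the XML 1.1 specification:
# these may appear only as character references such as "&#x1;".
RESTRICTED_RANGES = [
    (0x01, 0x08), (0x0B, 0x0C), (0x0E, 0x1F),
    (0x7F, 0x84), (0x86, 0x9F),
]

# Char ranges: everything a document may contain, directly or by reference.
CHAR_RANGES = [
    (0x01, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]


def _in_ranges(code_point, ranges):
    """Return True if code_point falls inside any (low, high) range."""
    return any(low <= code_point <= high for low, high in ranges)


def xml11_char_status(code_point):
    """Return 'direct', 'reference-only', or 'forbidden' for a code point."""
    if _in_ranges(code_point, RESTRICTED_RANGES):
        return "reference-only"
    if _in_ranges(code_point, CHAR_RANGES):
        return "direct"
    return "forbidden"


for cp in (0x00, 0x01, 0x09, 0x7F, 0x85, 0x9F):
    print(f"#x{cp:02X}: {xml11_char_status(cp)}")
```

Note that #x00 stays forbidden in every form, and #x85 is not restricted because it is treated as a line end.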

4. Full Normalization - XML document creators should adhere to the full normalization constraints defined in XML 1.1, and document processors should verify them. Using fully normalized documents ensures that identity comparisons of names, attribute values, and character content can be made correctly by simple binary comparison of Unicode strings.
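The reason binary comparison needs normalized text can be seen with a short sketch using Python's standard unicodedata module. It uses Unicode Normalization Form C (NFC) as a stand-in; as I understand it, full normalization in XML 1.1 adds further constraints on top of NFC, so this illustrates the idea rather than performing a full-normalization check. The helper name same_after_nfc is hypothetical.

```python
import unicodedata

# Two spellings of the same visible text "café":
composed = "caf\u00e9"     # precomposed e-acute (U+00E9)
decomposed = "cafe\u0301"  # "e" followed by combining acute accent (U+0301)


def same_after_nfc(a, b):
    """Compare two strings after converting both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Without normalization, a simple binary comparison treats them as different.
print(composed == decomposed)                # False
# Once both sides are normalized, the comparison succeeds.
print(same_after_nfc(composed, decomposed))  # True
```

If every document were already normalized when created, the first comparison alone would be enough, which is the benefit this point describes.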