"string" Datatype Values and Representations

This section describes 'string', the most commonly used built-in datatype. Whitespace characters are preserved in 'string' values, while XML entity references in 'string' lexical representations are resolved by the parser.

The "string" datatype and its derived datatypes are the most commonly used built-in datatypes in XML documents. Let's take a closer look at the "string" datatype first.

"string" is a primitive datatype whose value space is the set of all finite sequences of characters allowed in XML. A "string" value is expressed in an XML document by writing that sequence of characters as element content or as an attribute value.

XML entity references are allowed in "string" lexical representations, but the parser resolves them to obtain the final "string" values. CDATA sections are resolved in the same way. For example, the 3 XML elements below are all valid and represent the same "string" value:

    <String>PI &gt; 3.14159</String>
    <String>PI > 3.14159</String>
    <String><![CDATA[PI > 3.14159]]></String>
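
Here is a minimal Java sketch that checks this claim with the JAXP DOM parser bundled with the JDK. The class name StringEntityDemo and the helper valueOf() are hypothetical names introduced only for this illustration. The sketch parses each element on its own and compares the resulting character content; it does not involve any schema validation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original tutorial code.
class StringEntityDemo {

    // Parses a single-element XML document and returns the character
    // content of its root element, with entity references and CDATA
    // sections already resolved by the parser.
    static String valueOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String a = valueOf("<String>PI &gt; 3.14159</String>");
        String b = valueOf("<String>PI > 3.14159</String>");
        String c = valueOf("<String><![CDATA[PI > 3.14159]]></String>");

        // All three lexical representations should yield the same value.
        System.out.println(a.equals(b) && b.equals(c));
        System.out.println(a);
    }
}
```

If the parser behaves as described above, the program prints "true" followed by "PI > 3.14159".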

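Another note on "string" values is that the whitespace characters '\t', '\r', '\n' and ' ' are preserved; they are neither replaced nor collapsed. (An XML parser does normalize literal "\r\n" and "\r" line breaks in the document text to "\n" before the value is obtained.) For example, the 3 XML elements below are all valid and represent 3 different "string" values: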

    <String>Herong Yang</String>
    <String>Herong
Yang</String>
    <String>Herong
            Yang</String>
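
Here is a small Java sketch, again using the JAXP DOM parser bundled with the JDK, that makes the whitespace in such values visible. StringWhitespaceDemo, valueOf() and visible() are hypothetical names used only for this illustration, and the exact number of spaces in the third input is an arbitrary choice. The last two inputs are additions that show how line breaks are treated: a literal "\r\n" in the document text, and a carriage return written as the character reference &#13;.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original tutorial code.
class StringWhitespaceDemo {

    // Returns the character content of the root element of a
    // single-element XML document.
    static String valueOf(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML input", e);
        }
    }

    // Replaces whitespace control characters with readable escapes.
    static String visible(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n").replace("\t", "\\t");
    }

    public static void main(String[] args) {
        String[] inputs = {
            "<String>Herong Yang</String>",
            "<String>Herong\nYang</String>",
            "<String>Herong\n        Yang</String>",
            "<String>Herong\r\nYang</String>",    // literal CR LF in the text
            "<String>Herong&#13;Yang</String>"    // CR as a character reference
        };
        for (String xml : inputs) {
            System.out.println("[" + visible(valueOf(xml)) + "]");
        }
    }
}
```

The first three values should come out different from each other, with the line break and the indentation spaces kept intact. The fourth value should equal the second one because of line break normalization, while the fifth should keep its '\r' character.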

A close analogy to the "string" datatype is the <PRE> element in HTML documents, where all whitespace characters are preserved.

Here is a sample XSD document that declares a child element <String> of type "string":

<?xml version="1.0"?>
<!-- string_datatype_test.xsd
 - Copyright (c) 2013, HerongYang.com, All Rights Reserved.
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="String_Datatype_Test">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="String" type="xs:string" 
        maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
</xs:schema>

Here is a sample XML document that contains <String> elements to test this XSD document:

<?xml version="1.1"?>
<!-- string_datatype_test.xml
 - Copyright (c) 2013, HerongYang.com, All Rights Reserved.
-->
<String_Datatype_Test>
<!-- 3 valid "string" elements represent the same value -->
  <String>PI &gt; 3.14159</String>
  <String>PI > 3.14159</String>
  <String><![CDATA[PI > 3.14159]]></String>

<!-- 3 valid "string" elements represent different values -->
  <String>Herong Yang</String>
  <String>Herong
Yang</String>
  <String>Herong
          Yang</String>
          
<!-- 1 invalid "string" element -->
  <String> Hello <b>Herong</b>! </String>
</String_Datatype_Test>

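Compile and run XsdSchemaValidator.java with these two files. You will see 1 error on the invalid "string" element, because an element of a simple type like "string" cannot contain child elements such as <b>: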

c:\Progra~1\Java\jdk1.7.0_07\bin\java.exe XsdSchemaValidator 
   string_datatype_test.xsd string_datatype_test.xml

Error:
   Line number: 19
   Column number: 42
   Message: cvc-type.3.1.2: Element 'String' is a simple type, so it
   must have no element information item [children].

Failed with errors: 1

You can modify this example to try other "string" lexical representations and values.
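
If you want to experiment without separate files, here is a self-contained Java sketch based on the standard javax.xml.validation API. It is not the XsdSchemaValidator.java program used above; the class name InlineXsdValidationDemo and the method validate() are hypothetical, and the schema is a compact copy of string_datatype_test.xsd embedded as a Java string. The sketch collects validation error messages in a list instead of printing line and column numbers.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not the tutorial's XsdSchemaValidator.
class InlineXsdValidationDemo {

    // Compact copy of the schema shown earlier in this section.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='String_Datatype_Test'><xs:complexType><xs:sequence>"
      + "<xs:element name='String' type='xs:string' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Validates the XML text against XSD and returns the error messages.
    static List<String> validate(String xml) {
        final List<String> errors = new ArrayList<>();
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema =
                factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) {
                    errors.add(e.getMessage());   // keep going after errors
                }
                public void fatalError(SAXParseException e)
                        throws SAXParseException {
                    throw e;                      // not well-formed: give up
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Validation could not run", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        String valid = "<String_Datatype_Test>"
            + "<String>PI &gt; 3.14159</String>"
            + "<String>Herong\nYang</String>"
            + "</String_Datatype_Test>";
        String invalid = "<String_Datatype_Test>"
            + "<String> Hello <b>Herong</b>! </String>"
            + "</String_Datatype_Test>";

        System.out.println("Valid sample errors: " + validate(valid).size());
        for (String message : validate(invalid)) {
            System.out.println("Invalid sample error: " + message);
        }
    }
}
```

With the schema processor built into the JDK, the invalid sample is expected to produce a cvc-type.3.1.2 message like the one shown above, but the exact wording may differ between processors.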

Last update: 2013.
