XSD Tutorials - Herong's Tutorial Examples - v5.23, by Herong Yang
"language" Datatype Values and Representations
This section describes the first derived datatype of 'token': 'language'. Input strings are converted to 'token' values before they are matched against the '[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*' pattern.
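In the XSD 1.1 specification, "token" is used to derive several other built-in datatypes for specific applications, because its value set is clean: "token" values contain no tabs or line breaks, no leading or trailing spaces, and no runs of multiple spaces.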
The first built-in datatype derived from "token" is "language". Let's look at it now.
"language" is a datatype derived from the "token" datatype by limiting values to those that satisfy this regular expression pattern: "[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*".
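With this definition, not all sequences of characters are valid "language" lexical representations. To validate and evaluate a "language" lexical representation, you can use these 2 steps:
1. Convert the input string to a "token" value: replace each tab, line feed and carriage return with a space, collapse each run of spaces into a single space, and remove leading and trailing spaces.
2. Match the resulting "token" value against the pattern "[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*". If it matches, it is the "language" value. Otherwise, the input is invalid.
The "language" datatype is designed primarily to support the "xml:lang" attribute in the XML 1.1 specification, which allows users to specify the language in which the content of an XML element is written. Here is an example of <p> elements written in British English and US English: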
<p xml:lang="en-GB">What colour is it?</p>
<p xml:lang="en-US">What color is it?</p>
Other supported language codes are defined in the "IETF BCP 47" standard. Some examples are listed below:
en-US        For US English
fr-CA        For Canadian French
pt-BR        For Brazilian Portuguese
zh-Hans      For Chinese written in Simplified Chinese script
zh-Hant      For Chinese written in Traditional Chinese script
nan-Hant-TW  For Min Nan Chinese as spoken in Taiwan
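The evaluation of a "language" lexical representation (convert the input to a "token" value, then match it against the pattern) can be sketched in Python. This is only an illustration, not how an XSD processor is implemented. The helper names collapse_whitespace() and language_value() are hypothetical, and the inputs are the sample codes listed above plus two made-up invalid strings.

```python
import re
from typing import Optional

# Pattern quoted in this section for the "language" datatype.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")


def collapse_whitespace(text: str) -> str:
    """Hypothetical helper: apply the "token" whitespace rules."""
    # Replace tab, line feed and carriage return with a space.
    text = re.sub(r"[\t\n\r]", " ", text)
    # Collapse runs of spaces, then trim leading/trailing spaces.
    text = re.sub(r" +", " ", text)
    return text.strip(" ")


def language_value(text: str) -> Optional[str]:
    """Hypothetical helper: return the "language" value, or None if invalid."""
    token = collapse_whitespace(text)
    # fullmatch() requires the whole token to satisfy the pattern.
    if LANGUAGE_PATTERN.fullmatch(token):
        return token
    return None


samples = ["en-US", " fr-CA", "pt-BR ", "zh-Hans", "nan-Hant-TW",
           "en US", "Portuguese-BR"]
for sample in samples:
    print(repr(sample), "->", language_value(sample))
```

The last two inputs are rejected: "en US" still contains a space after whitespace processing, and "Portuguese" is longer than the 8 letters allowed in the first subtag. Note that the pattern only checks the shape of a value. It does not check whether the code is actually registered under BCP 47.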
The global attribute "lang" in HTML documents is a good example of using "language" values.
Here is a sample XSD document that defines a sub element <Language> to use "language" values:
<?xml version="1.1"?>
<!-- language_datatype_test.xsd
 - Copyright (c) 2002-2013 HerongYang.com. All Rights Reserved.
-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Language_Datatype_Test">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Language" type="xs:language"
          maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Here is a sample XML document that contains <Language> elements conforming to that definition:
<?xml version="1.1"?>
<!-- language_datatype_test.xml
 - Copyright (c) 2002-2013 HerongYang.com. All Rights Reserved.
-->
<Language_Datatype_Test>

  <!-- 3 valid "Language" elements represent the same value -->
  <Language>en-US</Language>
  <Language> en-US</Language>
  <Language>en-US </Language>

  <!-- 4 valid "Language" elements represent different values -->
  <Language>en</Language>
  <Language>fr-FR</Language>
  <Language>nan-Hant-TW</Language>
  <Language>xx-Xxxx-XXXXXX</Language>

</Language_Datatype_Test>
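To see why the first 3 <Language> elements represent the same value, here is a small Python sketch that reads the element contents and applies the "token" whitespace processing and the "language" pattern to each one. It is not schema validation. It only repeats the pattern check by hand, using the standard xml.etree.ElementTree module. The document is embedded as a string without the XML declaration and comments, which is an assumption made to keep the sketch self-contained. The collapse() helper is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Pattern quoted in this section for the "language" datatype.
LANGUAGE_PATTERN = re.compile(r"[a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*")

# Element contents copied from language_datatype_test.xml
# (XML declaration and comments left out for this sketch).
SAMPLE = """<Language_Datatype_Test>
  <Language>en-US</Language>
  <Language> en-US</Language>
  <Language>en-US </Language>
  <Language>en</Language>
  <Language>fr-FR</Language>
  <Language>nan-Hant-TW</Language>
  <Language>xx-Xxxx-XXXXXX</Language>
</Language_Datatype_Test>"""


def collapse(text):
    """Hypothetical helper: apply the "token" whitespace rules."""
    text = re.sub(r"[\t\n\r]", " ", text)
    return re.sub(r" +", " ", text).strip(" ")


root = ET.fromstring(SAMPLE)
# Convert each lexical representation to its "token" value first.
values = [collapse(elem.text or "") for elem in root.findall("Language")]
# Then check every value against the "language" pattern.
all_valid = all(LANGUAGE_PATTERN.fullmatch(v) for v in values)
distinct_values = sorted(set(values))

print("All valid:", all_valid)
print("Distinct values:", distinct_values)
```

The 7 elements produce 5 distinct values, because leading and trailing spaces are removed before the value is compared with the pattern.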