XML-to-JSON Conversion Rules

This section provides a simple set of XML-to-JSON conversion rules that preserve the structure of the original XML document. You can build your own converter with these rules.

If you are a developer, you may want to implement your own XML-to-JSON converter so that you have full control over the output.

To implement your own XML-to-JSON converter, you need to define a set of conversion rules first. Here is my suggestion for a simple converter that preserves the original XML document structure.

1. Convert the XML root element to a JSON object with a single property that records the root element's tag, content, and attributes. The root element tag is used as the property name.

2. For a simple XML element, one with no attributes and no child elements, convert its content to a JSON primitive String value. If the XML document has XML Schema Instance types added, you may try to parse the element content as different JSON primitive types: String, Number, Boolean, and Null.

3. For a complex XML element, its body, including attributes, child elements, and text segments, is converted to a JSON object.

4. Within a complex XML element, convert the element's attributes first, in document order. Each attribute is converted to a JSON property whose name is the attribute name prefixed with "@_".

5. Within a complex XML element, convert the element's content after the attributes, looping sequentially through the mix of child elements and text segments.

6. Within complex XML content, a child element is converted to a JSON property whose name is the child element name prefixed with "$n_", where n is the position of the child in the content.

7. Within complex XML content, a child text segment is converted to a JSON property with the special name "#n_text", where n is the position of the text segment in the content. Child elements and text segments share the same position counter.
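The rules above can be sketched in Python with the standard xml.etree.ElementTree module. This is only a minimal sketch, not a finished converter. The function names convert_element and xml_to_json are my own, hypothetical names. I also assume that text segments containing only whitespace are skipped and that a simple element is one with no attributes and no child elements. Parsing typed values (rule 2) is left out, so every simple element becomes a String.

```python
import xml.etree.ElementTree as ET


def convert_element(elem):
    """Convert the body of one XML element to a JSON-compatible value."""
    # Rule 2: a simple element (no attributes, no child elements)
    # becomes a plain string; an empty element becomes "".
    if not elem.attrib and len(elem) == 0:
        return elem.text or ""

    # Rule 3: a complex element becomes a JSON object (a Python dict).
    obj = {}

    # Rule 4: attributes first, in document order, prefixed with "@_".
    for name, value in elem.attrib.items():
        obj["@_" + name] = value

    # Rule 5: collect the mixed content in document order.
    # Assumption: whitespace-only text segments are skipped.
    parts = []
    if elem.text and elem.text.strip():
        parts.append(elem.text)
    for child in elem:
        parts.append(child)
        if child.tail and child.tail.strip():
            parts.append(child.tail)

    # Rules 6 and 7: child elements and text segments share one counter.
    for n, part in enumerate(parts, start=1):
        if isinstance(part, str):
            obj[f"#{n}_text"] = part
        else:
            obj[f"${n}_{part.tag}"] = convert_element(part)
    return obj


def xml_to_json(xml_text):
    """Rule 1: wrap the root element in an object with a single property."""
    root = ET.fromstring(xml_text)
    return {root.tag: convert_element(root)}
```

To print the result as JSON text, pass the returned dictionary to json.dumps() with indent=2. Note that whitespace inside a text segment that is kept is not trimmed, so line breaks and indentation in the middle of mixed content are carried into the JSON string.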

Here is a sample XML document, xml.html, that we can use to validate our conversion rules. Note that some elements and attributes are deprecated in the latest HTML specifications, but it is still a well-formed XML document.

<html>
<body bgcolor="#dddddd">
<h4>My XML Web Page</h4>
<p>A simple text paragraph.</p>
<p>A complex paragraph with <em>highlight text</em>, <br/>line breaks,
  <font color="#ff0000" size="+1">font changes</font>, etc.
</p>
</body>
</html>

If you apply our conversion rules, and skip text segments that contain only whitespace, the converted JSON should be:

{
  "html": {
    "$1_body": {
      "@_bgcolor": "#dddddd",
      "$1_h4": "My XML Web Page",
      "$2_p": "A simple text paragraph.",
      "$3_p": {
        "#1_text": "A complex paragraph with ",
        "$2_em": "highlight text",
        "#3_text": ", ",
        "$4_br": "",
        "#5_text": "line breaks,\n  ",
        "$6_font": {
          "@_color": "#ff0000",
          "@_size": "+1",
          "#1_text": "font changes"
        },
        "#7_text": ", etc.\n"
      }
    }
  }
}

I think the output is perfect! The structure of the original XML document is preserved in the output JSON document, so you can easily convert it back to XML. What do you think?
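To illustrate the conversion back to XML, here is a minimal Python sketch of the reverse direction. The function names build_element and json_to_xml are hypothetical names of my own. The sketch assumes that the JSON object keeps its properties in document order, which is true for Python dictionaries loaded with json.loads(). If your JSON parser does not keep the order, you would need to sort the properties by the position n in their names first. Whitespace-only text segments that were skipped in the forward conversion cannot be restored.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Property name patterns taken from the conversion rules.
CHILD_NAME = re.compile(r"^\$\d+_(.+)$")
TEXT_NAME = re.compile(r"^#\d+_text$")


def build_element(tag, value):
    """Rebuild one XML element from its converted JSON value."""
    # A simple element was stored as a plain string.
    if isinstance(value, str):
        return f"<{tag}>{escape(value)}</{tag}>" if value else f"<{tag}/>"

    attrs = []  # serialized attributes
    body = []   # serialized child elements and text segments
    for name, item in value.items():
        if name.startswith("@_"):
            attrs.append(f" {name[2:]}={quoteattr(item)}")
        elif TEXT_NAME.match(name):
            body.append(escape(item))
        else:
            match = CHILD_NAME.match(name)
            if match is None:
                raise ValueError(f"unexpected property name: {name}")
            body.append(build_element(match.group(1), item))

    start = f"<{tag}{''.join(attrs)}"
    content = "".join(body)
    return f"{start}>{content}</{tag}>" if content else f"{start}/>"


def json_to_xml(doc):
    """The top-level object has a single property: the root element."""
    if len(doc) != 1:
        raise ValueError("expected exactly one root property")
    (tag, value), = doc.items()
    return build_element(tag, value)
```

The output is a single line of XML without the original indentation between elements, since that whitespace was not recorded in the JSON document.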

Exercise: Build an XML-to-JSON converter in your favorite language using the above conversion rules.
