DOCTYPE Element in HTML Documents

This section provides a quick introduction to the DOCTYPE element in HTML documents.

According to the HTML specification, every HTML document should have a "DOCTYPE" element located before the "html" element. The "DOCTYPE" element is used to specify which DTD (Document Type Definition) schema this HTML document follows.

For example, the following HTML document, HTML-Example.html, uses a "DOCTYPE" element to specify the HTML 4.0 Transitional (loose) DTD schema:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
 "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body bgcolor="#ddddff"><p>Hello World!</p></body></html>

If you dump the DOMNode object tree, you should see that the "DOCTYPE" element is represented as the first child node of the HTML document:

herong> php Traverses-HTML-DOM-Tree.php HTML-Example.html

#document= -1, 2, 13 other: (...)
 html= -1, -1, 10 other: (...)
 html= 0, 1, 1 other: (...)
  body= 1, 1, 1 other: (...)
   @bgcolor= -1, 1, 2 attribute: (#ddddff)
   @ #text= -1, -1, 3 text: (#ddddff)
   p= 0, 1, 1 other: (...)
    #text= -1, -1, 3 text: (Hello World!)

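As you can see from the output, the first child node of "#document" is named "html" and ends with the number 10, which matches the XML_DOCUMENT_TYPE_NODE node type. This node represents the "DOCTYPE" element. The second child node, also named "html", has node type 1 (XML_ELEMENT_NODE) and represents the "html" element itself.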
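The DOCTYPE node can also be inspected directly through the doctype property of DOMDocument. Here is a minimal sketch, assuming the DOM extension is enabled. The HTML is passed as an inline string (a copy of the HTML-Example.html content) instead of being read from the file, and the variable names are only illustrative:

```php
<?php
// Inline copy of the HTML-Example.html content (assumption: same markup as the file).
$html = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"' . "\n"
      . ' "http://www.w3.org/TR/REC-html40/loose.dtd">' . "\n"
      . '<html><body bgcolor="#ddddff"><p>Hello World!</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// The doctype property returns a DOMDocumentType object (or null if there is none).
$doctype = $doc->doctype;

// The same node is also reachable as the first child of the document.
$first = $doc->firstChild;

echo "nodeName: ", $first->nodeName, "\n";   // name given in the DOCTYPE: html
echo "nodeType: ", $first->nodeType, "\n";   // XML_DOCUMENT_TYPE_NODE
echo "publicId: ", $doctype->publicId, "\n";
echo "systemId: ", $doctype->systemId, "\n";
```

The publicId and systemId properties hold the two quoted strings from the "DOCTYPE" element, so they can be used to check which DTD a document claims to follow.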

By the way, loadHTML() automatically adds a default "DOCTYPE" element to the DOMNode object tree if the original HTML document string does not have one. See the example below:

herong> type Hello-Formatted.html

<html>
  <head>
    <title>
      Hello
    </title>
  </head>
  <body bgcolor="#ddddff">
    <p>
      Hello World!
    </p>
  </body>
</html>

herong> php Remove-Whitespaces-in-HTML.php Hello-Formatted.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" ...>
<html><head><title>Hello</title></head><body bgcolor="#ddddff">\
<p>Hello World!</p></body></html>
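If the default "DOCTYPE" element is not wanted, loadHTML() accepts libxml option flags as its second argument. The sketch below is not taken from Remove-Whitespaces-in-HTML.php. It assumes a PHP build whose libxml supports the LIBXML_HTML_NODEFDTD constant, and it uses a shortened inline HTML string for illustration:

```php
<?php
// HTML string without any DOCTYPE (shortened, illustrative input).
$html = '<html><body bgcolor="#ddddff"><p>Hello World!</p></body></html>';

// Default behavior: the parser adds a default DOCTYPE node.
$withDefault = new DOMDocument();
$withDefault->loadHTML($html);

// LIBXML_HTML_NODEFDTD asks the parser not to add a default DOCTYPE.
$withoutDefault = new DOMDocument();
$withoutDefault->loadHTML($html, LIBXML_HTML_NODEFDTD);

echo "Default load has DOCTYPE: ";
var_dump($withDefault->doctype !== null);

echo "NODEFDTD load has DOCTYPE: ";
var_dump($withoutDefault->doctype !== null);

// Serialize both trees to compare the generated markup.
echo $withDefault->saveHTML();
echo $withoutDefault->saveHTML();
```

Comparing the two saveHTML() results shows whether the "DOCTYPE" line in the output came from the source document or was added by the parser.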
