Chinese Character String with GB18030 Encoding

This section provides information on handling Chinese character string literals in GB18030 encoding. In PHP 7, the header() function must be called to override the charset in the 'Content-Type' HTTP response header and set it to GB18030.

To troubleshoot the issue presented in the previous tutorial, we should follow the PHP String Literal travel path:

P1. Key Sequences from keyboard
      |
      |- Text editor
      v
P2. PHP Script
      |
      |- PHP-CGI
      v
P3. HTML Document
      |
      |- Web server
      v
P4. HTTP Response
...

1. Open the PHP script file, String-GB18030.php, again with a different text editor. The Chinese characters are displayed correctly, so the script file itself is not the problem.

2. Run PHP-CGI.exe to process the PHP script and generate the HTML document:

C:\herong> \local\php\php-cgi String-GB18030.php
X-Powered-By: PHP/7.3.0
Content-type: text/html; charset=UTF-8

<html>
<meta http-equiv="Content-Type" content="text/html; charset=gb18030"/>
<body>
<b>Chinese string in GB18030 in PHP</b><br/>
??????????????????????????<br/>
</body>
</html>

I think I know the root cause of the issue. PHP-CGI.exe in PHP 7 automatically adds "charset=UTF-8" to the Content-Type HTTP response header line:

Content-type: text/html; charset=UTF-8

The Web browser takes the encoding setting from the HTTP response header instead of from the <meta> tag in the HTML document, so it decodes the GB18030 bytes as UTF-8.
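
The mismatch can be illustrated with a minimal sketch. It assumes a hypothetical two-character sample (U+4E2D U+6587), which is not the string used in the tutorial script, and it uses preg_match() with the /u modifier only as a convenient UTF-8 validity check. The GB18030 bytes do not form a valid UTF-8 sequence, which is why a browser told to decode them as UTF-8 cannot show the intended characters.

```php
<?php
#- Utf8-Validity-Sketch.php (hypothetical file name, for illustration only)
#
  # The same two characters, U+4E2D U+6587, written as raw bytes
  $gb18030Bytes = "\xD6\xD0\xCE\xC4";          # GB18030 encoding
  $utf8Bytes    = "\xE4\xB8\xAD\xE6\x96\x87";  # UTF-8 encoding

  # With the /u modifier, preg_match() returns false
  # when the subject string is not valid UTF-8
  $gbIsValidUtf8   = (preg_match('//u', $gb18030Bytes) === 1);
  $utf8IsValidUtf8 = (preg_match('//u', $utf8Bytes) === 1);

  print(bin2hex($gb18030Bytes).' valid as UTF-8: '.
    var_export($gbIsValidUtf8, true)."\n");
  print(bin2hex($utf8Bytes).' valid as UTF-8: '.
    var_export($utf8IsValidUtf8, true)."\n");
```

Run from the command line, the sketch should report "false" for the GB18030 bytes (d6d0cec4) and "true" for the UTF-8 bytes (e4b8ade69687).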

I had no problem with PHP 5, because it did not specify any encoding in the HTTP response header:

C:\herong> \local\php-5\php-cgi String-GB18030.php

Content-type: text/html
X-Powered-By: PHP/5.0.4

<html>
<meta http-equiv="Content-Type" content="text/html; charset=gb18030"/>
<body>
<b>Chinese string in GB18030 in PHP</b><br/>
??????????????????????????<br/>
</body>
</html>
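
The difference between the two versions is likely explained by the default_charset configuration setting. This is an assumption about these particular installations, not something the outputs above state. The built-in default of default_charset has been "UTF-8" since PHP 5.6 and was empty before that. A minimal sketch to check what a given installation uses:

```php
<?php
#- Default-Charset-Check.php (hypothetical file name, for illustration only)
#
  # ini_get() returns the current value of a configuration setting
  $charset = ini_get('default_charset');

  print('PHP version: '.PHP_VERSION."\n");
  print('default_charset: '.
    ($charset === '' ? '(empty)' : var_export($charset, true))."\n");
```

If the value is "UTF-8", PHP appends "charset=UTF-8" to the Content-Type header unless the script overrides it.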

To fix the issue, we need to call the header() function to override the default header line early in the PHP script, before any text of the HTML document is output:

<?php 
#- String-GB18030-Fixed.php
#- Copyright (c) 2005 HerongYang.com. All Rights Reserved.
#
  header('Content-Type: text/html; charset=GB18030');
  $help = '?????????????';
  print('<html>');
  print('<meta http-equiv="Content-Type"'.
    ' content="text/html; charset=gb18030"/>');
  print('<body>');
  print('<b>Chinese string in GB18030 in PHP - Fixed</b><br/>');
  print($help.'<br/>');
  print('</body>');
  print('</html>');
?>
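
Because header() only works before any output has been sent, a script can guard the call with headers_sent(). The following is a minimal sketch, not part of the original script. The helper name send_charset_header() is hypothetical and introduced here only for illustration.

```php
<?php
#- Header-Guard-Sketch.php (hypothetical file name, for illustration only)
#
  # Hypothetical helper: sets the charset only if headers can still be changed
  function send_charset_header($charset) {
    if (headers_sent($file, $line)) {
      # Too late: output already started at $file, line $line
      return false;
    }
    header('Content-Type: text/html; charset='.$charset);
    return true;
  }

  # Call the helper before the first print() statement
  $ok = send_charset_header('GB18030');
  print('<html><body>');
  print($ok ? 'charset header set' : 'charset header NOT set');
  print('</body></html>');
```

If the helper returns false, some output (even a blank line before the opening PHP tag) was sent before the call, and the default header line was already used.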

Now copy String-GB18030-Fixed.php to \local\apache\htdocs.

Run Internet Explorer (IE) again with http://localhost/String-GB18030-Fixed.php. You should see Chinese characters displayed correctly:

Chinese Web Page Generated by PHP using GB18030

This confirms that the text editor (Notepad), the CGI program (PHP-CGI), the Web server (Apache), and the Web browser (IE) all work correctly with Chinese characters in GB18030 encoding.
