Non-ASCII Test Result Analysis

This section provides a detailed analysis of the test results of a PHP script storing non-ASCII characters.

In previous sections, we reviewed MySQL's character set encoding conversions happened between client programs and MySQL server. Now I want to look again at the PHP test script that stores non-ASCII characters to Comment_Mixed table's text columns defined with different encodings.

<?php 
#- MySQL-Insert-Non-ASCII-Review.php
#- Copyright (c) 2005 HerongYang.com. All Rights Reserved.
#
  $con = mysqli_connect("localhost", "Herong", "TopSecret");
  $ok = mysqli_select_db($con, "HerongDB");
  $test_name = "Non-ASCII Test";

# Show character set encoding variables
  $sql = "SHOW VARIABLES LIKE 'character_set_%'";
  $res = mysqli_query($con, $sql);
  while ($row = mysqli_fetch_array($res)) {
    print($row['Variable_name']." = ".$row['Value']."\n");
  }  
  mysqli_free_result($res);

# Delete the record
  $sql = "DELETE FROM Comment_Mixed WHERE Test_Name ='$test_name'";
  mysqli_query($con, $sql);
  print("\nNumber of rows deleted: ".mysqli_affected_rows($con)."\n");

# Build the SQL INSERT statement
  $sql = <<<END_OF_MESSAGE
INSERT INTO Comment_Mixed (Test_name, String_ASCII, 
    String_Latin1, String_UTF8, String_GBK, String_Big5)
  VALUES ('$test_name', 
     'Television', 
    X'54E96CE9766973696E6F', 
    X'E794B5E8A786E69CBA2FE99BBBE8A696E6A99F', 
    X'B5E7CAD3BBFA', 
    X'B971B5F8BEF7');
END_OF_MESSAGE;

# Run the SQL statement
  if (mysqli_query($con, $sql)) {
    print("\nNumber of rows inserted: ".mysqli_affected_rows($con)."\n");
  } else {
    print("SQL statement failed.\n");
    print(mysqli_errno($con).": ".mysqli_error($con)."\n"); 
  }

# Get the recode back
  $sql = "SELECT * FROM Comment_Mixed"
    . " WHERE Test_Name = '$test_name'";
  $res = mysqli_query($con, $sql);
  if ($row = mysqli_fetch_array($res)) {
    print("\nTest Name = ".$row['Test_Name']."\n");
    print("   String_ASCII: 0x".bin2hex($row['String_ASCII'])."\n");
    print("   String_Latin1: 0x".bin2hex($row['String_Latin1'])."\n");
    print("   String_UTF8: 0x".bin2hex($row['String_UTF8'])."\n");
    print("   String_GBK: 0x".bin2hex($row['String_GBK'])."\n");
    print("   String_Big5: 0x".bin2hex($row['String_Big5'])."\n");
  }  
  mysqli_free_result($res);

  mysqli_close($con); 
?>

The test PHP script was modified to show the current character set settings. Here is the output from PHP 7 and MySQL 8:

C:\herong> \local\php\php MySQL-Insert-Non-ASCII-Review.php

character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = utf8mb4
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = utf8mb4
character_set_system = utf8
character_sets_dir = \local\mysql\share\charsets\

Number of rows deleted: 1

Number of rows inserted: 1

Test Name = Non-ASCII Test
   String_ASCII: 0x54656c65766973696f6e
   String_Latin1: 0x54c3a96cc3a9766973696e6f
   String_UTF8: 0xe794b5e8a786e69cba2fe99bbbe8a696e6a99f
   String_GBK: 0xe794b5e8a786e69cba
   String_Big5: 0xe99bbbe8a696e6a99f

We know that the test failed on 3 columns: String_Latin1, String_GBK, and String_Big5. Based on MySQL character set variable settings, the INSERT statement was executed successfully and non-ASCII characters (represented as HEX values) were stored correctly. Here is my explanation:

But when the SELECT statement was executed, non-ASCII characters fetched from those non-ASCII columns were not properly transferred from MySQL server to PHP. Remember that all text data fetched from database columns must be converted to "character_set_results" encoding, utf8mb4. That means:

By the way, if you are using different version of MySQL, you may have different default values from character set variables. That will result different conversion issues.

Table of Contents

 About This Book

 PHP Installation on Windows Systems

 Integrating PHP with Apache Web Server

 charset="*" - Encodings on Chinese Web Pages

 Chinese Characters in PHP String Literals

 Multibyte String Functions in UTF-8 Encoding

 Input Text Data from Web Forms

 Input Chinese Text Data from Web Forms

 MySQL - Installation on Windows

 MySQL - Connecting PHP to Database

MySQL - Character Set and Encoding

 Specifying Character Set for Text Columns

 Creating a Table with Multiple Character Sets

 Checking Character Set Setting

 Storing ASCII Characters in Non-ASCII Columns

 Storing Non-ASCII Characters with Encoded Bytes

 Transmitting Non-ASCII Characters between PHP and MySQL

 Viewing Character Set Variables

Non-ASCII Test Result Analysis

 Fetching Non-ASCII Text from MySQL

 MySQL Encoding Issue Example

 MySQL - Sending Non-ASCII Text to MySQL

 Retrieving Chinese Text from Database to Web Pages

 Input Chinese Text Data to MySQL Database

 Chinese Text Encoding Conversion and Corruptions

 Archived Tutorials

 References

 Full Version in PDF/EPUB