GB2312 Location Codes and Native Codes

A GB2312 Location Code gives the position of a character in the GB2312 table. A GB2312 Native Code is a pair of 7-bit bytes derived from the Location Code.

As part of the GB2312 standard, each character is assigned 2 codes: a Location Code and a Native Code.

Here are more detailed descriptions of Location Code and Native Code:

1. What Is Location Code? - The Location Code of a GB2312 character is the combination of the row number (区) and the column number (位) of the character's position in the GB2312 table.

For example, the Chinese character 啊 is located at row 16 and column 1 in the GB2312 table. So the Location Code of 啊 is (16,1).

Since there are 94 rows and 94 columns in the GB2312 table, Location Codes range from (1,1) to (94,94).
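As a rough illustration of the range described above, the following Java sketch checks whether a (row, column) pair falls inside the 94 x 94 table. The class name `LocationCodeSketch` and the method `isValid` are hypothetical names chosen for this example. The check only tests table bounds; it does not tell whether a character is actually assigned at that position.

```java
// Minimal sketch: bounds check for a GB2312 Location Code (row, column).
// Names are illustrative only; this is not part of any standard API.
final class LocationCodeSketch {
    static final int MIN = 1;   // first row / column number
    static final int MAX = 94;  // last row / column number

    // Returns true if (row, col) lies inside the 94 x 94 GB2312 table.
    // It does NOT check whether a character is assigned at that position.
    static boolean isValid(int row, int col) {
        return row >= MIN && row <= MAX && col >= MIN && col <= MAX;
    }

    public static void main(String[] args) {
        System.out.println(isValid(16, 1));   // true
        System.out.println(isValid(94, 94));  // true
        System.out.println(isValid(0, 1));    // false
        System.out.println(isValid(16, 95));  // false
        System.out.println(MAX * MAX);        // 8836 positions in total
    }
}
```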

2. What Is Native Code? - The Native Code of a GB2312 character is a sequence of 2 bytes that represents the character in computer systems. The first byte of the code is called the high byte, and the second byte is called the low byte.

The high byte is derived from the row number of the character by adding 32 to the row number value.

The low byte is derived from the column number of the character by adding 32 to the column number value.

For example, the Chinese character 啊 has a Location Code of (16,1). So its high byte is 0x30, because 16 + 32 = 48, or 0x30. Its low byte is 0x21, because 1 + 32 = 33, or 0x21. Putting them together, the Native Code of 啊 is 0x3021.
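The derivation above can be sketched in Java as follows. This is only an illustration of the "add 32" rule; the class name `NativeCodeSketch` and its methods are hypothetical, and packing the two bytes into one `int` is just a convenience for printing.

```java
// Minimal sketch: derive a GB2312 Native Code from a Location Code.
// Names are illustrative only; this is not part of any standard API.
final class NativeCodeSketch {
    static final int OFFSET = 32; // 0x20, added to both row and column

    // High byte = row number + 32.
    static int highByte(int row) {
        return row + OFFSET;
    }

    // Low byte = column number + 32.
    static int lowByte(int col) {
        return col + OFFSET;
    }

    // Packs high and low bytes into one value, e.g. 0x3021 for (16,1).
    static int nativeCode(int row, int col) {
        if (row < 1 || row > 94 || col < 1 || col > 94) {
            throw new IllegalArgumentException("Location Code out of range");
        }
        return (highByte(row) << 8) | lowByte(col);
    }

    public static void main(String[] args) {
        System.out.printf("0x%02X%n", highByte(16));       // 0x30
        System.out.printf("0x%02X%n", lowByte(1));         // 0x21
        System.out.printf("0x%04X%n", nativeCode(16, 1));  // 0x3021
    }
}
```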

I guess the reason for adding 32 to both the row number and the column number is to keep the resulting byte values out of the lowest value range, 0x00 to 0x1F. In computer systems, byte values in that range are usually reserved for control codes.

Native Codes range from (0x21,0x21) to (0x7E,0x7E), since there are only 94, or 0x5E, rows and 94, or 0x5E, columns in the GB2312 table.
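Going the other way, a Native Code can be turned back into a Location Code by subtracting 32 from each byte. The Java sketch below assumes both bytes are passed as plain `int` values; the class name `NativeToLocationSketch` and the method `toLocation` are hypothetical names used only for this example.

```java
import java.util.Arrays;

// Minimal sketch: recover a Location Code from a GB2312 Native Code.
// Names are illustrative only; this is not part of any standard API.
final class NativeToLocationSketch {
    static final int MIN_BYTE = 0x21; // 1 + 32
    static final int MAX_BYTE = 0x7E; // 94 + 32

    // Returns {row, column} for the given high and low bytes.
    static int[] toLocation(int high, int low) {
        if (high < MIN_BYTE || high > MAX_BYTE || low < MIN_BYTE || low > MAX_BYTE) {
            throw new IllegalArgumentException("Native Code byte out of range");
        }
        return new int[] { high - 32, low - 32 };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toLocation(0x30, 0x21))); // [16, 1]
        System.out.println(Arrays.toString(toLocation(0x7E, 0x7E))); // [94, 94]
    }
}
```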

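GB2312 Native Codes are perfectly good for storing Chinese documents as computer files and transmitting them over computer networks without any problem, because both bytes are 7-bit values in the range 0x21 to 0x7E, which stays clear of the byte values reserved for control codes.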

However, GB2312 Native Codes are not compatible with ASCII Codes. In other words, GB2312 Native Codes and ASCII Codes cannot be mixed together in a single file. This is because there is no way to tell whether a given byte is an ASCII Code or the high or low byte of a GB2312 Native Code.

For example, the byte 0x30 in a file that mixes GB2312 Native Codes and ASCII Codes could be the ASCII '0' character, or the high byte of the GB2312 character 啊.
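The ambiguity can be made concrete with a short Java sketch that reads the same 2 bytes in both ways. The class name `AmbiguitySketch` and its methods are hypothetical; only `String(byte[], Charset)` and `StandardCharsets.US_ASCII` come from the Java standard library.

```java
import java.nio.charset.StandardCharsets;

// Minimal sketch: the same 2 bytes read as ASCII text and as a
// GB2312 Native Code. Names are illustrative only.
final class AmbiguitySketch {
    // Interprets every byte as one ASCII character.
    static String asAscii(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    // Interprets the first 2 bytes as a Native Code and returns
    // the Location Code in the form "(row,column)".
    static String asLocation(byte[] data) {
        int row = data[0] - 32;
        int col = data[1] - 32;
        return "(" + row + "," + col + ")";
    }

    public static void main(String[] args) {
        byte[] data = { 0x30, 0x21 };
        System.out.println(asAscii(data));    // 0!
        System.out.println(asLocation(data)); // (16,1)
    }
}
```

Nothing in the bytes themselves indicates which of the two readings was intended.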

The next section describes some solutions to this problem.
