Unicode Tutorials - Herong's Tutorial Examples - v5.32, by Herong Yang
https://www.herongyang.com/Unicode
Copyright © 1995-2024 Herong Yang. All rights reserved.
This Unicode tutorial book is a collection of notes and sample code written by the author while he was learning Unicode. Topics include Character Sets and Encodings; GB2312/GB18030 Character Sets and Encodings; JIS X0208 Character Set and Encodings; Unicode Character Set; Basic Multilingual Plane (BMP); Unicode Transformation Formats (UTF); Surrogates and Supplementary Characters; Unicode Character Blocks; Python Support of Unicode Characters; Java Character Set and Encoding; Java Encoding Maps, Counts and Conversion. Updated in 2024 (Version v5.32) with minor changes.
Table of Contents
Commonly Used Character Sets and Encodings
ASCII Character Set and Encoding
Listing of ASCII Characters and Encoded Bytes
GB2312 Character Set and Encoding
GB2312 Character Set for Chinese Characters
GB2312 Encoding for GB2312 Character Set
Relation of GB2312 and Unicode
GB18030 Character Set and Encoding
GB18030 Encoding for GB18030 Character Set
JIS X0208 Character Set and Encodings
JIS X0208 Character Set for Japanese Characters
JIS X0208 Character Code Values
Examples of Unicode Characters
Unicode 13.0 Character Samples
UTF-8 (Unicode Transformation Format - 8-Bit)
UTF-16, UTF-16BE and UTF-16LE Encodings
UTF-32, UTF-32BE and UTF-32LE Encodings
Python Language and Unicode Characters
Summary of Unicode Support in Python
Unicode Support on "str" Data Type
Unicode Character Encoding and Decoding
"unicodedata" Module for Unicode Properties
Java Language and Unicode Characters
Unicode Versions Supported in Java History
'int' and 'String' - Basic Data Types for Unicode
"Character" Class with Unicode Utility Methods
Character.toChars() - "char" Sequence of Code Point
Character.getNumericValue() - Numeric Value of Code Point
"String" Class with Unicode Utility Methods
String.length() Is Not Number of Characters
String.toCharArray() Returns the UTF-16BE Sequence
String Literals and Source Code Encoding
List of Supported Character Encodings in Java
EncodingSampler.java - Testing encode() Methods
Examples of CP1252 and ISO-8859-1 Encodings
Examples of US-ASCII, UTF-8, UTF-16 and UTF-32 Encodings
Character Set Encoding Map Analyzer
Character Set Encoding Maps - US-ASCII and ISO-8859-1/Latin 1
Character Set Encoding Maps - CP1252/Windows-1252
Character Set Encoding Maps - Unicode UTF-8
Character Set Encoding Maps - Unicode UTF-16, UTF-16BE, UTF-16LE
Character Set Encoding Maps - Unicode UTF-32, UTF-32BE, UTF-32LE
Character Counter Program for Any Given Encoding
Character Set Encoding Comparison
Encoding Conversion Programs for Encoded Text Files
\uxxxx - Entering Unicode Data in Java Programs
HexWriter.java - Converting Encoded Byte Sequences to Hex Values
EncodingConverter.java - Encoding Conversion Sample Program
Viewing Encoded Text Files in Web Browsers
Unicode Signs in Different Encodings
Using Notepad as a Unicode Text Editor
Byte Order Mark (BOM) - FEFF - EFBBBF
Saving Files in "Unicode Big Endian" Option
Saving Files in "Unicode" Option
Supported Save and Open File Formats
Using Microsoft Word as a Unicode Text Editor
Saving Files in "Unicode (UTF-8)" Option
Saving Files in "Unicode (Big-Endian)" Option
Saving Files in "Unicode" Option
Supported Save and Open File Formats
Using Microsoft Excel as a Unicode Text Editor
Saving Files in "Unicode Text (*.txt)" Option
Supported Save and Open File Formats
Downloading and Installing GNU Unifont
Keywords: Unicode, Universal, Character, Encoding, Tutorial, Book