Unicode Tutorials - Herong's Tutorial Examples - v5.32, by Herong Yang
Character Set Encoding Maps - CP1252/Windows-1252
This section provides a tutorial example of analyzing and printing the character set encoding map for CP1252/Windows-1252, the default encoding for Java SE on Windows systems configured for English and other Western European locales.
Here is the output of my sample program, EncodingAnalyzer2.java, for CP1252/Windows-1252 encoding with Java SE 7:
C:\herong>java EncodingAnalyzer2 CP1252
CP1252 encoding:
00000000 > 00 - 0000007F > 7F
00000080 > 3F - 0000009F > 3F: Invalid range
000000A0 > A0 - 000000FF > FF
00000100 > 3F - 00000151 > 3F: Invalid range
00000152 > 8C - 00000152 > 8C
00000153 > 9C - 00000153 > 9C
00000154 > 3F - 0000015F > 3F: Invalid range
00000160 > 8A - 00000160 > 8A
00000161 > 9A - 00000161 > 9A
00000162 > 3F - 00000177 > 3F: Invalid range
00000178 > 9F - 00000178 > 9F
00000179 > 3F - 0000017C > 3F: Invalid range
0000017D > 8E - 0000017D > 8E
0000017E > 9E - 0000017E > 9E
0000017F > 3F - 00000191 > 3F: Invalid range
00000192 > 83 - 00000192 > 83
00000193 > 3F - 000002C5 > 3F: Invalid range
000002C6 > 88 - 000002C6 > 88
000002C7 > 3F - 000002DB > 3F: Invalid range
000002DC > 98 - 000002DC > 98
000002DD > 3F - 00002012 > 3F: Invalid range
00002013 > 96 - 00002014 > 97
00002015 > 3F - 00002017 > 3F: Invalid range
00002018 > 91 - 00002019 > 92
0000201A > 82 - 0000201A > 82
0000201B > 3F - 0000201B > 3F: Invalid range
0000201C > 93 - 0000201D > 94
0000201E > 84 - 0000201E > 84
0000201F > 3F - 0000201F > 3F: Invalid range
00002020 > 86 - 00002021 > 87
00002022 > 95 - 00002022 > 95
00002023 > 3F - 00002025 > 3F: Invalid range
00002026 > 85 - 00002026 > 85
00002027 > 3F - 0000202F > 3F: Invalid range
00002030 > 89 - 00002030 > 89
00002031 > 3F - 00002038 > 3F: Invalid range
00002039 > 8B - 00002039 > 8B
0000203A > 9B - 0000203A > 9B
0000203B > 3F - 000020AB > 3F: Invalid range
000020AC > 80 - 000020AC > 80
000020AD > 3F - 00002121 > 3F: Invalid range
00002122 > 99 - 00002122 > 99
00002123 > 3F - 0010FFFF > 3F: Invalid range
Code Point > Byte Sequence - Code Point > Byte Sequence
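The 3F bytes in the output above can be reproduced with a small probe. The following is only a minimal sketch and is not the EncodingAnalyzer2.java program itself: the class name Cp1252Probe and its methods strictHex, lenientHex and toHex are hypothetical names chosen for illustration. It contrasts a strict CharsetEncoder, which reports unmappable code points, with String.getBytes(), which substitutes the replacement byte 0x3F ("?").

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class; not part of the original tutorial code.
public class Cp1252Probe {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Strict encoding: returns the byte sequence in hex, or null
    // when the code point cannot be encoded in CP1252.
    static String strictHex(int codePoint) {
        CharsetEncoder encoder = CP1252.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            CharBuffer chars = CharBuffer.wrap(Character.toChars(codePoint));
            return toHex(encoder.encode(chars));
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // Lenient encoding: String.getBytes() silently substitutes the
    // charset's replacement byte for anything it cannot encode.
    static String lenientHex(int codePoint) {
        String text = new String(Character.toChars(codePoint));
        return toHex(ByteBuffer.wrap(text.getBytes(CP1252)));
    }

    // Formats the remaining bytes of the buffer as upper-case hex.
    static String toHex(ByteBuffer bytes) {
        StringBuilder sb = new StringBuilder();
        while (bytes.hasRemaining()) {
            sb.append(String.format("%02X", bytes.get() & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] samples = {0x41, 0x80, 0xFF, 0x152, 0x20AC, 0x2122};
        for (int cp : samples) {
            System.out.println(String.format("%08X", cp)
                    + " strict=" + strictHex(cp)
                    + " lenient=" + lenientHex(cp));
        }
    }
}
```

A code point whose strict result is null while its lenient result is 3F corresponds to an entry marked "Invalid range" in the output above. The only code point that legitimately encodes to 3F is U+003F itself.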
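As the output above shows, the encoding map of CP1252/Windows-1252 is not so simple. Code points U+0000 to U+007F and U+00A0 to U+00FF map to single bytes of the same value, as in ISO-8859-1, but bytes 0x80 to 0x9F are assigned to 27 characters scattered between U+0152 and U+2122, such as the euro sign U+20AC at byte 0x80. Five bytes in that range (0x81, 0x8D, 0x8F, 0x90 and 0x9D) are not used at all. Every code point listed under "Invalid range" is encoded as 0x3F, the question mark that Java substitutes for characters the encoding cannot represent.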
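To check which bytes in the 0x80 to 0x9F range are actually used by the encoding map, one option is to scan the Basic Multilingual Plane and record every byte that some character encodes to. This is a rough sketch under that assumption, not the author's analyzer; the class name Cp1252HighBytes and the method unusedHighBytes are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; not part of the original tutorial code.
public class Cp1252HighBytes {

    // Returns the byte values in 0x80-0x9F that no BMP character encodes to.
    static List<Integer> unusedHighBytes() {
        Charset cp1252 = Charset.forName("windows-1252");
        CharsetEncoder encoder = cp1252.newEncoder();
        boolean[] used = new boolean[256];

        for (int cp = 0; cp <= 0xFFFF; cp++) {
            char c = (char) cp;
            // Skip surrogates and characters CP1252 cannot represent.
            if (Character.isSurrogate(c) || !encoder.canEncode(c)) {
                continue;
            }
            byte[] bytes = String.valueOf(c).getBytes(cp1252);
            if (bytes.length == 1) {
                used[bytes[0] & 0xFF] = true;
            }
        }

        List<Integer> unused = new ArrayList<Integer>();
        for (int b = 0x80; b <= 0x9F; b++) {
            if (!used[b]) {
                unused.add(b);
            }
        }
        return unused;
    }

    public static void main(String[] args) {
        for (int b : unusedHighBytes()) {
            System.out.println(String.format("Unused byte: %02X", b));
        }
    }
}
```

Only characters that pass canEncode() are counted, so the 0x3F replacement byte produced for unmappable characters never marks a byte as used.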