Unicode Signs in Different Encodings

This section provides a tutorial example on how to write a sample program that saves some Unicode signs in various encodings, and how to view the results in a Web browser.

I wanted to play with my utility programs mentioned in this chapter one more time with some Unicode signs. So I copied UnicodeHello.java and created UnicodeSign.java:

/**
 * UnicodeSign.java
 * Copyright (c) 2009, HerongYang.com, All Rights Reserved.
 *
 * This program is a simple tool to allow you to enter several lines of
 * text, and write them into a file with the specified encoding
 * (charset name). The input text lines use Java string convention,
 * which allows you to enter ASCII characters directly, and any
 * non-ASCII characters with escape sequences.
 *
 * This version of the program is to write out some interesting signs.
 */
import java.io.*;
class UnicodeSign {
   public static void main(String[] a) {
      // The following Array contains text to be saved into the output
      // File. To enter your own text, just replace this Array.
      String[] text = {
"U+005C(\\)REVERSE SOLIDUS", //\u005C is '\', cannot be entered directly
"U+007E(\u007E)TILDE",
"U+00A2(\u00A2)CENT SIGN",
"U+00A3(\u00A3)POUND SIGN",
"U+00A5(\u00A5)YEN SIGN",
"U+00A6(\u00A6)BROKEN BAR",
"U+00A7(\u00A7)SECTION SIGN",
"U+00A9(\u00A9)COPYRIGHT SIGN",
"U+00AC(\u00AC)NOT SIGN",
"U+00AE(\u00AE)REGISTERED SIGN",
"U+2022(\u2022)BULLET",
"U+2023(\u2023)TRIANGULAR BULLET",
"U+203B(\u203B)REFERENCE MARK",
"U+2043(\u2043)HYPHEN BULLET",
"U+FF04(\uFF04)FULLWIDTH DOLLAR SIGN",
"U+FF05(\uFF05)FULLWIDTH PERCENT SIGN",
"U+FF08(\uFF08)FULLWIDTH LEFT PARENTHESIS",
"U+FF09(\uFF09)FULLWIDTH RIGHT PARENTHESIS",
"U+FF10(\uFF10)FULLWIDTH DIGIT ZERO",
"U+FF11(\uFF11)FULLWIDTH DIGIT ONE",
"U+FF21(\uFF21)FULLWIDTH LATIN CAPITAL LETTER A",
"U+FF22(\uFF22)FULLWIDTH LATIN CAPITAL LETTER B",
"U+FF41(\uFF41)FULLWIDTH LATIN SMALL LETTER A",
"U+FF42(\uFF42)FULLWIDTH LATIN SMALL LETTER B",
"U+FFE0(\uFFE0)FULLWIDTH CENT SIGN",
"U+FFE1(\uFFE1)FULLWIDTH POUND SIGN",
"U+FFE5(\uFFE5)FULLWIDTH YEN SIGN"
      };
      String outFile = "sign.utf-16be";
      if (a.length>0) outFile = a[0];
      String outCharsetName = "utf-16be";
      if (a.length>1) outCharsetName = a[1];
      String crlf = System.getProperty("line.separator");
      try {
         OutputStreamWriter out = new OutputStreamWriter(
            new FileOutputStream(outFile), outCharsetName);
         for (int i=0; i<text.length; i++) {
            out.write(text[i]);
            out.write(crlf);
         }
         out.close();
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
}

Then I ran this program, and converted the output file with different encodings:

javac UnicodeSign.java
java UnicodeSign sign.utf-16be utf-16be
java EncodingConverter sign.utf-16be utf-16be sign.utf-8 utf-8
java EncodingHtml sign.utf-8 utf-8
java EncodingConverter sign.utf-16be utf-16be sign.gbk gbk
java EncodingHtml sign.gbk gbk
java EncodingConverter sign.utf-16be utf-16be sign.shift_jis shift_jis
java EncodingHtml sign.shift_jis shift_jis
java EncodingConverter sign.utf-16be utf-16be sign.johab johab
java EncodingHtml sign.johab johab
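
To see what the conversions above do to each sign, it can help to print the encoded bytes directly instead of going through files. The following is only a minimal sketch and is not one of the utility programs from this chapter: the class name SignBytes and its helper methods toHex and describe are hypothetical names introduced here for illustration. It assumes the JDK in use provides the extended charsets (GBK, Shift_JIS, Johab); if one is missing, or a sign has no mapping in a charset, the sketch reports that instead of printing bytes.

```java
import java.nio.charset.Charset;

class SignBytes {
   // Formats a byte array as space-separated uppercase hex pairs.
   static String toHex(byte[] bytes) {
      StringBuilder sb = new StringBuilder();
      for (byte b : bytes) {
         if (sb.length() > 0) sb.append(' ');
         sb.append(String.format("%02X", b & 0xFF));
      }
      return sb.toString();
   }

   // Returns the hex bytes of s in the named charset, or a short note
   // when the charset is unavailable or cannot represent s.
   static String describe(String s, String charsetName) {
      if (!Charset.isSupported(charsetName)) {
         return "(charset not available in this JDK)";
      }
      Charset cs = Charset.forName(charsetName);
      if (!cs.canEncode() || !cs.newEncoder().canEncode(s)) {
         return "(not representable)";
      }
      return toHex(s.getBytes(cs));
   }

   public static void main(String[] a) {
      // A few of the signs used in UnicodeSign.java:
      // CENT SIGN, YEN SIGN, BULLET, FULLWIDTH YEN SIGN
      String[] signs = {"\u00A2", "\u00A5", "\u2022", "\uFFE5"};
      String[] charsets = {"utf-16be", "utf-8", "gbk", "shift_jis", "johab"};
      for (String s : signs) {
         // Print the code point rather than the sign itself, so the
         // output does not depend on the console encoding.
         System.out.println(String.format("U+%04X", (int) s.charAt(0)));
         for (String cs : charsets) {
            System.out.println("   " + cs + ": " + describe(s, cs));
         }
      }
   }
}
```

Compile and run it with "javac SignBytes.java" and "java SignBytes". Checking canEncode() before calling getBytes() matters here, because getBytes() silently substitutes a replacement byte for any character the charset cannot represent, which would hide exactly the cases of interest.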

Then I viewed the different encoded test files with IE, and noticed that:

Last update: 2009.
