HexWriter.java - Converting Encoded Byte Sequences to Hex Values

This section provides a tutorial example on how to write a sample program, HexWriter.java, that converts encoded byte sequences to Hex values to help view encoded text files.

By running the sample program, UnicodeHello.java, presented in the previous section, I got a text file saved in UTF-16BE encoding, hello.utf-16be. The next question is how I can view and inspect this UTF-16BE encoded file. Normal text editors will not be able to show its content correctly.

I have two choices: use a Hex editor to open the file, or convert the file to a Hex value file with a program.

I decided to write a simple Java program to convert UTF-16BE byte sequences into hexadecimal digits, so that I can inspect the code values of the saved characters. Remember that, for characters in the U+0000 to U+FFFF range such as those used here, UTF-16BE encoding breaks each code value into two bytes directly without any changes in the value. Here is a program to convert any data file into hexadecimal digits:

/**
 * HexWriter.java
 * Copyright (c) 2009, HerongYang.com, All Rights Reserved.
 * This program allows you to convert any data file to a new data file
 * in Hex format with 16 bytes (32 Hex digits) per line.
 */
import java.io.*;
class HexWriter {
   static char hexDigit[] = {'0', '1', '2', '3', '4', '5', '6', '7',
                             '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'};
   public static void main(String[] a) {
      String inFile = a[0];
      String outFile = a[1];
      int bufSize = 16;
      byte[] buffer = new byte[bufSize];
      String crlf = System.getProperty("line.separator");
      try {
         FileInputStream in = new FileInputStream(inFile);
         OutputStreamWriter out = new OutputStreamWriter(
            new FileOutputStream(outFile));
         int n = in.read(buffer,0,bufSize);
         String s = null;
         int count = 0;
         while (n!=-1) {
            count += n;
            s = bytesToHex(buffer,0,n);
            out.write(s);
            out.write(crlf);
            n = in.read(buffer,0,bufSize);
         }
         in.close();
         out.close();
         System.out.println("Number of input bytes: "+count);
      } catch (IOException e) {
         System.out.println(e.toString());
      }
   }
   public static String bytesToHex(byte[] b, int off, int len) {
      StringBuffer buf = new StringBuffer();
      for (int j=0; j<len; j++)
         buf.append(byteToHex(b[off+j]));
      return buf.toString();
   }
   public static String byteToHex(byte b) {
      char[] a = { hexDigit[(b >> 4) & 0x0f], hexDigit[b & 0x0f] };
      return new String(a);
   }
}
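
To illustrate the earlier remark that UTF-16BE stores a code value directly as two bytes, here is a minimal sketch. The class name Utf16BeBytes and the method toHex are made up for this illustration and are not part of HexWriter.java. It only assumes the standard java.nio.charset.StandardCharsets class, and it only covers characters in the U+0000 to U+FFFF range.

```java
import java.nio.charset.StandardCharsets;

class Utf16BeBytes {
   // Encode the text in UTF-16BE and return the bytes as upper-case Hex digits.
   static String toHex(String text) {
      byte[] bytes = text.getBytes(StandardCharsets.UTF_16BE);
      StringBuilder buf = new StringBuilder();
      for (byte b : bytes) {
         // Mask with 0xff so that bytes above 0x7F are not shown as negative.
         buf.append(String.format("%02X", b & 0xff));
      }
      return buf.toString();
   }
   public static void main(String[] a) {
      System.out.println(toHex("H"));      // code value 0x0048, prints 0048
      System.out.println(toHex("\u7535")); // code value 0x7535, prints 7535
   }
}
```

The Hex digits of the two bytes are the same as the Hex digits of the code value, which is why a plain Hex dump is enough to inspect a UTF-16BE file.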

Compile HexWriter.java and run it to convert hello.utf-16be:

C:\herong>javac HexWriter.java

C:\herong>java HexWriter hello.utf-16be hello.hex

Okay, here is the content of hello.hex:

00480065006C006C006F00200063006F
006D0070007500740065007200210020
002D00200045006E0067006C00690073
0068000D000A753581114F60597DFF01
0020002D002000530069006D0070006C
00690066006900650064002000430068
0069006E006500730065000D000A96FB
81664F60597DFE570020002D00200054
007200610064006900740069006F006E
0061006C0020004300680069006E0065
00730065000D000A

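If you know how to read Hex numbers, you should be able to see three lines of text, with every 4 Hex digits representing the code value of one character:

Hello computer! - English
电脑你好！ - Simplified Chinese
電腦你好﹗ - Traditional Chinese

Remember to use the line break sequence 000D000A (\r\n) to help find the first character of each line.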
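
The reading of the Hex values can also be checked with a program. Below is a minimal sketch that turns Hex digits back into bytes and decodes them as UTF-16BE. The class name HexReader and its two methods are hypothetical helpers written for this illustration; they are not part of HexWriter.java. The sketch assumes that the input has an even number of Hex digits and no line breaks in it.

```java
import java.nio.charset.StandardCharsets;

class HexReader {
   // Convert a string of Hex digits (2 digits per byte) back to bytes.
   static byte[] hexToBytes(String hex) {
      byte[] bytes = new byte[hex.length() / 2];
      for (int i = 0; i < bytes.length; i++) {
         String pair = hex.substring(2 * i, 2 * i + 2);
         bytes[i] = (byte) Integer.parseInt(pair, 16);
      }
      return bytes;
   }
   // Interpret the Hex digits as a UTF-16BE byte sequence.
   static String decodeUtf16Be(String hex) {
      return new String(hexToBytes(hex), StandardCharsets.UTF_16BE);
   }
   public static void main(String[] a) {
      // First line of hello.hex (16 bytes), copied from the listing above.
      String line1 = "00480065006C006C006F00200063006F";
      System.out.println(decodeUtf16Be(line1)); // prints: Hello co
   }
}
```

Decoding one line of hello.hex at a time works here because each line holds 16 bytes, an even number, so no 2-byte code value is split across two lines.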

Last update: 2009.
