Entering Non-ASCII Characters as Static Text

This section provides a tutorial example to test how non-ASCII characters entered as static text in JSP pages are converted by the JSP server and returned to Web browsers.

Entering non-ASCII characters as static HTML text is much harder than I initially thought. There are many factors that should be considered.

To test how to control those factors, I picked two Simplified Chinese characters and entered them in 7 different formats as a simple HTML paragraph:

GB2312-binary: 쮵쏷=(0xCBB5C3F7)<br/>      
GB2312-#xHEX: &#xCBB5;&#xC3F7;<br/>
GB2312-\uHEX: \uCBB5\uC3F7<br/>
Unicode-binary: 说明=(0x8bf4660e)<br/>
Unicode-#xHEX: &#x8bf4;&#x660e;<br/>
Unicode-\uHEX: \u8bf4\u660e<br/>
Unicode-UTF8: 说明=(0xE8AFB4E6988E)<br/>

Hex numbers are provided next to the binary codes in case you have trouble copying this file to your local system.
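If you want to double-check those hex numbers yourself, a small Java program can print the byte sequences of the two characters in each encoding. This is only a sketch: the class name EncodingHex and the method hex() are made up for illustration, and it assumes your Java runtime includes the GB2312 charset, which is not guaranteed on every installation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original tutorial files.
public class EncodingHex {

    // Encodes the text in the given charset and returns the bytes as uppercase hex.
    public static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u escapes keep this source file pure ASCII, so the result
        // does not depend on the encoding used by the compiler.
        String text = "\u8bf4\u660e";

        System.out.println("Unicode (UTF-16BE): " + hex(text, StandardCharsets.UTF_16BE));
        System.out.println("Unicode (UTF-8):    " + hex(text, StandardCharsets.UTF_8));

        // Assumption: GB2312 is an optional charset, so check before using it.
        if (Charset.isSupported("GB2312")) {
            System.out.println("GB2312:             " + hex(text, Charset.forName("GB2312")));
        } else {
            System.out.println("GB2312 charset is not available on this runtime.");
        }
    }
}
```

The UTF-16BE and UTF-8 lines should match the hex numbers shown in the Unicode-binary and Unicode-UTF8 lines of the paragraph. If GB2312 is available, its line can be compared against the GB2312-binary line in the same way.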

In the next 3 sections, I will put this paragraph into a regular HTML file, a JSP page with standard syntax, and a JSP page with XML syntax to see how the Tomcat server converts them into Java class files and in which encodings.

Last update: 2012.

