HTERM: Difference between revisions
No edit summary |
|||
Line 16: | Line 16: | ||
HTERM creates such a correlation. HTERM manages converting between UTF-8, an in-memory Unicode representation, and the Model T's displayable character set. | HTERM creates such a correlation. HTERM manages converting between UTF-8, an in-memory Unicode representation, and the Model T's displayable character set. | ||
It turns out that most of the Model 100's character set maps to the 16-bit Unicode range. Given that, and the fact that the Model 100 can deal directly with 16-bit values, HTERM only supports mapping of Unicode characters in the 16-bit range. If it receives a character outside of that range, it will treat it as an unmapped character. |
Revision as of 15:31, 30 July 2011
Purpose
HTERM is a "dumb terminal" program which implements hardware flow control and rudimentary handling of UTF-8 encoding and ANSI escapes.
Source Code
Git repo: http://bitchin100.com/pub-git/hterm.git/
Unicode Support
HTERM is primarily designed to interface with Linux systems, either directly connected or through a serial to network bridge. Login consoles on Linux generally use ANSI escapes and UTF-8 encoding.
UTF-8 is a Unicode "encoding." It is a way of representing Unicode characters in an 8-bit byte format. Unicode itself is an abstraction... characters in Unicode have a range of 32-bits, with no defined wire or in-memory structure. UTF-8 is one such format.
The Model T laptops also have a 8-bit character code. For characters in the range 0-127, the ASCII range, the characters are the same in UTF-8. For characters above 127, there is no defined correlation between Unicode and the Model T.
HTERM creates such a correlation. HTERM manages converting between UTF-8, an in-memory Unicode representation, and the Model T's displayable character set.
It turns out that most of the Model 100's character set maps to the 16-bit Unicode range. Given that, and the fact that the Model 100 can deal directly with 16-bit values, HTERM only supports mapping of Unicode characters in the 16-bit range. If it receives a character outside of that range, it will treat it as an unmapped character.