Hangul in Unicode

Small introduction to Unicode
Unicode is an international standard that maps binary codes to graphical character representations. In its most basic explanation: Unicode maps a single number to a single character, for every language in the world. This page, however, looks specifically at how Korean is integrated into the Unicode specification. The implementation of Hangul in Unicode is quite unusual and, in my opinion, elegantly done.

To follow along with the upcoming paragraphs, I will briefly explain the Unicode character notation. A specific number in Unicode is referenced in hexadecimal, usually formatted like this: U+AC00.

  • U indicates that it is for Unicode.
  • +{Hexadecimal value} indicates the number you want the graphical representation of.
  • A position in the Unicode specification is referred to as a “code point”.

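The notation maps directly onto built-in functions in most programming languages. As a minimal sketch in Python (the helper names `to_notation` and `from_notation` are my own, not part of any standard), converting between a character and its U+ notation could look like this:

```python
def to_notation(char: str) -> str:
    """Format a single character as U+XXXX (at least four hex digits)."""
    return f"U+{ord(char):04X}"


def from_notation(notation: str) -> str:
    """Turn a string such as 'U+AC00' back into the character it names."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"not in U+XXXX form: {notation!r}")
    return chr(int(notation[2:], 16))


print(to_notation("가"))        # U+AC00
print(from_notation("U+AC00"))  # 가
```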

Hangul in Unicode
Good, with all the necessary Unicode details out of the way, let’s start with the real deal. Hangul letters are spread over several separate parts of the Unicode specification.

There are two ways of representing Hangul in Unicode. There are unique code points for syllable blocks and unique code points for all individual jamo.

The individual Hangul Jamo (U+1100–U+11FF) are in turn (partially) duplicated, for compatibility reasons, in another range named Hangul Compatibility Jamo (U+3130–U+318F). It is important to note that the Hangul Compatibility Jamo set contains no duplicate jamo (a consonant gets a single code point whether it is used as an initial or as a final), and that its jamo are not in “가나다 순” (ganada order).
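To make the difference between the two ranges concrete, here is a small sketch using Python's standard `unicodedata` module. It looks up the official character names of the consonant ㄱ in both ranges and then applies NFKC normalization, which maps a compatibility jamo onto a code point in the Hangul Jamo range. The helper names `describe` and `to_hangul_jamo` are my own.

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for an assigned code point."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


def to_hangul_jamo(compat_jamo: str) -> str:
    """Map a compatibility jamo to a Hangul Jamo code point via NFKC."""
    return unicodedata.normalize("NFKC", compat_jamo)


# In Hangul Jamo the same consonant exists as an initial and as a final;
# in Hangul Compatibility Jamo it exists only once.
for cp in (0x1100, 0x11A8, 0x3131):
    print(describe(cp))

print(f"U+{ord(to_hangul_jamo(chr(0x3131))):04X}")  # U+1100
```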

Hangul Syllables are formed from 3 components, or (modern) jamo. They are called modern jamo because the Unicode specification includes quite a few Hangul Jamo that, as far as I know, have never been used or have fallen out of use. All the modern jamo are shown with a green background in the tables below.

  • Initial Jamo (19 modern jamo)
  • Medial Jamo (21 modern jamo)
  • Final Jamo (27 modern jamo, 28 if you include no final jamo)

This means that 67 modern jamo (68 if you count “no final jamo”) are used to compose the unique Hangul Syllables.
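The modern jamo sit in three contiguous runs inside the Hangul Jamo range, so the counts above can be checked with a few lines of Python. This is only a sketch; the list names are my own.

```python
# Modern jamo inside the Hangul Jamo range (U+1100–U+11FF).
INITIALS = [chr(cp) for cp in range(0x1100, 0x1113)]  # U+1100..U+1112
MEDIALS = [chr(cp) for cp in range(0x1161, 0x1176)]   # U+1161..U+1175
FINALS = [chr(cp) for cp in range(0x11A8, 0x11C3)]    # U+11A8..U+11C2

print(len(INITIALS), len(MEDIALS), len(FINALS))       # 19 21 27
print(len(INITIALS) + len(MEDIALS) + len(FINALS))     # 67
```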

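Multiplying the options for each position (counting “no final jamo” as the 28th final option) gives:

19 * 21 * 28 = 11,172

That’s right, there are 11,172 unique Hangul Syllables. All these syllables are stored in the Hangul Syllables (U+AC00–U+D7A3) range.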
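Because the syllables are laid out in a regular grid, a syllable's code point can be computed from the positions of its jamo, and the other way around. The sketch below follows the arithmetic described in the Unicode Standard for Hangul syllables, with the final slot counted as 28 options (27 final jamo plus “no final”). The function and constant names are my own, and the indices are zero-based positions in the modern jamo lists.

```python
S_BASE = 0xAC00                           # first Hangul syllable, 가
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28    # initials, medials, finals (+ none)
S_COUNT = L_COUNT * V_COUNT * T_COUNT     # 11172 syllables in total


def compose(initial: int, medial: int, final: int = 0) -> str:
    """Build a syllable from zero-based jamo indices (final 0 = no final)."""
    if not (0 <= initial < L_COUNT and 0 <= medial < V_COUNT
            and 0 <= final < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (initial * V_COUNT + medial) * T_COUNT + final)


def decompose(syllable: str) -> tuple[int, int, int]:
    """Split a syllable into its (initial, medial, final) indices."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < S_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, V_COUNT * T_COUNT)
    medial, final = divmod(rest, T_COUNT)
    return initial, medial, final


print(compose(18, 0, 4))   # 한 (ㅎ + ㅏ + ㄴ)
print(decompose("글"))      # (0, 18, 8)
```

The last possible combination, `compose(18, 20, 27)`, lands exactly on U+D7A3, the end of the Hangul Syllables range.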

Hangul Jamo
U+1100-U+11FF  (256 code points, 256 assigned)

[Code chart: rows U+110x to U+11Fx, columns 0 to F, modern jamo highlighted in green]

Hangul Compatibility Jamo
U+3130-U+318F (96 code points, 94 assigned)

[Code chart: rows U+313x to U+318x, columns 0 to F]
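Charts like the two above can be regenerated locally. The following is a rough sketch using Python's standard `unicodedata` module; the helper names `code_chart` and `assigned_count` are my own, and the assigned counts depend on the Unicode version shipped with your Python, so they should match the numbers above only on a reasonably recent interpreter. Whether the glyphs actually render depends on the fonts installed.

```python
import unicodedata


def is_assigned(code_point: int) -> bool:
    """A code point counts as assigned unless its category is 'Cn'."""
    return unicodedata.category(chr(code_point)) != "Cn"


def assigned_count(first: int, last: int) -> int:
    """Count the assigned code points in the inclusive range."""
    return sum(is_assigned(cp) for cp in range(first, last + 1))


def code_chart(first: int, last: int) -> str:
    """Render rows of 16 code points; unassigned cells are shown as '·'."""
    lines = ["       " + " ".join(f"{col:X}" for col in range(16))]
    for row_start in range(first, last + 1, 16):
        cells = [chr(cp) if is_assigned(cp) else "·"
                 for cp in range(row_start, row_start + 16)]
        lines.append(f"U+{row_start >> 4:03X}x " + " ".join(cells))
    return "\n".join(lines)


print(code_chart(0x1100, 0x11FF))   # Hangul Jamo
print(code_chart(0x3130, 0x318F))   # Hangul Compatibility Jamo
print(assigned_count(0x3130, 0x318F))
```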