Composing syllables in Unicode

As explained in the introduction of Hangul in Unicode, a proper Hangul syllable in Unicode consists of 3 modern jamo: an initial jamo, a medial jamo and a final jamo. Each jamo has its own value. Put into an equation, these values produce a number that refers to a specific code point in the Hangul Syllables (U+AC00-U+D7A3) range.

Jamo list

  • Initial Jamo (19 modern jamo)
  • Medial Jamo (21 modern jamo)
  • Final Jamo (27 modern jamo, 28 if you include no final jamo)
Initial Jamo
0. ㄱ    4. ㄸ    8. ㅃ    12. ㅈ    16. ㅌ
1. ㄲ    5. ㄹ    9. ㅅ    13. ㅉ    17. ㅍ
2. ㄴ    6. ㅁ    10. ㅆ   14. ㅊ    18. ㅎ
3. ㄷ    7. ㅂ    11. ㅇ   15. ㅋ

Medial Jamo
0. ㅏ    7. ㅖ     14. ㅝ
1. ㅐ    8. ㅗ     15. ㅞ
2. ㅑ    9. ㅘ     16. ㅟ
3. ㅒ    10. ㅙ    17. ㅠ
4. ㅓ    11. ㅚ    18. ㅡ
5. ㅔ    12. ㅛ    19. ㅢ
6. ㅕ    13. ㅜ    20. ㅣ

Final Jamo
0. None  7. ㄷ     14. ㄿ    21. ㅇ
1. ㄱ    8. ㄹ     15. ㅀ    22. ㅈ
2. ㄲ    9. ㄺ     16. ㅁ    23. ㅊ
3. ㄳ    10. ㄻ    17. ㅂ    24. ㅋ
4. ㄴ    11. ㄼ    18. ㅄ    25. ㅌ
5. ㄵ    12. ㄽ    19. ㅅ    26. ㅍ
6. ㄶ    13. ㄾ    20. ㅆ    27. ㅎ

With all of the tables out of the way, you may have noticed a pattern. These are all the modern jamo from the Hangul Jamo (U+1100–U+11FF) range, in order, with some “empty” characters in between.
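To make the pattern concrete, here is a minimal Python sketch that maps the table values back to code points in the Hangul Jamo range. The base values U+1100, U+1161 and U+11A7 are the composition constants from the Unicode Standard; the function name `jamo_for` is made up for this illustration.

```python
import unicodedata

# Conjoining jamo base code points (LBase, VBase, TBase in the Unicode Standard).
# T_BASE sits one before the first final jamo because value 0 means "no final".
L_BASE, V_BASE, T_BASE = 0x1100, 0x1161, 0x11A7


def jamo_for(initial: int, medial: int, final: int = 0) -> str:
    """Return the conjoining jamo sequence for the three table values."""
    if not (0 <= initial < 19 and 0 <= medial < 21 and 0 <= final < 28):
        raise ValueError("jamo value out of range")
    jamo = chr(L_BASE + initial) + chr(V_BASE + medial)
    if final:  # value 0 means the syllable has no final jamo
        jamo += chr(T_BASE + final)
    return jamo


sequence = jamo_for(5, 20, 4)
print([f"U+{ord(c):04X}" for c in sequence])  # ['U+1105', 'U+1175', 'U+11AB']

# NFC normalization combines the jamo sequence into a single syllable character.
print(unicodedata.normalize("NFC", sequence))
```

The three runs of modern jamo are contiguous, so a plain offset from each base is enough.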

The equation(s)
All these values are great, but how do you convert them into a proper Hangul syllable? Up until recently I thought there was a single equation for this. It turns out there are several, which all come down to the same thing, just refactored a little.

  • ((initial * 588) + (medial * 28) + final) + 44032 (Source)
  • (((initial * 21 + medial) * 28) + final) + 44032 (Source: Line number 305-308)
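Both forms can be sketched in Python to check that they really are the same calculation. The function names `compose` and `compose_nested` are my own and do not come from either source.

```python
SYLLABLE_BASE = 0xAC00  # 44032, first code point of the Hangul Syllables range


def compose(initial: int, medial: int, final: int = 0) -> str:
    """First form: ((initial * 588) + (medial * 28) + final) + 44032."""
    return chr(initial * 588 + medial * 28 + final + SYLLABLE_BASE)


def compose_nested(initial: int, medial: int, final: int = 0) -> str:
    """Second form: (((initial * 21 + medial) * 28) + final) + 44032."""
    return chr((initial * 21 + medial) * 28 + final + SYLLABLE_BASE)


# Both forms agree for every one of the 19 * 21 * 28 = 11172 combinations.
assert all(
    compose(i, m, f) == compose_nested(i, m, f)
    for i in range(19)
    for m in range(21)
    for f in range(28)
)

print(hex(ord(compose(5, 20, 4))))  # 0xb9b0
```

Neither function validates its input; values outside the jamo tables would silently produce code points outside the intended syllable.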

Examples:

Initial        Medial         Final          Complete
Value  Jamo    Value  Jamo    Value  Jamo    Number  Hexadecimal  Syllable
5      ㄹ      20     ㅣ      4      ㄴ      47536   B9B0         린
8      ㅃ      4      ㅓ      26     ㅍ      48874   BEEA         뻪
0      ㄱ      2      ㅑ      18     ㅄ      44106   AC4A         걊
15     ㅋ      14     ㅝ      8      ㄹ      53252   D004         퀄
11     ㅇ      7      ㅖ      13     ㄾ      50709   C615         옕

Explanation of the equation:
So how is the equation built up? Simple! If you look at the Hangul Syllables (U+AC00-U+D7A3) range again, it becomes easier to understand.

Step 1: The addition of 44032, where does it come from?

AC00 in hexadecimal equals 44032 in decimal. Adding it puts us right at the beginning of the Hangul Syllables range.

Step 2: The addition of the final jamo value.

If you look at the Hangul Syllables range you will notice that the final jamo is the least significant jamo. It changes with every code point: no two consecutive Hangul syllables share the same final jamo.
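A quick way to see this is to walk over consecutive code points and read off the final jamo value. This is only a sketch; `final_value` is a hypothetical helper that takes the remainder after dividing the offset from U+AC00 by 28.

```python
SYLLABLE_BASE = 0xAC00  # start of the Hangul Syllables range


def final_value(code_point: int) -> int:
    """Final jamo value: the offset from U+AC00 modulo 28."""
    return (code_point - SYLLABLE_BASE) % 28


# Look at the first 30 code points of the range.
finals = [final_value(cp) for cp in range(SYLLABLE_BASE, SYLLABLE_BASE + 30)]
print(finals)  # 0 through 27, then 0 and 1 again

# No two neighbouring syllables have the same final jamo value.
assert all(a != b for a, b in zip(finals, finals[1:]))
```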

Step 3: The addition of the medial jamo value.

The medial jamo is the second least significant. A single medial jamo is used in 28 consecutive syllables.

Why 28?
Because that is the number of final jamo values (27 jamo plus “no final jamo”). After all of those have been used, you would end up with duplicate syllables if the value of the medial jamo did not go up. That means that if you want to keep the same initial and final jamo, you need to move in steps of 28.

Examples:

AC00 => 44032 (가)
Increment  Hexadecimal  Number  Hangul Syllable
+ 0        AC00         44032   가
+ 28       AC1C         44060   개
+ 56       AC38         44088   갸

B607 => 46599 (똇)
Increment  Hexadecimal  Number  Hangul Syllable
+ 0        B607         46599   똇
+ 28       B623         46627   똣
+ 56       B63F         46655   똿
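The steps of 28 can be checked with Python's standard `unicodedata` module, whose NFD normalization splits a syllable back into its conjoining jamo. A small sketch, with `jamo_of` as a made-up helper name:

```python
import unicodedata


def jamo_of(code_point: int) -> list:
    """Code points of the conjoining jamo that make up one syllable."""
    decomposed = unicodedata.normalize("NFD", chr(code_point))
    return [f"U+{ord(j):04X}" for j in decomposed]


# Start at U+B607 and move in steps of 28.
rows = [(step, jamo_of(0xB607 + step)) for step in (0, 28, 56)]
for step, jamo in rows:
    print(f"+{step}", jamo)
# +0 ['U+1104', 'U+1168', 'U+11BA']
# +28 ['U+1104', 'U+1169', 'U+11BA']
# +56 ['U+1104', 'U+116A', 'U+11BA']
```

Only the middle code point, the medial jamo, moves; the initial and final jamo stay put.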

Step 4: The addition of the initial jamo value.

The initial jamo is the most significant. A single initial jamo is used in 588 consecutive syllables and never again afterwards in the entire range.

Why 588?
Because that is the number of medial jamo times the number of final jamo values: 21 * 28 = 588. That means you would again get duplicate syllables if the initial jamo value did not go up after 588 consecutive syllables. Like in step 3, if you want to keep the same medial and final jamo, you need to move in steps of 588.

Examples:

AC00 => 44032 (가)
Increment  Hexadecimal  Number  Hangul Syllable
+ 0        AC00         44032   가
+ 588      AE4C         44620   까
+ 1176     B098         45208   나

B607 => 46599 (똇)
Increment  Hexadecimal  Number  Hangul Syllable
+ 0        B607         46599   똇
+ 588      B853         47187   롓
+ 1176     BA9F         47775   몟
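Because the three jamo values behave like digits of decreasing significance, the equation can also be run backwards with two integer divisions. The sketch below is my own illustration; `decompose` is a hypothetical name, not something taken from the sources above.

```python
SYLLABLE_BASE = 0xAC00  # 44032
SYLLABLE_COUNT = 19 * 21 * 28  # 11172 syllables in the range


def decompose(syllable: str) -> tuple:
    """Return the (initial, medial, final) jamo values of one syllable."""
    offset = ord(syllable) - SYLLABLE_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError("not a character from the Hangul Syllables range")
    initial, rest = divmod(offset, 588)  # 588 = 21 * 28
    medial, final = divmod(rest, 28)
    return initial, medial, final


print(decompose(chr(0xB607)))         # (4, 7, 19)
print(decompose(chr(0xB607 + 588)))   # (5, 7, 19)
print(decompose(chr(0xB607 + 1176)))  # (6, 7, 19)
```

Each step of 588 raises only the first value, which matches the tables above.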

Isn’t that great?