Reflection and repetitions in the CJK decomposition data

The CJK decompositions should parallel the other Unicode properties wherever possible. Those properties include case folding and mapping, bidirectionality, and decompositions. The CJK decomp 'Modified' config is analogous to Unicode case folding, and the 'Around', 'Down', 'Surround', 'Between', and 'Within' configs are analogous to Unicode decompositions.

Standalone Reflections

The Unicode bidirectional algorithm includes many pairs of characters that are swapped (mirrored) depending on whether the text runs left-to-right or right-to-left, e.g.:

() <> [] {} «» ∈∋ ≔≕ ≤≥ ≦≧ ≮≯ ≸≹ ⊂⊃ ⊏⊐ ⋋⋌ 『』 「」
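
As a rough check, the mirrored status of such characters can be queried from Python's standard `unicodedata` module. This is only a sketch: `unicodedata.mirrored` reports whether a character carries the Bidi_Mirrored property, not which character is its partner, and the helper name `is_mirrored_pair` is our own invention for illustration.

```python
import unicodedata

# A few pairs copied from the list above; each string holds two characters.
pairs = ["()", "<>", "[]", "{}", "«»", "∈∋", "≤≥", "⊂⊃", "『』", "「」"]

def is_mirrored_pair(pair):
    """Return True if both characters carry the Bidi_Mirrored property.

    Hypothetical helper: it does not verify that the two characters are
    actually partners of each other, only that each one is mirrored.
    """
    left, right = pair
    return bool(unicodedata.mirrored(left)) and bool(unicodedata.mirrored(right))

for pair in pairs:
    print(pair, is_mirrored_pair(pair))
```

Ordinary letters such as "a" and "b" are not mirrored, so `is_mirrored_pair("ab")` returns False.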

Because of this, we'll define many CJK tokens as horizontal reflections of one another, e.g.:

㠯𤕪 入人 卐卍 爿片 龴厶 𠂎卩 𠃛𠁣 𠅁亡 𣥄正 𨙨邑 𩰊𩰋

One open issue is which token of each pair is treated as the base and which as its reflection.
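
One way to sidestep the direction question, at least for lookups, is to store each pair symmetrically. The following is a minimal sketch under that assumption; the names `REFLECTION_PAIRS` and `build_reflection_table` are hypothetical and are not part of the CJK decomposition data format.

```python
# Reflection pairs copied from the list above. The order inside a pair is
# NOT meant to say which token is the base form.
REFLECTION_PAIRS = "㠯𤕪 入人 卐卍 爿片 龴厶 𠂎卩 𠃛𠁣 𠅁亡 𣥄正 𨙨邑 𩰊𩰋".split()

def build_reflection_table(pairs):
    """Map every token to its horizontal reflection, in both directions."""
    table = {}
    for pair in pairs:
        first, second = pair  # each pair is expected to be two code points
        table[first] = second
        table[second] = first
    return table

REFLECTIONS = build_reflection_table(REFLECTION_PAIRS)
print(REFLECTIONS["入"])
```

A directed table (base to reflection only) would record more information, but it requires settling the issue above for every pair first.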

Reflection as a part of Repetition Across

Characters can repeat across, e.g.:


Historically, CJK tokens tend to be horizontally symmetrical. We'll use this symmetry as much as possible when defining decompositions, e.g.:


By looking at historical characters, we can extrapolate reflections in modern characters. For example, by considering 𢏽, we can regard 己 as the reflection of the top right of 与.

Rotation and Vertical Reflection

Other modifications similar to horizontal reflection are:

凵𠄟 //reflection vertically
𠄏𠄔 //rotation by 180 degrees

We'll also allow CJK tokens to be repeated in these ways:

𣥗 //repeat reflection down
𣥒 //repeat rotation upwards
𢨋 //repeat rotation downwards

Other Repetitions

Other repetitions available are:

㕕⺀㕛㚐㚣二仌吕圭多岀戔昌炎爻畕𠀘𠃙𡖈𣥕 //downwards
㐂㽓众刕劦厽叒品垚壵姦晶森𠁭𠁼𠄕𠦄𡘙 //triangle pointing up
㗊㠭㵘㸚叕燚茻𠈌𠫬𡮐𣬅 //four square
㴇巛川州𠱠𡥦𢏝𦧵 //three across
串丳出𢇍𤕪𤰶𦦀 //molded downwards
𣓏𥼬𪉓 //triangle pointing down
𠔽闁 //surround top
𡰲𢨳 //surround topleft
彡𡭯 //three down//four as a diamond//full surround
𠥼 //surround bottom right
𢩕 //three surround topleft
𢎧 //down molded
𠾅 //five as an X//four across//molded across
𣥒 //rotate upwards

Sometimes one type of repetition is applied to the result of another:

𡦪𡿭𢌽 //down repetition of three across
𥷹 //square repetition of across repetition//down repetition of down repetition
𠁷 //across repetition of molded down repetition
𣡕 //across repetition of triangle repetition
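
Nested repetitions like these can be modelled as nested (kind, inner) pairs. The sketch below is purely illustrative: the tuple representation, the `COPIES` table, and the function `count_copies` are all assumptions of ours, with the copy counts inferred from examples such as 二 (down), 川 (three across) and 品 (triangle).

```python
# Assumed number of copies produced by each repetition kind.
COPIES = {
    "across": 2,
    "down": 2,
    "three across": 3,
    "triangle": 3,
    "square": 4,
}

def count_copies(config):
    """Count how many copies of the innermost component a config contains.

    A config is either a plain component string, or a (kind, inner) tuple
    where inner is itself a config.
    """
    if isinstance(config, str):
        return 1
    kind, inner = config
    return COPIES[kind] * count_copies(inner)

# "down repetition of three across", with X standing for any component
print(count_copies(("down", ("three across", "X"))))
# a single triangle repetition, as in 品 built from 口
print(count_copies(("triangle", "口")))
```

Under these assumptions, a down repetition of three across contains six copies of its component, and a triangle repetition contains three.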

Last edited Aug 24, 2011 at 3:14 PM by gavingrover, version 1
