Before the 20th century, variation in the shape of characters was ubiquitous, a dynamic which continued after the invention of woodblock printing. For example, prior to the Qin dynasty (221–206 BC) the character meaning 'bright' was written as either 明 or 朙, with either 日 'Sun' or 囧 'window' on the left and the 月 'Moon' component on the right. Non-orthodox forms are known as folk variants (俗字; súzì; Revised Romanization: sokja; Hepburn: zokuji). These forms differ in their phonetic component: the folk variant uses a character with a "close enough" pronunciation that has far fewer strokes and is thus quicker to write.

In Han unification, some variants that are nearly identical across Chinese-, Japanese-, and Korean-speaking regions are encoded at the same code point and can only be distinguished by using different typefaces. The Unicode standard also allows these variants to be encoded as variation sequences,[5] formed by appending a variation selector (a glyph-less non-spacing mark) to the standard CJK unified ideograph. This mechanism works directly in plain text, without requiring a rich text format to select the appropriate language or script, and it allows easier and more selective control when the same language/script combination needs several variants.

The list of valid variation sequences is standardized by Unicode and defined in the Ideographic Variation Database (IVD),[6][7] part of the Unicode Character Database (UCD).[8] The list can be extended without encoding new code points in the UCS. Since the Unicode versions in which variation selectors were encoded and the IVD was established, it has no longer been necessary to encode new compatibility ideographs in order to render such variants; the two blocks CJK Compatibility Ideographs in the BMP and CJK Compatibility Ideographs Supplement in the SIP have been frozen since Unicode 4.1, except for fixes to a few past mistakes that were overlooked during the review of normative sources in the Han unification process.
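As a rough illustration of the variation-sequence mechanism described above, the following Python sketch builds a sequence by appending an ideographic variation selector to a base ideograph. The helper names `make_variation_sequence` and `strip_variation_selectors` are hypothetical, and the sketch does not consult the IVD: whether a given base-plus-selector pair is actually registered there is not checked, so the pair used below is purely illustrative. The only facts relied on are that the ideographic variation selectors VS17 to VS256 occupy U+E0100 to U+E01EF and that they carry the non-spacing mark category.

```python
import unicodedata

# Ideographic variation selectors VS17..VS256 occupy U+E0100..U+E01EF.
IVS_FIRST, IVS_LAST = 0xE0100, 0xE01EF


def make_variation_sequence(base: str, selector_index: int) -> str:
    """Append the selector at U+E0100 + selector_index to one base character.

    Hypothetical helper: it does NOT verify that the resulting pair is
    registered in the Ideographic Variation Database.
    """
    if len(base) != 1:
        raise ValueError("base must be a single code point")
    code_point = IVS_FIRST + selector_index
    if not IVS_FIRST <= code_point <= IVS_LAST:
        raise ValueError("selector_index must be in the range 0..239")
    return base + chr(code_point)


def strip_variation_selectors(text: str) -> str:
    """Drop ideographic variation selectors, leaving the unified ideographs."""
    return "".join(ch for ch in text if not IVS_FIRST <= ord(ch) <= IVS_LAST)


# Illustrative pair only; registration in the IVD is assumed, not checked.
seq = make_variation_sequence("劍", 0)

# The sequence is plain text: two code points, no markup or font switch.
print([f"U+{ord(ch):04X}" for ch in seq])

# The selector is a non-spacing mark (general category "Mn").
print(unicodedata.category(seq[1]))

# Removing the selector recovers the base ideograph.
print(strip_variation_selectors(seq) == "劍")
```

Because the selector is an ordinary code point in the string, software that needs to compare or search text regardless of variant can strip it first, as `strip_variation_selectors` does; a renderer without a matching glyph would typically fall back to the base ideograph's default form.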
Twelve variants of the character 劍 jiàn 'sword' that vary both in which components are used and in which specific allographs are used for those components. On the left side, 僉, 㑒 and 佥 qiān are allographs of the same phonetic component. On the right side, 刂 'KNIFE', 釒 'GOLD', and 刃 'blade edge' are each distinct signific components used by the different variants. 刄 is an allograph of 刃.