ISO/IEC 8859-2
ISO 8859-2, more formally cited as ISO/IEC 8859-2 or less formally as Latin-2, is part 2 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 2, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value.
ISO_8859-2:1987, more commonly known by its preferred MIME name ISO-8859-2 (note the extra hyphen), is the IANA charset name for this standard when it is used together with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) ranges. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. This character set also has the aliases ISO_8859-2, latin2, l2 and csISOLatin2.
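As an illustration of how these names behave in practice, the following minimal sketch asks Python's codec registry to resolve each name mentioned above. Which aliases are accepted is a property of the Python implementation rather than of the IANA registry, and the helper name `resolve` is only an illustrative choice.

```python
import codecs

# Names taken from the paragraph above: the IANA name, the preferred
# MIME name, and the listed aliases.
NAMES = ["ISO_8859-2:1987", "ISO-8859-2", "ISO_8859-2", "latin2", "l2", "csISOLatin2"]


def resolve(name):
    """Return the canonical name Python's codec registry uses for `name`."""
    return codecs.lookup(name).name


# Map every alias to the canonical codec name it resolves to.
canonical = {name: resolve(name) for name in NAMES}

for name, target in canonical.items():
    print(f"{name:16} -> {target}")
```

If the registry treats all of these as the same charset, every line of the output names the same codec.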
This encoding shares many of its assignments with windows-1250 but is not a strict subset of it, because a number of characters are placed at different code values (unlike the case with windows-1252 and ISO 8859-1).
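The relationship with windows-1250 can be checked empirically. The sketch below, which assumes Python's built-in `iso8859_2` and `cp1250` codecs as stand-ins for the two encodings, lists the code values in A0-FF that decode to different characters; the function name `differing_bytes` is hypothetical.

```python
def differing_bytes(enc_a="iso8859_2", enc_b="cp1250"):
    """Return the byte values in 0xA0-0xFF that the two codecs decode differently."""
    return [
        b
        for b in range(0xA0, 0x100)
        if bytes([b]).decode(enc_a) != bytes([b]).decode(enc_b)
    ]


diff = differing_bytes()

# Show each mismatching code value side by side.
for b in diff:
    iso_char = bytes([b]).decode("iso8859_2")
    win_char = bytes([b]).decode("cp1250")
    print(f"0x{b:02X}: ISO 8859-2 {iso_char!r}  windows-1250 {win_char!r}")
```

Text that uses only the code values absent from this list can be exchanged between the two encodings without conversion.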
These code values can be used in almost any data interchange system to communicate in the following European languages: Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper Sorbian and Lower Sorbian. Furthermore, it is suitable for representing some western European languages such as Finnish (with the exception of å, used in Swedish-Finnish names) or German. When used alone, these latter languages nominally use the ISO 8859-1 encoding, but the code points they need are shared with ISO 8859-2, which is an important aspect for multilingual documents.
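A quick way to test whether a given text fits the repertoire is to try encoding it. The following sketch assumes Python's `iso8859_2` codec; the sample phrases and the helper `can_encode` are illustrative choices, not part of the standard.

```python
# Illustrative sample phrases (chosen for this sketch only).
SAMPLES = {
    "Polish": "Zażółć gęślą jaźń",
    "Czech": "Příliš žluťoučký kůň",
    "Hungarian": "Árvíztűrő tükörfúrógép",
    "German": "Grüße aus Köln",
}


def can_encode(text, encoding="iso8859_2"):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for language, phrase in SAMPLES.items():
    print(f"{language:10} {can_encode(phrase)}")

# The letter å is outside the repertoire, as noted for Swedish-Finnish names.
print("Åbo", can_encode("Åbo"))

# German letters occupy the same code values in ISO 8859-1 and ISO 8859-2.
same_bytes = "Grüße".encode("iso8859_2") == "Grüße".encode("iso8859_1")
print("German bytes identical in both parts:", same_bytes)
```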
It may be argued that ISO 8859-2 is not really suitable for Romanian, because it lacks the letters s and t with comma below (ș, ț) and contains s and t with cedilla (ş, ţ) instead. These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should, therefore, have characters with comma below at those code points.
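In software that handles the comma-below letters as separate Unicode characters, Romanian text has to be mapped to the cedilla forms before it can be stored in ISO 8859-2. The sketch below shows one possible approach under that assumption; the names `COMMA_TO_CEDILLA` and `encode_romanian` are hypothetical.

```python
# Map the comma-below letters to the cedilla letters that ISO 8859-2 encodes.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # Ș -> Ş (code value AA)
    "\u0219": "\u015F",  # ș -> ş (code value BA)
    "\u021A": "\u0162",  # Ț -> Ţ (code value DE)
    "\u021B": "\u0163",  # ț -> ţ (code value FE)
})


def encode_romanian(text):
    """Encode Romanian text as ISO 8859-2, substituting cedilla forms."""
    return text.translate(COMMA_TO_CEDILLA).encode("iso8859_2")


# Encoding the comma-below form directly fails, since it has no code value.
try:
    "\u0219".encode("iso8859_2")
    direct_ok = True
except UnicodeEncodeError:
    direct_ok = False

print("direct encoding of s with comma below works:", direct_ok)
print(encode_romanian("\u021Aar\u0103"))  # the word "Țară"
```

The substitution is lossy: after decoding, the text contains the cedilla letters, and rendering them with a comma is left to the font, as described above.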
ISO/IEC 8859-2
 | x0 | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | xA | xB | xC | xD | xE | xF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | unused | | | | | | | | | | | | | | | |
1x | unused | | | | | | | | | | | | | | | |
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | |
8x | unused | | | | | | | | | | | | | | | |
9x | unused | | | | | | | | | | | | | | | |
Ax | NBSP | Ą | ˘ | Ł | ¤ | Ľ | Ś | § | ¨ | Š | Ş | Ť | Ź | SHY | Ž | Ż |
Bx | ° | ą | ˛ | ł | ´ | ľ | ś | ˇ | ¸ | š | ş | ť | ź | ˝ | ž | ż |
Cx | Ŕ | Á | Â | Ă | Ä | Ĺ | Ć | Ç | Č | É | Ę | Ë | Ě | Í | Î | Ď |
Dx | Đ | Ń | Ň | Ó | Ô | Ő | Ö | × | Ř | Ů | Ú | Ű | Ü | Ý | Ţ | ß |
Ex | ŕ | á | â | ă | ä | ĺ | ć | ç | č | é | ę | ë | ě | í | î | ď |
Fx | đ | ń | ň | ó | ô | ő | ö | ÷ | ř | ů | ú | ű | ü | ý | ţ | ˙ |
In the table above, 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. AD is a SOFT HYPHEN, which compliant web browsers should not display except where a line is broken at that position.
Code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-2.
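The ranges described above can be written down as a small classifier. This is only a sketch of the layout as stated in this article (it does not consult any codec), and the function name `classify` and its return labels are hypothetical.

```python
def classify(byte):
    """Classify a code value according to the ISO/IEC 8859-2 layout described above."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("code value must be in the range 0x00-0xFF")
    # 00-1F, 7F and 80-9F are not assigned to characters by the standard.
    if byte <= 0x1F or byte == 0x7F or 0x80 <= byte <= 0x9F:
        return "unassigned"
    if byte == 0x20:
        return "SPACE"
    if byte == 0xA0:
        return "NO-BREAK SPACE"
    if byte == 0xAD:
        return "SOFT HYPHEN"
    return "graphic"


# Count every code value that the standard assigns to a character.
assigned = sum(1 for b in range(256) if classify(b) != "unassigned")
print(assigned)
```

The count of assigned code values is 95 in the range 20-7E plus 96 in the range A0-FF, which agrees with the 191 characters mentioned in the introduction.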
Code page layout
In the following table, characters for code values A0-FF are shown together with their corresponding Unicode code points (in hexadecimal).
 | .0 | .1 | .2 | .3 | .4 | .5 | .6 | .7 | .8 | .9 | .A | .B | .C | .D | .E | .F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
A. | NBSP A0 | Ą 104 | ˘ 2D8 | Ł 141 | ¤ A4 | Ľ 13D | Ś 15A | § A7 | ¨ A8 | Š 160 | Ş 15E | Ť 164 | Ź 179 | SHY AD | Ž 17D | Ż 17B |
B. | ° B0 | ą 105 | ˛ 2DB | ł 142 | ´ B4 | ľ 13E | ś 15B | ˇ 2C7 | ¸ B8 | š 161 | ş 15F | ť 165 | ź 17A | ˝ 2DD | ž 17E | ż 17C |
C. | Ŕ 154 | Á C1 | Â C2 | Ă 102 | Ä C4 | Ĺ 139 | Ć 106 | Ç C7 | Č 10C | É C9 | Ę 118 | Ë CB | Ě 11A | Í CD | Î CE | Ď 10E |
D. | Đ 110 | Ń 143 | Ň 147 | Ó D3 | Ô D4 | Ő 150 | Ö D6 | × D7 | Ř 158 | Ů 16E | Ú DA | Ű 170 | Ü DC | Ý DD | Ţ 162 | ß DF |
E. | ŕ 155 | á E1 | â E2 | ă 103 | ä E4 | ĺ 13A | ć 107 | ç E7 | č 10D | é E9 | ę 119 | ë EB | ě 11B | í ED | î EE | ď 10F |
F. | đ 111 | ń 144 | ň 148 | ó F3 | ô F4 | ő 151 | ö F6 | ÷ F7 | ř 159 | ů 16F | ú FA | ű 171 | ü FC | ý FD | ţ 163 | ˙ 2D9 |
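The code page layout can be regenerated programmatically, which is a convenient cross-check. The sketch below assumes that Python's built-in `iso8859_2` codec follows the standard mapping; the function name `code_page_rows` is hypothetical.

```python
def code_page_rows(encoding="iso8859_2"):
    """Return the decoded characters for code values A0-FF, one row per high nibble."""
    rows = {}
    for high in range(0xA, 0x10):
        label = f"{high:X}."
        # Combine the high and low nibble into one byte and decode it.
        rows[label] = [bytes([(high << 4) | low]).decode(encoding) for low in range(16)]
    return rows


rows = code_page_rows()

# Print each row as hexadecimal Unicode code points, as in the table above.
for label, chars in rows.items():
    print(label, " ".join(f"{ord(c):X}" for c in chars))
```

Code values whose printed code point equals the byte value itself (for example A7, C1 or DF) are those that ISO 8859-2 shares with ISO 8859-1.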