This charset is available in recode under the name ascii-bs.
The file is straight ASCII, seven bits only. According to the definition of ASCII, diacritics are applied by a sequence of three characters: the letter, one BS, the diacritic mark. We deviate slightly from this by exchanging the diacritic mark and the letter so that, on a screen device, the diacritic disappears and leaves the letter alone. At recognition time, both orders are accepted.
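The encoding order described above (mark, BS, letter) can be illustrated with a minimal Python sketch. This is not how recode is implemented. The function name `to_ascii_bs` and the `MARKS` table are hypothetical, and only the pairing of the acute accent with the apostrophe is confirmed by the od output shown in this section; the other entries are assumptions.

```python
import unicodedata

BS = "\b"

# Hypothetical table: Unicode combining mark -> ASCII stand-in.
# Only acute -> apostrophe is confirmed by this section's od output.
MARKS = {
    "\u0301": "'",   # acute accent (confirmed)
    "\u0300": "`",   # grave accent (assumed)
    "\u0302": "^",   # circumflex (assumed)
    "\u0308": '"',   # diaeresis (assumed)
}

def to_ascii_bs(text):
    """Rewrite accented letters as: diacritic mark, BS, base letter."""
    out = []
    for ch in text:
        # NFD splits a precomposed letter into base letter + combining mark.
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[0].isascii() and parts[1] in MARKS:
            out.append(MARKS[parts[1]] + BS + parts[0])
        else:
            # Anything without a rule in MARKS passes through unchanged.
            out.append(ch)
    return "".join(out)
```

Under these assumptions, `to_ascii_bs("\u00e9\n")` yields the four bytes 047 010 145 012 (octal), the same sequence that od reports for the ascii-bs conversion.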
The French quotes are coded by the sequences < BS " or " BS < for the opening quote and > BS " or " BS > for the closing quote. This artificial convention was inherited in straight ascii-bs from habits around bangbang entry, and is not well known. But we decided to stick to it so that the ascii-bs charset will not lose French quotes.
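As an illustration of the four quote sequences listed above, here is a small Python sketch that maps them back to the Latin-1 guillemets. The function name `decode_french_quotes` is hypothetical, and plain string replacement is a simplification that does not try to handle ambiguous input.

```python
BS = "\b"

# The two accepted spellings for each quote, as listed in the text.
OPENING = ("<" + BS + '"', '"' + BS + "<")
CLOSING = (">" + BS + '"', '"' + BS + ">")

def decode_french_quotes(text):
    """Replace ascii-bs quote sequences with the guillemet characters."""
    for seq in OPENING:
        text = text.replace(seq, "\u00ab")  # left-pointing guillemet
    for seq in CLOSING:
        text = text.replace(seq, "\u00bb")  # right-pointing guillemet
    return text
```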
The ascii-bs charset is independent of ascii, and different. The following examples demonstrate this, knowing in advance that `!2' is the bangbang way of representing an e with an acute accent. Compare:
    % echo \!2 | recode -v bang:ascii | od -bc
    bangbang -> iso-8859-1-1987 -> rfc1345 -> ansi-x3.4-1968 (many to one)
    bangbang -> iso-8859-1-1987 -> ansi-x3.4-1968 (many to one)
    0000000 351 012
            351  \n
    0000002
with:
    % echo \!2 | recode -v bang:ascii-bs | od -bc
    bangbang -> iso-8859-1-1987 -> ascii-bs (many to many)
    0000000 047 010 145 012
              '  \b   e  \n
    0000004
In the first case, the e with an acute accent is merely transmitted by the latin1:ascii mapping, which has no special recoding rule for it. In the latin1:ascii-bs case, the acute accent is applied over the e with a backspace: diacriticized characters have special rules. For the ascii-bs charset, reversibility is still possible, but there might be difficult cases.
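To make the reverse direction concrete, the following Python sketch recognizes an overstruck letter in either order (mark, BS, letter or letter, BS, mark), as the charset permits at recognition time. It is only an illustration under stated assumptions: the name `from_ascii_bs` is hypothetical, only the acute accent is handled because it is the only mark this section confirms, and no attempt is made to resolve the difficult cases mentioned above.

```python
import re
import unicodedata

# ASCII stand-in -> Unicode combining mark. Only the acute accent is
# confirmed by this section; further entries would be assumptions.
COMBINING = {"'": "\u0301"}

# Either "mark BS letter" or "letter BS mark"; \x08 is the BS character.
PATTERN = re.compile(r"(['])\x08([A-Za-z])|([A-Za-z])\x08(['])")

def from_ascii_bs(text):
    """Fold overstruck sequences back into precomposed letters."""
    def join(match):
        mark = match.group(1) or match.group(4)
        letter = match.group(2) or match.group(3)
        # NFC composes base letter + combining mark when a
        # precomposed character exists.
        return unicodedata.normalize("NFC", letter + COMBINING[mark])
    return PATTERN.sub(join, text)
```

Because both orders are accepted, an input in which a mark sits between two letters with a BS on each side can be read two ways; this sketch simply takes the leftmost match.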