
ASCII 7-bits, BS to overstrike

This charset is available in recode under the name ascii-bs.

The file is straight ASCII, seven bits only. According to the definition of ASCII, a diacritic is applied by a sequence of three characters: the letter, one BS, the diacritic mark. We deviate slightly from this by exchanging the diacritic mark and the letter so that, on a screen device, the diacritic disappears and leaves the letter alone. At recognition time, both orders are accepted.
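The overstrike convention can be sketched in a few lines of Python. This is only an illustration, not recode's implementation: the table OVERSTRIKE and the functions encode and decode are hypothetical names, and apart from the e with an acute accent (confirmed by the example output in this section) the choice of diacritic marks is an assumption.

```python
BS = "\b"

# Hypothetical, deliberately tiny table: Latin-1 letter -> (mark, base letter).
# Only the e-acute entry is taken from this section; the rest are assumptions.
OVERSTRIKE = {
    "\u00e9": ("'", "e"),  # e with acute accent
    "\u00e8": ("`", "e"),  # e with grave accent (assumed mark)
    "\u00ea": ("^", "e"),  # e with circumflex (assumed mark)
}

# Recognition accepts both orders: mark BS letter, and letter BS mark.
DECODE = {}
for char, (mark, letter) in OVERSTRIKE.items():
    DECODE[mark + BS + letter] = char
    DECODE[letter + BS + mark] = char


def encode(text):
    """Write each known accented letter as mark, BS, letter."""
    out = []
    for ch in text:
        if ch in OVERSTRIKE:
            mark, letter = OVERSTRIKE[ch]
            out.append(mark + BS + letter)
        else:
            out.append(ch)
    return "".join(out)


def decode(text):
    """Turn three-character overstrike sequences back into accented letters."""
    out = []
    i = 0
    while i < len(text):
        triple = text[i:i + 3]
        if triple in DECODE:
            out.append(DECODE[triple])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this sketch, encode always emits the mark first, so that a screen device ends up showing the bare letter, while decode treats "'\be" and "e\b'" alike.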

The French quotes are coded by the sequences < BS " or " BS < for the opening quote and > BS " or " BS > for the closing quote. This artificial convention was inherited in straight ascii-bs from habits around bangbang entry, and is not well known. But we decided to stick to it so that the ascii-bs charset will not lose French quotes.
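As a rough illustration of the four quote sequences, the following Python sketch maps each of them to the corresponding Latin-1 quote. The names QUOTE_SEQUENCES and decode_quotes are hypothetical, and the plain string replacement used here is a simplification that ignores any interaction with other overstrike sequences.

```python
BS = "\b"
OPEN_QUOTE = "\u00ab"   # opening French quote
CLOSE_QUOTE = "\u00bb"  # closing French quote

# Both orders are recognized for each quote, as described in the text.
QUOTE_SEQUENCES = {
    "<" + BS + '"': OPEN_QUOTE,
    '"' + BS + "<": OPEN_QUOTE,
    ">" + BS + '"': CLOSE_QUOTE,
    '"' + BS + ">": CLOSE_QUOTE,
}


def decode_quotes(text):
    """Replace every French quote overstrike sequence with the quote itself."""
    for sequence, quote in QUOTE_SEQUENCES.items():
        text = text.replace(sequence, quote)
    return text
```

For instance, decode_quotes('<\b"oui>\b"') and decode_quotes('"\b<oui"\b>') both yield the word oui enclosed in French quotes.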

The ascii-bs charset is independent of ascii, and different. The following examples demonstrate this, knowing in advance that `!2' is the bangbang way of representing an e with an acute accent. Compare:

% echo \!2 | recode -v bang:ascii | od -bc
bangbang -> iso-8859-1-1987 -> rfc1345 -> ansi-x3.4-1968  (many to one)
bangbang -> iso-8859-1-1987 -> ansi-x3.4-1968  (many to one)
0000000 351 012
	351  \n
0000002

with:

% echo \!2 | recode -v bang:ascii-bs | od -bc
bangbang -> iso-8859-1-1987 -> ascii-bs  (many to many)
0000000 047 010 145 012
	  '  \b   e  \n
0000004

In the first case, the e with an acute accent is merely passed through by the latin1:ascii mapping, which has no special recoding rule for it. In the latin1:ascii-bs case, the acute accent is applied over the e with a backspace: diacriticized characters have special rules. For the ascii-bs charset, reversibility is still possible, but there might be difficult cases.
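The octal values in the two dumps can be checked without recode. The Python sketch below only reproduces the byte values; octal_dump is a hypothetical helper, not an od replacement, and the two byte strings are written by hand from the outputs shown in this section.

```python
def octal_dump(data):
    """Format bytes as three-digit octal numbers, like the first line of od -b."""
    return " ".join(f"{byte:03o}" for byte in data)


# First case: the Latin-1 code for e with acute accent passes through unchanged.
passed_through = "\u00e9\n".encode("latin-1")

# Second case: acute mark, BS, then the letter e, all within seven bits.
overstruck = "'\be\n".encode("ascii")

print(octal_dump(passed_through))  # 351 012
print(octal_dump(overstruck))      # 047 010 145 012
```

The first result contains a byte above octal 177, so it is not seven-bit clean, whereas every byte of the second result fits in seven bits.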
