It is easy for a programmer to add a new charset to recode. All it requires is writing a few functions kept in a single `.c' file, adjusting `Makefile.in', and remaking recode.
One of the functions should convert from any previous charset to the new one. Any previous charset will do, but try to select one so that you will not lose too much information while converting. The other function should convert from the new charset to any older one. You do not have to select the same old charset as you selected for the previous routine. Once again, select a charset for which you will not lose too much information while converting.
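As a minimal sketch of such a pair of functions, the example below converts between Latin-1 and an invented charset called "toy". The charset, the function names, the `LOSS` marker and the `count_losses` helper are all assumptions made for illustration; none of them comes from recode, and a real step works on a whole file, not on single bytes. Counting losses over sample text is one way to compare candidate older charsets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical charset "toy": identical to ASCII for 0x00-0x7F, plus
   0x80 = e-acute and 0x81 = a-grave.  Invented for this sketch only.  */

#define LOSS '?'		/* emitted when a character has no equivalent */

/* Older charset (Latin-1) to the new charset.  */
unsigned char
latin1_to_toy (unsigned char c)
{
  if (c < 0x80)
    return c;
  switch (c)
    {
    case 0xE9: return 0x80;	/* e-acute */
    case 0xE0: return 0x81;	/* a-grave */
    default:   return LOSS;	/* information is lost here */
    }
}

/* New charset to an older one (Latin-1 again, but it need not be).  */
unsigned char
toy_to_latin1 (unsigned char c)
{
  if (c < 0x80)
    return c;
  switch (c)
    {
    case 0x80: return 0xE9;
    case 0x81: return 0xE0;
    default:   return LOSS;
    }
}

/* Count the bytes of BUF that CONVERT cannot represent.  A byte that is
   already equal to LOSS is not counted as a loss.  */
size_t
count_losses (const unsigned char *buf, size_t len,
	      unsigned char (*convert) (unsigned char))
{
  size_t i, losses = 0;

  assert (buf != NULL && convert != NULL);
  for (i = 0; i < len; i++)
    if (buf[i] != LOSS && convert (buf[i]) == LOSS)
      losses++;
  return losses;
}
```

With this sketch, e-acute and a-grave survive a round trip through "toy", while any other Latin-1 character above 0x7F is reported as a loss.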
If, for either of these two functions, you have to read multiple bytes of the old charset before recognizing the character to produce, you might prefer programming it in flex, in a separate `.l' file.
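To illustrate the kind of lookahead involved, here is a hand-written C sketch (deliberately not flex) of a conversion that must read two bytes before deciding what to produce. The input convention is an assumption invented for this example: a double quote followed by a, o or u stands for the corresponding Latin-1 umlaut. The function name and its buffer-based interface are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/* Convert LEN bytes from IN into OUT, which must have room for LEN
   bytes.  Return the number of bytes written.  Hypothetical convention:
   '"' followed by a, o or u yields the Latin-1 umlaut; every other
   byte is copied unchanged.  */
size_t
digraph_to_latin1 (const unsigned char *in, size_t len, unsigned char *out)
{
  size_t i, n = 0;

  assert (in != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      unsigned char c = in[i];

      if (c == '"' && i + 1 < len)
	{
	  unsigned char replacement = 0;

	  /* Look at the next byte before deciding what to emit.  */
	  switch (in[i + 1])
	    {
	    case 'a': replacement = 0xE4; break;
	    case 'o': replacement = 0xF6; break;
	    case 'u': replacement = 0xFC; break;
	    }
	  if (replacement != 0)
	    {
	      out[n++] = replacement;
	      i++;		/* consume the lookahead byte too */
	      continue;
	    }
	}
      out[n++] = c;		/* no digraph recognized: plain copy */
    }
  return n;
}
```

Even with only three digraphs, the bookkeeping for lookahead and end of input is noticeable; with many multi-byte sequences, pattern rules in a scanner generator tend to be easier to maintain than nested conditions like these.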
Model your C or flex files on one of those that already exist, so as to keep the sources uniform. Besides, at make time, all `.l' files are automatically merged into a single big one by the script `mergelex.awk', which requires sources to follow some rules. Imitation is a simple approach which relieves me of explaining all these rules!
Each of your source files should have its own initialization function, named module_charset, which is meant to be executed quickly, once, prior to any recoding. It should declare the names of your charsets and the single steps (or elementary recodings) you provide, by calling declare_step one or more times. Besides the charset names, declare_step expects a description of the recoding quality (see `recode.h') and two functions you also provide.
The first such function has the purpose of allocating structures, preconditioning conversion tables, etc. It is also the usual way of further modifying the STEP structure. This function is executed only if and when the single step is retained in an actual recoding sequence. If you do not need such delayed initialization, merely use NULL for the function argument.
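The registration and delayed initialization described above might be pictured with the following self-contained sketch. It does not use the real declarations from `recode.h': `SKETCH_STEP`, `sketch_declare_step`, `sketch_find_step` and `sketch_retain_step` are stand-ins invented here, the argument order and the integer quality are assumptions, and `module_toy` merely follows the module_charset naming pattern on the assumption that charset is a placeholder for the charset name. Check `recode.h' and an existing source file for the actual prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the STEP structure; the fields are assumptions.  */
typedef struct sketch_step
{
  const char *before;		/* charset converted from */
  const char *after;		/* charset converted to */
  int quality;			/* placeholder for the quality description */
  void (*init) (struct sketch_step *);	/* delayed initialization, or NULL */
  void (*transform) (struct sketch_step *);	/* elementary recoding */
  int ready;			/* set by the delayed initialization */
} SKETCH_STEP;

#define MAX_STEPS 8
static SKETCH_STEP steps[MAX_STEPS];
static int step_count = 0;
static int init_calls = 0;	/* lets us observe when init really runs */

/* Stand-in for declare_step: only records the step, so it stays quick.  */
SKETCH_STEP *
sketch_declare_step (const char *before, const char *after, int quality,
		     void (*init) (SKETCH_STEP *),
		     void (*transform) (SKETCH_STEP *))
{
  SKETCH_STEP *step;

  if (step_count == MAX_STEPS)
    return NULL;
  step = &steps[step_count++];
  step->before = before;
  step->after = after;
  step->quality = quality;
  step->init = init;
  step->transform = transform;
  step->ready = 0;
  return step;
}

/* Delayed initialization: the place for costly table building.  */
static void
init_toy_to_latin1 (SKETCH_STEP *step)
{
  init_calls++;
  step->ready = 1;
}

/* Placeholder for a function recoding a whole file.  */
static void
recode_nothing (SKETCH_STEP *step)
{
  (void) step;
}

/* Module initialization: declares two single steps and nothing more.  */
void
module_toy (void)
{
  sketch_declare_step ("toy", "latin1", 0, init_toy_to_latin1, recode_nothing);
  sketch_declare_step ("latin1", "toy", 0, NULL, recode_nothing);
}

/* Find a declared step by its two charset names, or return NULL.  */
SKETCH_STEP *
sketch_find_step (const char *before, const char *after)
{
  int i;

  for (i = 0; i < step_count; i++)
    if (strcmp (steps[i].before, before) == 0
	&& strcmp (steps[i].after, after) == 0)
      return &steps[i];
  return NULL;
}

/* What a driver might do once a step is part of the chosen sequence.  */
void
sketch_retain_step (SKETCH_STEP *step)
{
  assert (step != NULL);
  if (step->init != NULL)
    step->init (step);
}
```

In this sketch, calling `module_toy` costs almost nothing; `init_toy_to_latin1` runs only when `sketch_retain_step` is applied to the "toy" to "latin1" step, and the step declared with NULL is retained without any initialization at all.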
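The second function executes the elementary recoding on a whole file. There are a few cases when you can spare writing this function:

* When the step merely copies its input to its output unchanged, use the predefined function file_one_to_one, but have a delayed initialization for presetting the field one_to_one to the predefined value one_to_same.

* When the step is driven by a table which recodes one character to another, use the predefined function file_one_to_one, but have a delayed initialization for presetting the STEP field one_to_one with your table.

* When the step is driven by a table which recodes one character to a string, use the predefined function file_one_to_many, but have a delayed initialization for presetting the STEP field one_to_many with your table.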
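The three table-driven cases above can be sketched as follows. This is an illustration under stated assumptions, not recode's implementation: `TABLE_STEP` is a stand-in for STEP, the table types are guesses (256 bytes for one_to_one, 256 string pointers for one_to_many), `identity` plays the role of one_to_same, and `apply_one_to_one` and `apply_one_to_many` are buffer-based stand-ins for file_one_to_one and file_one_to_many. Treating a NULL string entry as "copy the byte unchanged" is a convention of this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for STEP, reduced to the two table fields.  */
typedef struct
{
  const unsigned char *one_to_one;	/* 256 entries, or NULL */
  const char *const *one_to_many;	/* 256 strings, or NULL */
} TABLE_STEP;

static unsigned char identity[256];	/* plays the role of one_to_same */
static unsigned char toy_table[256];
static const char *expand_table[256];

/* Case 1: pure copy, by presetting an identity table.  */
void
init_identity (TABLE_STEP *step)
{
  int i;

  for (i = 0; i < 256; i++)
    identity[i] = (unsigned char) i;
  step->one_to_one = identity;
}

/* Case 2: one character to another (hypothetical 0x80 -> e-acute).  */
void
init_toy (TABLE_STEP *step)
{
  int i;

  for (i = 0; i < 256; i++)
    toy_table[i] = (unsigned char) i;
  toy_table[0x80] = 0xE9;
  step->one_to_one = toy_table;
}

/* Case 3: one character to a string (Latin-1 to plain ASCII spellings).  */
void
init_expand (TABLE_STEP *step)
{
  int i;

  for (i = 0; i < 256; i++)
    expand_table[i] = NULL;	/* NULL: copy unchanged (sketch convention) */
  expand_table[0xDF] = "ss";	/* sharp s */
  expand_table[0xE9] = "e'";	/* e-acute */
  step->one_to_many = expand_table;
}

/* Buffer-based stand-in for file_one_to_one: recode BUF in place.  */
void
apply_one_to_one (const TABLE_STEP *step, unsigned char *buf, size_t len)
{
  size_t i;

  assert (step->one_to_one != NULL);
  for (i = 0; i < len; i++)
    buf[i] = step->one_to_one[buf[i]];
}

/* Buffer-based stand-in for file_one_to_many.  Write at most OUT_SIZE
   bytes into OUT and return the number of bytes written.  */
size_t
apply_one_to_many (const TABLE_STEP *step, const unsigned char *in,
		   size_t len, char *out, size_t out_size)
{
  size_t i, n = 0;

  assert (step->one_to_many != NULL);
  for (i = 0; i < len; i++)
    {
      const char *s = step->one_to_many[in[i]];

      if (s == NULL)
	{
	  if (n == out_size)
	    return n;
	  out[n++] = (char) in[i];
	}
      else
	while (*s != '\0')
	  {
	    if (n == out_size)
	      return n;
	    out[n++] = *s++;
	  }
    }
  return n;
}
```

In each case the step's own code reduces to an initialization function that fills a table; the generic loop is shared, which is the point of reusing the predefined functions.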
If you have a recoding table handy in a suitable format but do not use one of the predefined recoding functions, it is still a good idea to use a delayed initialization to save it anyway, because the recode option -h will take advantage of this information when available.
Finally, edit `Makefile.in' to add the source file name of your routines to the C_STEPS or L_STEPS macro definition, depending on whether your routines are written in C or in flex. For C files only, also modify the STEPOBJS macro definition.