
Adding new charsets

It is easy for a programmer to add a new charset to recode. All it requires is writing a few functions kept in a single `.c' file, adjusting `Makefile.in', and remaking recode.

One of the functions should convert from any previous charset to the new one. Any previous charset will do, but try to select it so you will not lose too much information while converting. The other function should convert from the new charset to any older one. You do not have to select the same old charset as the one you selected for the previous routine. Once again, select any charset for which you will not lose too much information while converting.
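As a rough illustration of such a pair of functions, here is a minimal sketch. It assumes a made-up charset called NEWSET, identical to ASCII except that the codes for `#' and `$' are exchanged, and it uses ASCII as the old charset in both directions, although the two directions need not share one. All the names below (NEWSET, init_newset_tables, convert_ascii_to_newset, convert_newset_to_ascii) are invented for this example and are not part of recode. The sketch works on a memory buffer only to stay short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical charset NEWSET: ASCII with the codes of '#' (0x23)
   and '$' (0x24) exchanged.  Nothing here is part of recode.  */

static unsigned char ascii_to_newset[256];
static unsigned char newset_to_ascii[256];

/* Fill both tables.  Call once before any conversion.  */
void
init_newset_tables (void)
{
  int code;

  /* Start from the identity, then patch the two differing codes.  */
  for (code = 0; code < 256; code++)
    ascii_to_newset[code] = (unsigned char) code;
  ascii_to_newset[0x23] = 0x24;
  ascii_to_newset[0x24] = 0x23;

  /* Invert the forward table to obtain the reverse one.  */
  for (code = 0; code < 256; code++)
    newset_to_ascii[ascii_to_newset[code]] = (unsigned char) code;

  /* Sanity check: the reverse table undoes the forward one, so no
     information is lost in either direction.  */
  for (code = 0; code < 256; code++)
    assert (newset_to_ascii[ascii_to_newset[code]] == code);
}

/* Convert LENGTH bytes of BUFFER in place, old charset to new one.  */
void
convert_ascii_to_newset (unsigned char *buffer, size_t length)
{
  size_t index;

  for (index = 0; index < length; index++)
    buffer[index] = ascii_to_newset[buffer[index]];
}

/* Convert LENGTH bytes of BUFFER in place, new charset to old one.  */
void
convert_newset_to_ascii (unsigned char *buffer, size_t length)
{
  size_t index;

  for (index = 0; index < length; index++)
    buffer[index] = newset_to_ascii[buffer[index]];
}
```

Because the forward table here is a permutation of the 256 byte values, converting to NEWSET and back returns the original bytes. A charset pair without that property would lose information in at least one direction.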

If, for either of these two functions, you have to read multiple bytes of the old charset before recognizing the character to produce, you might prefer programming it in flex in a separate `.l' file. Model your C or flex files on one of those which exist already, so as to keep the sources uniform. Besides, at make time, all `.l' files are automatically merged into a single big one by the script `mergelex.awk', which requires sources to follow some rules. Imitation is a simple approach which relieves me of explaining all these rules!

Each of your source files should have its own initialization function, named module_charset, which is meant to be executed quickly, once, prior to any recoding. It should declare the name of your charsets and the single steps (or elementary recodings) you provide, by calling declare_step one or more times. Besides the charset names, declare_step expects a description of the recoding quality (see `recode.h') and two functions you also provide.
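The shape of such an initialization function might look like the sketch below. The names STEP and declare_step come from the text above, but their definitions here are stand-ins written for this example. The real ones live in `recode.h', and their exact argument types, quality values and return conventions may differ. The charset NEWSET, the QUALITY values and the two file functions are likewise hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-ins for what `recode.h' really declares.  Everything about
   their shape is an assumption made for this sketch.  */
typedef enum { QUALITY_REVERSIBLE, QUALITY_LOSSY } QUALITY;

typedef struct step STEP;
typedef void (*INIT_FUNC) (STEP *);
typedef int (*FILE_FUNC) (const STEP *, FILE *, FILE *);

struct step
{
  const char *before;		/* charset converted from */
  const char *after;		/* charset converted to */
  QUALITY quality;		/* description of recoding quality */
  INIT_FUNC init;		/* delayed initialization, or NULL */
  FILE_FUNC recode;		/* recoding of a whole file */
};

#define MAX_STEPS 16
STEP step_table[MAX_STEPS];
int step_count = 0;

/* Stand-in: merely record the single step in a table.  */
void
declare_step (const char *before, const char *after, QUALITY quality,
	      INIT_FUNC init, FILE_FUNC recode)
{
  assert (step_count < MAX_STEPS);
  step_table[step_count].before = before;
  step_table[step_count].after = after;
  step_table[step_count].quality = quality;
  step_table[step_count].init = init;
  step_table[step_count].recode = recode;
  step_count++;
}

/* Copy INPUT to OUTPUT, exchanging '#' and '$'.  */
static int
swap_copy (FILE *input, FILE *output)
{
  int character;

  while ((character = getc (input)) != EOF)
    {
      if (character == '#')
	character = '$';
      else if (character == '$')
	character = '#';
      putc (character, output);
    }
  return 0;			/* assumed convention: 0 on success */
}

int
file_ascii_newset (const STEP *step, FILE *input, FILE *output)
{
  (void) step;			/* unused in this sketch */
  return swap_copy (input, output);
}

int
file_newset_ascii (const STEP *step, FILE *input, FILE *output)
{
  (void) step;			/* unused in this sketch */
  return swap_copy (input, output);
}

/* Initialization function for the hypothetical charset NEWSET.  It
   does nothing but declare two single steps, so it stays quick.  */
void
module_newset (void)
{
  declare_step ("ASCII", "NEWSET", QUALITY_REVERSIBLE,
		NULL, file_ascii_newset);
  declare_step ("NEWSET", "ASCII", QUALITY_REVERSIBLE,
		NULL, file_newset_ascii);
}
```

Both steps pass NULL as the first function because this sketch needs no delayed initialization.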

The first such function has the purpose of allocating structures, preconditioning conversion tables, etc. It is also the usual way of further modifying the STEP structure. This function is executed only if and when the single step is retained in an actual recoding sequence. If you do not need such delayed initialization, merely use NULL for the function argument.
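A delayed initialization function could be sketched as follows. The STEP structure is reduced here to a single invented field named table, so the real structure in `recode.h' will not look like this. The function name init_ascii_newset and the charset NEWSET (ASCII with `#' and `$' exchanged) are also assumptions made only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the real STEP structure.  The `table' field is
   hypothetical and only serves this sketch.  */
typedef struct step
{
  unsigned char *table;		/* conversion table, or NULL */
} STEP;

/* Delayed initialization: allocate and fill the conversion table
   only once the step is known to be needed.  */
void
init_ascii_newset (STEP *step)
{
  unsigned char *table;
  int code;

  assert (step != NULL);

  table = (unsigned char *) malloc (256);
  if (table == NULL)
    abort ();			/* out of memory */

  /* Identity, except that '#' (0x23) and '$' (0x24) are exchanged.  */
  for (code = 0; code < 256; code++)
    table[code] = (unsigned char) code;
  table[0x23] = 0x24;
  table[0x24] = 0x23;

  /* Modify the STEP structure so later code can find the table.  */
  step->table = table;
}
```

The cost of allocating and filling the table is paid only for steps that take part in the recoding sequence, which is the point of delaying the work.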

The second function executes the elementary recoding on a whole file. There are a few cases in which you can avoid writing this function:

If you have a recoding table handy in a suitable format but do not use one of the predefined recoding functions, it is still a good idea to use a delayed initialization to save it anyway, because recode option -h will take advantage of this information when available.

Finally, edit `Makefile.in' to add the source file name of your routines to the C_STEPS or L_STEPS macro definition, depending on whether your routines are written in C or in flex. For C files only, also modify the STEPOBJS macro definition.
