While initializing all conversion modules, the main driver constructs a table of all the available conversion routines (single steps), giving for each one its starting charset and its ending charset. If we consider these charsets as the nodes of a directed graph, each single step may be considered as an oriented arc from one node to another. A cost is attributed to each arc: for example, a high penalty is given to single steps which are prone to losing characters, a low penalty is given to those which need to study more than one input character to produce an output character, etc.
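
The table described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the Step record, the make_step and build_graph helpers, the penalty values and the sample charset pairs are hypothetical, and recode itself keeps this table in C structures that are not shown here.

```python
from collections import defaultdict
from typing import NamedTuple


class Step(NamedTuple):
    """One single step: an arc of the charset graph (hypothetical layout)."""
    before: str  # starting charset
    after: str   # ending charset
    cost: int    # penalty attached to the arc; higher is less desirable


# Hypothetical penalty values, chosen only for illustration.
BASE_COST = 10
LOSSY_PENALTY = 100      # the step is prone to losing characters
MULTI_CHAR_PENALTY = 5   # the step studies several input characters


def make_step(before, after, lossy=False, multi_char=False):
    """Build a step and derive its cost from its properties."""
    cost = BASE_COST
    if lossy:
        cost += LOSSY_PENALTY
    if multi_char:
        cost += MULTI_CHAR_PENALTY
    return Step(before, after, cost)


def build_graph(steps):
    """Index the step table by starting charset (an adjacency list)."""
    graph = defaultdict(list)
    for step in steps:
        graph[step.before].append(step)
    return graph


# Sample steps; the charset pairs and their properties are illustrative.
steps = [
    make_step("latin1", "ibmpc"),
    make_step("latin1", "ascii", lossy=True),
    make_step("texte", "latin1", multi_char=True),
]
graph = build_graph(steps)
```

Indexing the table by starting charset is only one possible layout; it makes it cheap to enumerate the arcs leaving a node when searching for a route.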
Given a starting code and a goal code, recode computes the most
economical route through the elementary recodings, that is, the best
sequence of conversions that will transform the input charset into the
final charset. To speed up execution, recode looks for subsequences of
conversions which are simple enough to be merged; it then dynamically
creates new single steps from them and, of course, uses them.
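
The search for the most economical route can be sketched as a shortest-path computation over the arcs. The sketch below uses Dijkstra's algorithm as a stand-in, since the text does not say which search recode actually performs; the cheapest_route function, the charset names A to D and the costs are all hypothetical.

```python
import heapq


def cheapest_route(steps, start, goal):
    """Return (total_cost, [charsets]) for the cheapest route, or None.

    steps is an iterable of (before, after, cost) triples with
    non-negative costs, as Dijkstra's algorithm requires.
    """
    arcs = {}
    for before, after, cost in steps:
        arcs.setdefault(before, []).append((after, cost))

    queue = [(0, start, [start])]  # (cost so far, charset, route so far)
    settled = set()
    while queue:
        total, charset, path = heapq.heappop(queue)
        if charset == goal:
            return total, path
        if charset in settled:
            continue
        settled.add(charset)
        for nxt, cost in arcs.get(charset, []):
            if nxt not in settled:
                heapq.heappush(queue, (total + cost, nxt, path + [nxt]))
    return None  # the goal cannot be reached from the start


# Illustrative arcs: the direct A -> C step is penalized as lossy.
steps = [
    ("A", "B", 10),
    ("B", "C", 10),
    ("A", "C", 110),
    ("C", "D", 15),
]
route = cheapest_route(steps, "A", "D")
```

With these sample costs the search prefers the three-step route through B over the shorter route that uses the heavily penalized direct arc, which is the behaviour the penalties are meant to produce.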
A double step is a sequence of two single steps, the output of the
first being the special charset rfc1345 (which is not directly
available to the user), the input of the second single step being also
rfc1345. A special machinery dynamically produces efficient,
reversible, mergeable single steps out of these double steps.
The main part of recode is written in C, as are most single
steps. A few single steps need to recognize sequences of multiple
characters; these are often better written in flex.