For those who do not know: different computer platforms use different 8-bit character sets for Cyrillic. This makes it a problem to present the same textual data in the appropriate charset to network users on different computers.
Many people prefer to implement multi-charset support for Internet services (such as Mail, News and WWW) with some kind of "proxy" server. One example is the "cyrproxy" proxy server by Alex Tutubalin, which can be taken from ftp://ftp.lexa.ru/pub/domestic/lexa/. The idea sounds very good, because this approach does not require any modification of the server code. But it also has at least two serious drawbacks:
(See also older versions: libmcs-2.02.tar.gz, libmcs-2.1.tar.gz and libtrans.tar.gz)
The basic concept (proposed by Igor V. Semenyuk <firstname.lastname@example.org>) is to set up multiple IP interfaces on the same machine (this is possible with all modern UNICes, both BSD and SVR4 flavours), one interface for each of supported codetables. These interfaces are assigned mnemonic names corresponding to the character set they support, e.g. koi.smtp.online.ru, win.smtp.online.ru, etc.
The server code is modified in the following way: when an incoming connection is opened with the accept(2) call, the function settrtab_bysocket() from the libtrans package should be called with the accept()ed socket file descriptor as its parameter. settrtab_bysocket() determines the IP address of the local end of the connection with getsockname(2), and determines its host name with gethostbyaddr(3). Then it tries to load a translation table from a file named the same as the host name. If there is no such file (i.e. the host name does not match any charset mnemonic name), a "no-translation" table is set up. The address of the table is stored in static memory, so that subsequent uses of the TR_IN() and TR_OUT() macros perform the character conversion appropriate for the particular charset.
Then, every character received from the client should be passed through the TR_IN() macro, and every character to be written to the client should be passed through the TR_OUT() macro.
There is also a "low-level" function, settrtab_byname(), which can be called if the server has already determined its local domain name, to avoid redundant calls to getsockname() and gethostbyaddr(). It can also be useful if there is some other way to determine the desired character set, e.g. if the client can specify it explicitly, as with the "Accept-Charset:" header in the HTTP protocol.
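To illustrate the per-character translation, here is a minimal sketch of a pair of 256-entry lookup tables kept in static memory. The SKETCH_TR_IN()/SKETCH_TR_OUT() macros and the sketch_* functions are illustrative stand-ins; the real TR_IN()/TR_OUT() macros and the table file format are defined by the library and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Current tables, kept in static memory for the life of the connection. */
static unsigned char in_table[256];   /* client charset -> server charset */
static unsigned char out_table[256];  /* server charset -> client charset */

/* Illustrative stand-ins for the library's TR_IN()/TR_OUT() macros. */
#define SKETCH_TR_IN(c)  (in_table[(unsigned char)(c)])
#define SKETCH_TR_OUT(c) (out_table[(unsigned char)(c)])

/* "No-translation" setup: every byte maps to itself. */
void sketch_set_identity(void)
{
    int i;
    for (i = 0; i < 256; i++)
        in_table[i] = out_table[i] = (unsigned char)i;
}

/* Install a pair of tables, e.g. after reading them from a file. */
void sketch_set_tables(const unsigned char in[256],
                       const unsigned char out[256])
{
    assert(in != NULL && out != NULL);
    memcpy(in_table, in, 256);
    memcpy(out_table, out, 256);
}

/* Translate a buffer in place just before writing it to the client. */
void sketch_translate_out(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = SKETCH_TR_OUT(buf[i]);
}

/* Translate a buffer in place just after reading it from the client. */
void sketch_translate_in(unsigned char *buf, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        buf[i] = SKETCH_TR_IN(buf[i]);
}
```

For example, assuming the server stores text in koi8-r and the client expects windows-1251, the output table would map byte 0xC1 (Cyrillic small "a" in koi8-r) to 0xE0, and the input table would map 0xE0 back to 0xC1. ASCII bytes map to themselves in both directions.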
Ilya Etingof developed a Solaris STREAMS module that uses libmcs, which can be found here: http://www.glas.net/~ilya/software/osc.html.
Unfortunately, not all WWW clients are aware of this header. If "Accept-Charset:" is not available, one can try to guess the codetable the client uses from the "User-Agent:" header. In most cases this header contains, in addition to the browser name, an operating system identification string.
Then again, this may not be enough information. There are cases when different charsets are used on the same operating system. E.g. some MS/Windows® users install koi8-r fonts as a kludge to read text from long-running WWW servers that only support the koi8-r codetable for Cyrillic. Another example is a UNIX system using ISO-8859-5 instead of the common koi8-r.
To deal with all these cases, the following solution is implemented: several virtual hosts are created on the same document tree, one for each supported codetable and one "generic". Normally, users come to the generic virtual host. The codetable they need is determined from the "Accept-Charset:" header or, if it is not available, from the "User-Agent:" header. In most cases, the user will get what she needs. If not, she has the option to request the same document from a specific virtual host, making the server use the codetable that she explicitly specifies via the host name.
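The decision order just described might look roughly like the sketch below. It is not taken from the Apache patches: choose_charset() is a hypothetical function, the "koi." and "win." host name prefixes are assumed by analogy with the mnemonic names mentioned earlier, the "Accept-Charset:" handling is a plain substring match that ignores q-values and tries charsets in the server's own order, and the "User-Agent:" check is a deliberately crude heuristic.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive substring test; a NULL haystack never matches. */
static int contains_ci(const char *hay, const char *needle)
{
    size_t n;

    assert(needle != NULL);
    n = strlen(needle);
    if (hay == NULL)
        return 0;
    for (; *hay != '\0'; hay++) {
        size_t i = 0;
        while (i < n && hay[i] != '\0' &&
               tolower((unsigned char)hay[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return n == 0;
}

/* Charsets this sketch knows about, in the server's order of preference. */
static const char *const known_charsets[] = {
    "koi8-r", "windows-1251", "iso-8859-5"
};

/* Hypothetical: pick the codetable for one request. Any argument may be
 * NULL when the corresponding piece of information is missing. */
const char *choose_charset(const char *vhost,
                           const char *accept_charset,
                           const char *user_agent)
{
    size_t i;

    /* 1. A charset-specific virtual host is an explicit user choice. */
    if (vhost != NULL) {
        if (strncmp(vhost, "koi.", 4) == 0) return "koi8-r";
        if (strncmp(vhost, "win.", 4) == 0) return "windows-1251";
    }

    /* 2. Generic host: trust "Accept-Charset:" when the client sends it. */
    for (i = 0; i < sizeof known_charsets / sizeof known_charsets[0]; i++)
        if (contains_ci(accept_charset, known_charsets[i]))
            return known_charsets[i];

    /* 3. Otherwise guess from the OS string in "User-Agent:". */
    if (user_agent != NULL && strstr(user_agent, "Win") != NULL)
        return "windows-1251";

    /* 4. Nothing to go on: fall back to the common default. */
    return "koi8-r";
}
```

Checking the host name first is what lets a user override a wrong guess: a request to the charset-specific host wins no matter what the headers say.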
The following are patches for the Apache server (both SSL and non-SSL) implementing the described approach:
Here are patches to smtpserver and smtp transport of Zmailer 2.99.27 (smtp transport should be patched to avoid converting 8bit subjects to quoted-printable in outgoing mail).
And here are patches to smtpserver of Zmailer 2.99.38, 2.99.44 and 2.99.47 (no need to patch smtp transport, just use "-8H" options).
Starting from version 2.99.48, the on-the-fly translation code is included in the mainstream Zmailer release. See the file smtpserver/README.translation in the Zmailer distribution for a description.
Of course, after applying the patch, you need to modify the Makefile (by hand) to add the appropriate "-I" directive to CFLAGS and "-lmcs" to LDFLAGS.
LIBS="-L/usr/local/lib -lmcs" INCLUDES="-I/usr/local/include" ./configure ...
(assuming that you have libmcs installed in /usr/local)