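I'm working on a project that needs to handle email encoding/decoding for different character sets. The Python code for this looks like the following: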
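from email.header import Header, decode_header, make_header
from charset import text_to_utf8

class ....
    def decode_header(self, header):
        decoded_header = decode_header(header)
        # Only the first decoded part is used: a (bytes, charset) pair.
        if decoded_header[0][1] is None:
            # No charset was declared, so fall back to the text_to_utf8 helper.
            return text_to_utf8(decoded_header[0][0]).decode("utf-8", "replace")
        else:
            # Map "windows-NNNN" charset names to "cpNNNN" before decoding.
            return decoded_header[0][0].decode(decoded_header[0][1].replace("windows-", "cp"), "replace")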
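Basically, for a header like "=?iso-2022-jp?b?GyRCRW1CQE86GyhCIDxtb21vQHRhcm8ubmUuanA=?=", the "decode_header" function just tries to find the encoding, 'iso-2022-jp'; the code then uses the "decode" function to decode the bytes from that charset into unicode.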
Now, in Go, I can do something similar:
import "mime"
dec := new(mime.WordDecoder)
text := "=?utf-8?q?=C3=89ric?= <eric@example.org>, =?utf-8?q?Ana=C3=AFs?= <anais@example.org>"
header, err := dec.DecodeHeader(text)
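For reference, a complete program built around the snippet above might look like the sketch below. The helper name decodeWithDefaults is made up for illustration. It also feeds the iso-2022-jp header from the Python example to a WordDecoder that has no CharsetReader set, which is expected to return an error because that charset is not among the ones handled by default.

```go
package main

import (
	"fmt"
	"mime"
)

// decodeWithDefaults (hypothetical helper) decodes a header using a
// WordDecoder with no CharsetReader, so only the built-in charsets
// (utf-8, iso-8859-1, us-ascii) are understood.
func decodeWithDefaults(header string) (string, error) {
	dec := new(mime.WordDecoder)
	return dec.DecodeHeader(header)
}

func main() {
	text := "=?utf-8?q?=C3=89ric?= <eric@example.org>, =?utf-8?q?Ana=C3=AFs?= <anais@example.org>"
	header, err := decodeWithDefaults(text)
	if err != nil {
		panic(err)
	}
	fmt.Println(header) // Éric <eric@example.org>, Anaïs <anais@example.org>

	// iso-2022-jp is not handled by default, so an error comes back here.
	_, err = decodeWithDefaults("=?iso-2022-jp?b?GyRCRW1CQE86GyhCIDxtb21vQHRhcm8ubmUuanA=?=")
	fmt.Println("iso-2022-jp error:", err)
}
```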
It seems that mime.WordDecoder allows plugging in a charset decoder "hook":
type WordDecoder struct {
	// CharsetReader, if non-nil, defines a function to generate
	// charset-conversion readers, converting from the provided
	// charset into UTF-8.
	// Charsets are always lower-case. utf-8, iso-8859-1 and us-ascii charsets
	// are handled by default.
	// One of the CharsetReader's result values must be non-nil.
	CharsetReader func(charset string, input io.Reader) (io.Reader, error)
}
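I'd like to know whether there is a library that lets me convert arbitrary charsets, like the "decode" function in Python shown in the example above. I don't want to write a big "switch-case" like the one used in mime/encodedword.go: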
func (d *WordDecoder) convert(buf *bytes.Buffer, charset string, content []byte) error {
	switch {
	case strings.EqualFold("utf-8", charset):
		buf.Write(content)
	case strings.EqualFold("iso-8859-1", charset):
		for _, c := range content {
			buf.WriteRune(rune(c))
		}
	....
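To make the goal concrete, here is a minimal sketch of what a table-driven CharsetReader hook could look like in place of a switch. It uses only the standard library; the names decoders, decodeLatin9, charsetReader and decodeHeader are made up for illustration, and the single hand-written ISO-8859-15 decoder is only a stand-in. The table could presumably be filled (or replaced outright) by an existing conversion package; for example, charset.NewReaderLabel in golang.org/x/net/html/charset has the same signature as the hook, though I have not verified which charsets it covers.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"strings"
)

// latin9Overrides lists the eight bytes where ISO-8859-15 differs from
// ISO-8859-1; every other byte maps to the rune with the same value.
var latin9Overrides = map[byte]rune{
	0xA4: '€', 0xA6: 'Š', 0xA8: 'š', 0xB4: 'Ž',
	0xB8: 'ž', 0xBC: 'Œ', 0xBD: 'œ', 0xBE: 'Ÿ',
}

// decodeLatin9 converts ISO-8859-15 bytes to a UTF-8 string.
func decodeLatin9(content []byte) string {
	var sb strings.Builder
	for _, c := range content {
		if r, ok := latin9Overrides[c]; ok {
			sb.WriteRune(r)
		} else {
			sb.WriteRune(rune(c))
		}
	}
	return sb.String()
}

// decoders is a hypothetical registry keyed by lower-case charset name.
// Supporting another charset means adding an entry, not a new case.
var decoders = map[string]func([]byte) string{
	"iso-8859-15": decodeLatin9,
}

// charsetReader matches the WordDecoder.CharsetReader signature. It reads
// the whole input, which is acceptable for short header words.
func charsetReader(charset string, input io.Reader) (io.Reader, error) {
	decode, ok := decoders[charset]
	if !ok {
		return nil, fmt.Errorf("unsupported charset %q", charset)
	}
	content, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	return strings.NewReader(decode(content)), nil
}

// decodeHeader decodes a header with the table-driven hook installed.
func decodeHeader(header string) (string, error) {
	dec := &mime.WordDecoder{CharsetReader: charsetReader}
	return dec.DecodeHeader(header)
}

func main() {
	text, err := decodeHeader("=?iso-8859-15?q?100_=A4?=")
	if err != nil {
		panic(err)
	}
	fmt.Println(text) // 100 €
}
```

The hook is only consulted for charsets the decoder does not already handle itself, so utf-8, iso-8859-1 and us-ascii headers keep working without entries in the table.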
Any help would be greatly appreciated.