fileinput.hook_compressed sometimes gives me strings, sometimes bytes

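fileinput.input does fundamentally different things depending on whether it gets a gzipped file or not. For plain text files, it opens with the regular open, which effectively opens in text mode by default. For gzip.open, the default mode is binary, which is sensible for compressed files of unknown content.

The restriction that leaves you stuck with binary is imposed artificially by fileinput.FileInput, which refuses any mode that names text mode explicitly. From the code of its __init__ method:

    # restrict mode argument to reading modes
    if mode not in ('r', 'rU', 'U', 'rb'):
        raise ValueError("FileInput opening mode must be one of "
                         "'r', 'rU', 'U' and 'rb'")
    if 'U' in mode:
        import warnings
        warnings.warn("'U' mode is deprecated",
                      DeprecationWarning, 2)
    self._mode = mode

This gives you two workarounds, plus a fallback.

Option 1

Set the _mode attribute after __init__ has run. To avoid adding extra lines of code wherever you use it, you can subclass fileinput.FileInput and use the class directly:

    import fileinput

    class TextFileInput(fileinput.FileInput):
        def __init__(self, *args, **kwargs):
            # FileInput rejects 'rt', so remove it before calling the parent
            if 'mode' in kwargs and 't' in kwargs['mode']:
                mode = kwargs.pop('mode')
            else:
                mode = ''
            super().__init__(*args, **kwargs)
            # Restore the requested text mode once validation has passed
            if mode:
                self._mode = mode

    for line in TextFileInput(filenames, openhook=fileinput.hook_compressed, mode='rt'):
        ...

Option 2

Messing with undocumented leading-underscore attributes is pretty hacky, so instead you can create a custom hook for compressed files. This is actually quite easy, since you can use the code of fileinput.hook_compressed as a template:

    import os

    def my_hook_compressed(filename, mode):
        # Ask for text mode explicitly unless binary was requested
        if 'b' not in mode:
            mode += 't'
        ext = os.path.splitext(filename)[1]
        if ext == '.gz':
            import gzip
            return gzip.open(filename, mode)
        elif ext == '.bz2':
            import bz2
            return bz2.open(filename, mode)
        else:
            return open(filename, mode)

Option 3

Finally, you can always decode the bytes to unicode strings yourself. This is clearly not the preferable option.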
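
For completeness, here is a minimal sketch of what that last option (decoding by hand) might look like. The helper name iter_text_lines and the UTF-8 default are assumptions for illustration, not part of fileinput; the isinstance check is there because whether hook_compressed hands back bytes or str can depend on the file type and the Python version.

```python
import fileinput


def iter_text_lines(filenames, encoding="utf-8"):
    """Yield str lines regardless of whether the hook produced bytes or str.

    `encoding` is an assumed default; use whatever your files actually contain.
    """
    with fileinput.input(files=filenames,
                         openhook=fileinput.hook_compressed) as stream:
        for line in stream:
            # Compressed files may come back as bytes; plain files as str
            if isinstance(line, bytes):
                line = line.decode(encoding)
            yield line
```

Used as `for line in iter_text_lines(filenames): ...`, this keeps the decoding in one place, though it still means guessing the encoding up front, which is why the custom hook is usually the cleaner route.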
