Is there a way to tell Pandas to use a specific pickle protocol (e.g. 4) when writing an HDF5 file?
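Here is the situation (much simplified):
Client A is using python=3.8.1 (along with pandas=1.0.0 and pytables=3.6.1). A writes a DataFrame using df.to_hdf(file, key).
Client B is using python=3.7.1 (and, as it happens, pandas=0.25.1 and pytables=3.5.2, but that is irrelevant). B tries to read the data written by A using pd.read_hdf(file, key), and fails with ValueError: unsupported pickle protocol: 5.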
Note that this does not happen with a purely numeric DataFrame (e.g. pd.DataFrame(np.random.normal(size=(10,10)))). So here is a reproducible example:
(base) $ conda activate py38
(py38) $ python
Python 3.8.1 (default, Jan 8 2020, 22:29:32)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> df = pd.DataFrame(['hello', 'world'])
>>> df.to_hdf('foo', 'x')
>>> exit()
(py38) $ conda deactivate
(base) $ python
Python 3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> df = pd.read_hdf('foo', 'x')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 407, in read_hdf
return store.select(key, auto_close=auto_close, **kwargs)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 782, in select
return it.get_result()
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 1639, in get_result
results = self.func(self.start, self.stop, where)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 766, in func
return s.read(start=_start, stop=_stop, where=_where, columns=columns)
File "/opt/anaconda3/lib/python3.7/site-packages/pandas/io/pytables.py", line 3206, in read
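Note: I also tried using pandas=1.0.0 and pytables=3.6.1 in python=3.7.4. That fails too, so I believe it is just the Python version (3.8 writer vs 3.7 reader) that causes the problem. That makes sense, since pickle protocol 5 was introduced by PEP-574 for Python 3.8.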
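To illustrate the reasoning above, here is a minimal standard-library sketch of how the protocol number ends up inside a pickle. It assumes (I have not confirmed this in the pandas or PyTables source) that the writer pickles string data with pickle.HIGHEST_PROTOCOL, which would explain why a 3.8 writer produces protocol 5. The helper name pickle_protocol_of and the payload list are hypothetical, used only for this illustration.

```python
import pickle

# Stand-in for the string data in the example DataFrame (illustrative only).
payload = ["hello", "world"]


def pickle_protocol_of(blob):
    """Return the protocol number stored in a pickle's header (protocol >= 2)."""
    # Pickles written with protocol 2 or later start with the PROTO opcode
    # (0x80) followed by a single byte holding the protocol number.
    if blob[:1] != b"\x80":
        raise ValueError("no PROTO opcode: pickle uses protocol 0 or 1")
    return blob[1]


# What a writer gets if it asks for the highest protocol of its interpreter
# (5 on Python 3.8+, 4 on Python 3.4 to 3.7).
blob_highest = pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL)

# What I would like the writer to use so that a Python 3.7 reader can load it.
blob_v4 = pickle.dumps(payload, protocol=4)

print("highest protocol here:", pickle.HIGHEST_PROTOCOL)
print("protocol in blob_highest:", pickle_protocol_of(blob_highest))
print("protocol in blob_v4:", pickle_protocol_of(blob_v4))
```

A pickle written with protocol=4 loads on both Python 3.7 and 3.8, which is why I am looking for a way to pass that choice through df.to_hdf.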