numpy.load from an io.BytesIO stream

I have numpy arrays saved in Azure Blob Storage, and I am loading them into a stream like this:

stream = io.BytesIO()
store.get_blob_to_stream(container, 'cat.npy', stream)

From stream.getvalue() I can see that the stream contains the metadata needed to reconstruct the array. Here are the first 150 bytes:

b"\x93NUMPY\x01\x00v\x00{'descr': '|u1', 'fortran_order': False, 'shape': (720, 1280, 3), }                                                  \n\xc1\xb0\x94\xc2\xb1\x95\xc3\xb2\x96\xc4\xb3\x97\xc5\xb4\x98\xc6\xb5\x99\xc7\xb6\x9a\xc7"

Is it possible to load the byte stream with numpy.load, or by some other simple method?

I could save the array to disk and load it from there, but I would like to avoid that for several reasons...

EDIT: just to emphasize, the output needs to be a numpy array with the shape and dtype specified in the first 128 bytes of the stream.
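To make the layout of those leading bytes concrete, here is a minimal sketch that decodes a version 1.0 .npy header by hand. It assumes a locally generated array of the same shape and dtype as a stand-in for the blob (the variable names are illustrative, not from the question). In the dump above, the two bytes `v\x00` after the version are the little-endian header length 118, and 10 + 118 gives the 128-byte offset mentioned in the edit.

```python
import ast
import io
import struct

import numpy as np

# Assumption: a locally generated array stands in for the blob contents.
buf = io.BytesIO()
np.save(buf, np.zeros((720, 1280, 3), dtype=np.uint8))
raw = buf.getvalue()

# Layout for format version 1.0:
#   6 bytes magic, 1 byte major version, 1 byte minor version,
#   2 bytes little-endian header length, then the header dict as text.
magic, major, minor = raw[:6], raw[6], raw[7]
(header_len,) = struct.unpack("<H", raw[8:10])
header = ast.literal_eval(raw[10:10 + header_len].decode("latin1"))
data_offset = 10 + header_len  # the raw array data starts here

print(magic, (major, minor), header_len, data_offset)
print(header)
```

The dictionary carries `descr`, `fortran_order` and `shape`, which is everything needed to interpret the bytes that follow the header.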


米琪卡哇伊
3 Answers

LEATH

I tried several ways to do what you need. Here is my sample code.

from azure.storage.blob.baseblobservice import BaseBlobService
import numpy as np

account_name = '<your account name>'
account_key = '<your account key>'
container_name = '<your container name>'
blob_name = '<your blob name>'

blob_service = BaseBlobService(
    account_name=account_name,
    account_key=account_key
)

Sample 1. Generate a blob URL with a SAS token and fetch the content with requests

from azure.storage.blob import BlobPermissions
from datetime import datetime, timedelta
import io
import requests

sas_token = blob_service.generate_blob_shared_access_signature(container_name, blob_name, permission=BlobPermissions.READ, expiry=datetime.utcnow() + timedelta(hours=1))
print(sas_token)
url_with_sas = blob_service.make_blob_url(container_name, blob_name, sas_token=sas_token)
print(url_with_sas)
r = requests.get(url_with_sas)
# np.load parses the .npy header (dtype, shape) before reading the data
dat = np.load(io.BytesIO(r.content))
print('from requests', dat)

Sample 2. Download the content of the blob into memory with BytesIO

import io

stream = io.BytesIO()
blob_service.get_blob_to_stream(container_name, blob_name, stream)
stream.seek(0)  # rewind: the download leaves the position at the end
dat = np.load(stream)
print('from BytesIO', dat)

Sample 3. Use numpy.load with DataSource to open the blob URL with the SAS token. This actually downloads the blob file to the local filesystem.

ds = np.DataSource()
# ds = np.DataSource(None)  # use with temporary file
# ds = np.DataSource(path)  # use with path like `data/`
f = ds.open(url_with_sas, mode='rb')  # binary mode, since .npy is not text
dat = np.load(f)
print('from DataSource', dat)

I think Samples 1 and 2 suit you better.
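Since a .npy payload begins with a header, whichever call turns the downloaded bytes into an array has to account for it. The sketch below compares two ways of doing so on an in-memory payload. It assumes a small locally generated array in place of `r.content` or `stream.getvalue()`, and the names `raw`, `loaded` and `viewed` are illustrative only.

```python
import io

import numpy as np
from numpy.lib import format as npy_format

# Assumption: stand-in for the bytes downloaded from the blob.
original = np.arange(24, dtype=np.uint8).reshape(2, 4, 3)
buf = io.BytesIO()
np.save(buf, original)
raw = buf.getvalue()

# Option A: np.load parses the header and returns a correctly shaped array.
loaded = np.load(io.BytesIO(raw), allow_pickle=False)

# Option B: read the header explicitly, then view the payload in place.
# np.frombuffer does no parsing, so dtype and offset must be supplied.
header_stream = io.BytesIO(raw)
npy_format.read_magic(header_stream)
shape, fortran_order, dtype = npy_format.read_array_header_1_0(header_stream)
viewed = np.frombuffer(raw, dtype=dtype, offset=header_stream.tell()).reshape(shape)

print(np.array_equal(loaded, original), np.array_equal(viewed, original))
```

Option B avoids a copy but returns a read-only view onto `raw`, and the plain reshape is only valid when `fortran_order` is False. Option A is the simpler choice when a writable array is wanted.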

慕码人2483693

With np.savez the result is a zip archive holding several arrays rather than a single .npy payload, so it is read back as an NpzFile. Upload to storage:

import io
import numpy as np

stream = io.BytesIO()
arr1 = np.random.rand(20, 4)
arr2 = np.random.rand(20, 4)
np.savez(stream, A=arr1, B=arr2)
# block_blob_service and container are assumed to be defined already
block_blob_service.create_blob_from_bytes(container,
                                          "my/path.npz",
                                          stream.getvalue())

Download from storage:

from numpy.lib.npyio import NpzFile

stream = io.BytesIO()
block_blob_service.get_blob_to_stream(container, "my/path.npz", stream)
ret = NpzFile(stream, own_fid=True, allow_pickle=True)
print(ret.files)       # ['A', 'B']
print(ret['A'].shape)  # (20, 4)
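The same round trip can be checked without any storage account. This is a sketch under the assumption that the bytes pass through memory unchanged; `payload` stands in for what would be uploaded and later downloaded, and np.load is used here because it returns an NpzFile when it detects a zip archive.

```python
import io

import numpy as np

arr1 = np.random.rand(20, 4)
arr2 = np.random.rand(20, 4)

# "Upload" side: serialize both arrays into one in-memory .npz archive.
upload = io.BytesIO()
np.savez(upload, A=arr1, B=arr2)
payload = upload.getvalue()  # assumption: these bytes go to blob storage

# "Download" side: wrap the bytes in a fresh stream and open the archive.
download = io.BytesIO(payload)
with np.load(download, allow_pickle=False) as npz:
    names = list(npz.files)
    a = npz["A"]
    b = npz["B"]

print(names, a.shape, b.shape)
```

Passing `allow_pickle=False` is enough for plain numeric arrays and avoids unpickling data from an untrusted blob.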

慕仙森

A bit late, but in case anyone wants to do this with numpy.load, here is the code (Azure SDK v12.8.1):

from azure.storage.blob import BlobServiceClient
import io
import numpy as np

# define your connection parameters
connect_str = ''
container_name = ''
blob_name = ''

blob_service_client = BlobServiceClient.from_connection_string(connect_str)
blob_client = blob_service_client.get_blob_client(container=container_name,
                                                  blob=blob_name)

# Get StorageStreamDownloader
blob_stream = blob_client.download_blob()
stream = io.BytesIO()
blob_stream.download_to_stream(stream)
stream.seek(0)

# Load from io.BytesIO object
data = np.load(stream, allow_pickle=False)
print(data.shape)
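The `stream.seek(0)` call matters because writing into a BytesIO leaves the position at the end of the data. A minimal local sketch of that step follows, assuming `stream.write(payload)` as a stand-in for the download call and an arbitrary small array as the payload.

```python
import io

import numpy as np

# Assumption: a small array serialized locally replaces the real blob.
original = np.arange(12, dtype=np.int32).reshape(3, 4)
source = io.BytesIO()
np.save(source, original)
payload = source.getvalue()

stream = io.BytesIO()
stream.write(payload)    # stand-in for the download into the stream
end_pos = stream.tell()  # position is now at the end of the data

stream.seek(0)           # rewind so np.load starts at the magic bytes
data = np.load(stream, allow_pickle=False)
print(end_pos, data.shape, data.dtype)
```

Without the rewind, np.load would start reading at the end of the stream and find no data.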