What is the recommended way of allocating memory for a typed memory view?

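The Cython documentation on typed memory views lists three ways of assigning to a typed memory view:

1. from a raw C pointer,
2. from a np.ndarray, and
3. from a cython.view.array.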

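Assume that no data is passed into my Cython function from outside; instead I want to allocate memory and return it as a np.ndarray. Which of these options should I choose? Also assume that the size of the buffer is not a compile-time constant, i.e. I cannot allocate on the stack and would need malloc for option 1.

The three options would therefore look something like this: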


from libc.stdlib cimport malloc, free
import numpy as np   # runtime import, needed for np.empty below
cimport numpy as np
from cython cimport view

np.import_array()

# Option 1: raw C pointer from malloc, cast to a memory view
def memview_malloc(int N):
    cdef int * m = <int *>malloc(N * sizeof(int))
    cdef int[::1] b = <int[:N]>m
    free(<void *>m)

# Option 2: NumPy array
def memview_ndarray(int N):
    cdef int[::1] b = np.empty(N, dtype=np.int32)

# Option 3: cython.view.array
def memview_cyarray(int N):
    cdef int[::1] b = view.array(shape=(N,), itemsize=sizeof(int), format="i")

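What strikes me is that in all three cases Cython generates quite a lot of code for the memory allocation, in particular a call to __Pyx_PyObject_to_MemoryviewSlice_dc_int. This suggests (and I may be wrong here, my insight into the inner workings of Cython is very limited) that it first creates a Python object and then "casts" it into a memory view, which seems like unnecessary overhead.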
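The "create a Python object first, then view it" pattern can be illustrated with a rough pure-Python analogue. This is only a sketch: it assumes the standard-library array module and the built-in memoryview, which is not the same thing as a Cython typed memoryview, and the helper name fill_through_view is hypothetical. It shows that wrapping an existing buffer object gives a view rather than a copy, so the cost lies in creating the object and acquiring its buffer, not in moving data.

```python
from array import array


def fill_through_view(n):
    """Allocate n C ints as a Python object, then write to them via a view."""
    buf = array('i', [0] * n)   # the Python object that owns the memory
    view = memoryview(buf)      # wraps buf's buffer; no data is copied
    for i in range(n):
        view[i] = i * i         # writes land directly in buf's memory
    return buf, view


if __name__ == "__main__":
    buf, view = fill_through_view(4)
    print(buf.tolist())         # values written through the view
    print(view.obj is buf)      # the view keeps a reference to its owner
```

The view holds a reference to the owning object, which is also what keeps the memory alive for as long as the view exists.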


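A simple benchmark does not reveal much difference between the three methods, with 2. being the fastest.

Which of the three methods is recommended? Or is there a different, better option?

Follow-up question: after working with the memory view in the function, I want to finally return the result as a np.ndarray. Is a typed memory view the best choice, or should I rather just use the old buffer interface, as below, to create an ndarray in the first place?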


cdef np.ndarray[np.int32_t, ndim=1] b = np.empty(N, dtype=np.int32)


慕丝7291255
2 answers

波斯汪

Look here for an answer. The basic idea is that you want cpython.array.array and cpython.array.clone (not cython.array.*):

from cpython.array cimport array, clone

# This type is what you want and can be cast to things of
# the "double[:]" syntax, so no problems there
cdef array[double] armv, templatemv

templatemv = array('d')

# This is fast
armv = clone(templatemv, L, False)

EDIT

It turns out that the benchmarks in that thread were rubbish. Here is my setup, with my timings:

# cython: language_level=3
# cython: boundscheck=False
# cython: wraparound=False

import time
import sys

from cpython.array cimport array, clone
from cython.view cimport array as cvarray
from libc.stdlib cimport malloc, free
import numpy as numpy
cimport numpy as numpy

cdef int loops

def timefunc(name):
    def timedecorator(f):
        cdef int L, i

        print("Running", name)
        for L in [1, 10, 100, 1000, 10000, 100000, 1000000]:
            # The original used time.clock(), which was removed in Python 3.8
            start = time.perf_counter()
            f(L)
            end = time.perf_counter()
            print(format((end-start) / loops * 1e6, "2f"), end=" ")
            sys.stdout.flush()

        print("μs")
    return timedecorator

print()
print("INITIALISATIONS")
loops = 100000

@timefunc("cpython.array buffer")
def _(int L):
    cdef int i
    cdef array[double] arr, template = array('d')

    for i in range(loops):
        arr = clone(template, L, False)

    # Prevents dead code elimination
    str(arr[0])

@timefunc("cpython.array memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr
    cdef array template = array('d')

    for i in range(loops):
        arr = clone(template, L, False)

    # Prevents dead code elimination
    str(arr[0])

@timefunc("cpython.array raw C type")
def _(int L):
    cdef int i
    cdef array arr, template = array('d')

    for i in range(loops):
        arr = clone(template, L, False)

    # Prevents dead code elimination
    str(arr[0])

@timefunc("numpy.empty_like memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr
    template = numpy.empty((L,), dtype='double')

    for i in range(loops):
        arr = numpy.empty_like(template)

    # Prevents dead code elimination
    str(arr[0])

@timefunc("malloc")
def _(int L):
    cdef int i
    cdef double* arrptr

    for i in range(loops):
        arrptr = <double*> malloc(sizeof(double) * L)
        free(arrptr)

    # Prevents dead code elimination
    # (reads freed memory; kept as in the original benchmark)
    str(arrptr[0])

@timefunc("malloc memoryview")
def _(int L):
    cdef int i
    cdef double* arrptr
    cdef double[::1] arr

    for i in range(loops):
        arrptr = <double*> malloc(sizeof(double) * L)
        arr = <double[:L]>arrptr
        free(arrptr)

    # Prevents dead code elimination
    # (reads freed memory; kept as in the original benchmark)
    str(arr[0])

@timefunc("cvarray memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr

    for i in range(loops):
        arr = cvarray((L,),sizeof(double),'d')

    # Prevents dead code elimination
    str(arr[0])

print()
print("ITERATING")
loops = 1000

@timefunc("cpython.array buffer")
def _(int L):
    cdef int i
    cdef array[double] arr = clone(array('d'), L, False)
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    # Prevents dead-code elimination
    str(d)

@timefunc("cpython.array memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr = clone(array('d'), L, False)
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    # Prevents dead-code elimination
    str(d)

@timefunc("cpython.array raw C type")
def _(int L):
    cdef int i
    cdef array arr = clone(array('d'), L, False)
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    # Prevents dead-code elimination
    str(d)

@timefunc("numpy.empty_like memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr = numpy.empty((L,), dtype='double')
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    # Prevents dead-code elimination
    str(d)

@timefunc("malloc")
def _(int L):
    cdef int i
    cdef double* arrptr = <double*> malloc(sizeof(double) * L)
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arrptr[i]

    free(arrptr)

    # Prevents dead-code elimination
    str(d)

@timefunc("malloc memoryview")
def _(int L):
    cdef int i
    cdef double* arrptr = <double*> malloc(sizeof(double) * L)
    cdef double[::1] arr = <double[:L]>arrptr
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    free(arrptr)

    # Prevents dead-code elimination
    str(d)

@timefunc("cvarray memoryview")
def _(int L):
    cdef int i
    cdef double[::1] arr = cvarray((L,),sizeof(double),'d')
    cdef double d

    for i in range(loops):
        for i in range(L):
            d = arr[i]

    # Prevents dead-code elimination
    str(d)

Output (each row lists times for L = 1, 10, 100, 1000, 10000, 100000, 1000000):

INITIALISATIONS
Running cpython.array buffer
0.100040 0.097140 0.133110 0.121820 0.131630 0.108420 0.112160 μs
Running cpython.array memoryview
0.339480 0.333240 0.378790 0.445720 0.449800 0.414280 0.414060 μs
Running cpython.array raw C type
0.048270 0.049250 0.069770 0.074140 0.076300 0.060980 0.060270 μs
Running numpy.empty_like memoryview
1.006200 1.012160 1.128540 1.212350 1.250270 1.235710 1.241050 μs
Running malloc
0.021850 0.022430 0.037240 0.046260 0.039570 0.043690 0.030720 μs
Running malloc memoryview
1.640200 1.648000 1.681310 1.769610 1.755540 1.804950 1.758150 μs
Running cvarray memoryview
1.332330 1.353910 1.358160 1.481150 1.517690 1.485600 1.490790 μs

ITERATING
Running cpython.array buffer
0.010000 0.027000 0.091000 0.669000 6.314000 64.389000 635.171000 μs
Running cpython.array memoryview
0.013000 0.015000 0.058000 0.354000 3.186000 33.062000 338.300000 μs
Running cpython.array raw C type
0.014000 0.146000 0.979000 9.501000 94.160000 916.073000 9287.079000 μs
Running numpy.empty_like memoryview
0.042000 0.020000 0.057000 0.352000 3.193000 34.474000 333.089000 μs
Running malloc
0.002000 0.004000 0.064000 0.367000 3.599000 32.712000 323.858000 μs
Running malloc memoryview
0.019000 0.032000 0.070000 0.356000 3.194000 32.100000 327.929000 μs
Running cvarray memoryview
0.014000 0.026000 0.063000 0.351000 3.209000 32.013000 327.890000 μs

(The reason for the "iterating" benchmark is that some methods have surprisingly different characteristics in this respect.)

In order of initialisation speed:

malloc: It is a harsh world, but it is fast. If you need to allocate a lot of things and want unhindered iteration and indexing performance, this has to be it. But normally one of the options below is a good bet.

cpython.array raw C type: Well damn, it is fast. And it is safe. Unfortunately, it goes through Python to access its data fields. You can avoid that with a wonderful trick: arr.data.as_doubles[i], which brings it up to the standard speed, at the cost of the safety checks. This makes it a wonderful replacement for malloc, being basically a reference-counted version of it.

cpython.array buffer: Coming in at only three to four times the setup time of malloc, this looks like a wonderful bet. Unfortunately, it has significant iteration overhead (albeit small compared to what the boundscheck and wraparound directives cost). That means it only really competes against the fully safe variants, but it is the fastest of those to initialise. Your choice.

cpython.array memoryview: This is an order of magnitude slower than malloc to initialise. That is a shame, but it iterates just as fast. This is the standard solution I would suggest, unless boundscheck or wraparound are on (in which case cpython.array buffer might be a more compelling trade-off).

The rest: The only one worth anything is numpy's, because of the many fun methods attached to its objects. That is it, though.
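For readers without a Cython toolchain at hand, the shape of the initialisation benchmark can be sketched in plain Python. This is an analogue under stated assumptions, not the answer's method: it uses the standard-library array module and memoryview rather than cpython.array.clone or Cython typed memoryviews, and the helper names (alloc_array, alloc_view, microseconds_per_call) are hypothetical. Its timings measure interpreter-level allocation and are not comparable with the figures above.

```python
import timeit
from array import array

ITEMSIZE = array('d').itemsize  # size of a C double on this platform


def alloc_array(L):
    """Zero-filled array.array of L doubles."""
    return array('d', bytes(ITEMSIZE * L))


def alloc_view(L):
    """Zero-filled buffer of L doubles exposed through a memoryview."""
    return memoryview(bytearray(ITEMSIZE * L)).cast('d')


def microseconds_per_call(func, L, loops=1000):
    """Average time of func(L) in microseconds over `loops` calls."""
    total = timeit.timeit(lambda: func(L), number=loops)
    return total / loops * 1e6


if __name__ == "__main__":
    for name, func in [("array", alloc_array), ("memoryview", alloc_view)]:
        times = [microseconds_per_call(func, L) for L in (1, 10, 100, 1000)]
        print(name, " ".join(format(t, ".3f") for t in times), "μs")
```

As in the answer's decorator, the total time is divided by the number of loops, so each figure is a per-allocation cost.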

慕容3067478

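As a follow-up to Veedrac's answer: be aware that using the memoryview support of cpython.array with Python 2.7 currently appears to cause memory leaks. This seems to be a long-standing issue, as it was mentioned on the cython-users mailing list in a post from November 2012. Running Veedrac's benchmark script with Cython version 0.22 under both Python 2.7.6 and Python 2.7.9 leads to a large memory leak when initialising a cpython.array through either the buffer or the memoryview interface. No memory leaks occur when the script is run with Python 3.4. I have reported this issue to the Cython developers mailing list.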
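One rough way to check an allocation routine for this kind of leak is to call it repeatedly and see whether traced memory keeps growing. The sketch below uses the standard-library tracemalloc module; the helper leaked_bytes and the two example functions are hypothetical. tracemalloc only sees allocations made through Python's own allocators, so it is an assumption that it would catch the leak described above; for a compiled extension, watching the process's resident memory is a useful cross-check.

```python
import gc
import tracemalloc
from array import array

_kept = []  # deliberately holds references to simulate a leak


def clean_alloc():
    """Allocate 1000 doubles and drop them again."""
    array('d', bytes(8000))


def leaky_alloc():
    """Allocate 1000 doubles and never release them."""
    _kept.append(array('d', bytes(8000)))


def leaked_bytes(func, rounds=50):
    """Growth in traced memory after calling func `rounds` times."""
    gc.collect()
    tracemalloc.start()
    try:
        func()  # warm-up call, so one-off caches are not counted
        gc.collect()
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(rounds):
            func()
        gc.collect()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    print("clean:", leaked_bytes(clean_alloc), "bytes")
    print("leaky:", leaked_bytes(leaky_alloc), "bytes")
```

A result that grows in proportion to the number of rounds suggests memory is being retained; a small, roughly constant figure does not.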