How do I choose grid and block dimensions for CUDA kernels?
const int n = 128 * 1024;
int blocksize = 512; // value usually chosen by tuning and hardware constraints
int nblocks = n / blocksize; // value determined by block size and total work
madd<<<nblocks,blocksize>>>(A,B,C,n);
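The snippet above only covers every element when n is an exact multiple of blocksize. Below is a minimal sketch of the usual workaround: round the block count up and guard the index inside the kernel. The body of madd, the helper ceil_div, and the use of managed memory are assumptions made for illustration and are not taken from the original post.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical element-wise add kernel standing in for madd.
__global__ void madd(const float *A, const float *B, float *C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: threads past the end of the data do nothing
        C[i] = A[i] + B[i];
    }
}

// Hypothetical helper: integer division rounded up.
inline int ceil_div(int a, int b) { return (a + b - 1) / b; }

int main()
{
    const int n = 128 * 1024 + 100;  // deliberately not a multiple of blocksize
    const int blocksize = 512;       // still a tuning parameter
    const int nblocks = ceil_div(n, blocksize);

    float *A, *B, *C;
    cudaMallocManaged(&A, n * sizeof(float));
    cudaMallocManaged(&B, n * sizeof(float));
    cudaMallocManaged(&C, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        A[i] = 1.0f;
        B[i] = 2.0f;
    }

    madd<<<nblocks, blocksize>>>(A, B, C, n);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("nblocks = %d, C[n-1] = %f\n", nblocks, C[n - 1]);

    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return 0;
}
```

With plain n / blocksize the last 100 elements here would never be processed. Rounding up launches one extra, partly idle block, and the i < n check keeps its spare threads from writing out of bounds. The value 512 remains something to tune per kernel and device.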
ibeautiful