How do I choose grid and block dimensions for a CUDA kernel?
const int n = 128 * 1024;
int blocksize = 512; // usually chosen by tuning, within hardware limits
int nblocks = (n + blocksize - 1) / blocksize; // determined by block size and total work, rounded up
mAdd<<<nblocks, blocksize>>>(A, B, C, n);
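As a rough sketch of how the snippet above might look as a complete program: the kernel body, the `float` element type, the helper `blocksFor`, and the use of managed memory are all assumptions made for illustration, since the original only shows the launch. Instead of hard-coding 512, this version asks the runtime for a block size suggestion through `cudaOccupancyMaxPotentialBlockSize`, which is one possible starting point for tuning, not a guaranteed optimum.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not shown in the original): element-wise C = A + B.
__global__ void mAdd(const float *A, const float *B, float *C, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {            // guard: the last block may run past the end of the arrays
        C[i] = A[i] + B[i];
    }
}

// Hypothetical helper: number of blocks needed to cover n elements (ceiling division).
int blocksFor(int n, int blockSize)
{
    return (n + blockSize - 1) / blockSize;
}

int main()
{
    const int n = 128 * 1024;
    float *A = nullptr, *B = nullptr, *C = nullptr;

    // Managed memory keeps the sketch short; explicit cudaMemcpy would work as well.
    cudaMallocManaged(&A, n * sizeof(float));
    cudaMallocManaged(&B, n * sizeof(float));
    cudaMallocManaged(&C, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        A[i] = 1.0f;
        B[i] = 2.0f;
    }

    // Ask the runtime for a block size that maximizes theoretical occupancy
    // for this kernel. Treat the result as a starting point for tuning.
    int minGridSize = 0;
    int blockSize = 0;
    cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, mAdd, 0, 0);

    // The grid size then follows from the block size and the total work.
    int nblocks = blocksFor(n, blockSize);

    mAdd<<<nblocks, blockSize>>>(A, B, C, n);
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("blockSize = %d, nblocks = %d, C[0] = %f\n", blockSize, nblocks, C[0]);

    cudaFree(A);
    cudaFree(B);
    cudaFree(C);
    return 0;
}
```

The bounds check inside the kernel is what makes the rounded-up grid safe: when n is not a multiple of the block size, the extra threads in the last block simply do nothing.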
慕仙森
ibeautiful