PyCUDA: syntax for a function that calls a function
When a function in a SourceModule depends on another function in the same SourceModule, how do I pass it in the function call, i.e. what replaces "???" in the following code?
    import numpy
    import pycuda.autoinit
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    mod = SourceModule("""
    __global__ void make_square(float *in_array, float *out_array)
    {
        int i;
        int n = 5;
        for (i=0; i<n; i++) {
            out_array[i] = pow(in_array[i],2);
        }
    }

    __global__ void make_square_add_one(float *in_array, float *out_array, void make_square(float *, float *))
    {
        int n = 5;
        make_square(in_array,out_array);
        for (int i=0; i<n; i++)
            out_array[i] = out_array[i] + 1;
    }
    """)

    make_square = mod.get_function("make_square")
    make_square_add_one = mod.get_function("make_square_add_one")

    in_array = numpy.array([1.,2.,3.,4.,5.]).astype(numpy.float32)
    out_array = numpy.zeros_like(in_array).astype(numpy.float32)

    make_square_add_one(drv.In(in_array), drv.Out(out_array), ??? , block = (1,1,1), grid = (1,1))
Thanks for any information.
In the traditional CUDA execution model, __global__ functions are kernels, and they can't be passed as arguments to other kernels or called by other kernels. It looks like make_square should be a device function, like:
    __device__ void make_square(float *in_array, float *out_array)
    {
        int i;
        for (i=0; i<5; i++) {
            out_array[i] = pow(in_array[i],2);
        }
    }
which is then called by the running kernel as:
    __global__ void make_square_add_one(float *in_array, float *out_array)
    {
        int n = 5;
        make_square(in_array,out_array);
        for (int i=0; i<n; i++)
            out_array[i] = out_array[i] + 1;
    }
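Putting the two pieces together, a minimal sketch of the whole script might look like the following. It assumes a working PyCUDA install and a CUDA-capable GPU. The names KERNEL_SOURCE, expected_result and run_on_gpu are hypothetical helpers introduced here for illustration and are not part of the question or of PyCUDA. The point to notice is that the "???" argument simply disappears: the kernel takes only the two arrays, and the device function is resolved at compile time inside the module.

```python
# CUDA source: make_square is a __device__ helper, so the kernel can call it
# directly. No function argument is passed from the Python side.
KERNEL_SOURCE = """
__device__ void make_square(float *in_array, float *out_array)
{
    for (int i = 0; i < 5; i++) {
        out_array[i] = pow(in_array[i], 2);
    }
}

__global__ void make_square_add_one(float *in_array, float *out_array)
{
    int n = 5;
    make_square(in_array, out_array);
    for (int i = 0; i < n; i++)
        out_array[i] = out_array[i] + 1;
}
"""


def expected_result(values):
    """CPU reference (hypothetical helper): x**2 + 1 for each input value."""
    return [x * x + 1.0 for x in values]


def run_on_gpu(values):
    """Compile the module and launch the kernel on exactly five floats."""
    # Imported lazily so the rest of this sketch can be used on a machine
    # without NumPy, PyCUDA, or a GPU.
    import numpy
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    in_array = numpy.asarray(values, dtype=numpy.float32)
    if in_array.size != 5:
        raise ValueError("the kernel above is hard-coded for 5 elements")
    out_array = numpy.zeros_like(in_array)

    mod = SourceModule(KERNEL_SOURCE)
    kernel = mod.get_function("make_square_add_one")
    # Only the two arrays are passed; a single thread does all the work.
    kernel(drv.In(in_array), drv.Out(out_array), block=(1, 1, 1), grid=(1, 1))
    return out_array
```

On a machine with CUDA available, run_on_gpu([1., 2., 3., 4., 5.]) should agree with expected_result for the same input.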
It is worth noting that this kernel is entirely serial, which is pretty orthogonal to how CUDA kernels are expected to be written.
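For contrast, here is a rough sketch of how the same computation might be written in the usual data-parallel style, with one thread per array element instead of a loop inside a single thread. This is an illustration under assumptions, not something taken from the question: the names PARALLEL_SOURCE, square, square_add_one, blocks_needed and square_add_one_on_gpu are hypothetical, the extra length parameter n is added here, and a block size of 128 threads is an arbitrary choice.

```python
# Each thread handles one element; the device function now works on a scalar.
PARALLEL_SOURCE = """
__device__ float square(float x)
{
    return x * x;
}

__global__ void square_add_one(const float *in_array, float *out_array, int n)
{
    // Global index of this thread across all blocks.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // Guard: the last block may contain threads past the end of the array.
    if (i < n) {
        out_array[i] = square(in_array[i]) + 1.0f;
    }
}
"""


def blocks_needed(n, threads_per_block):
    """Number of blocks required to cover n elements (ceiling division)."""
    return (n + threads_per_block - 1) // threads_per_block


def square_add_one_on_gpu(values, threads_per_block=128):
    """Launch one thread per element; assumes at least one input value."""
    # Imported lazily so blocks_needed can be used without PyCUDA or a GPU.
    import numpy
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    in_array = numpy.asarray(values, dtype=numpy.float32)
    out_array = numpy.zeros_like(in_array)

    kernel = SourceModule(PARALLEL_SOURCE).get_function("square_add_one")
    kernel(drv.In(in_array), drv.Out(out_array), numpy.int32(in_array.size),
           block=(threads_per_block, 1, 1),
           grid=(blocks_needed(in_array.size, threads_per_block), 1))
    return out_array
```

For five elements this launches a single block in which only the first five threads do any work; the benefit of this layout only shows up for much larger arrays.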