dot-kernel: Excute the openCL function

Description Usage Arguments Details Value Examples

Description

The function serves as a bridge between R and openCL, it sends the openCL code and R matrix object to the device and excutes it on the device. The function has an auto-type ability which can make the openCL code independent with the type of its function argument, see detail and examples for the usage.

Usage

1
2
3
4
5
6
7
8
.kernel(
  src = "",
  kernel,
  parms,
  .device = "auto",
  .globalThreadNum = "length(FirstArg)",
  .options = kernel.getOption()
)

Arguments

src

the source code, it can be either a file directory or the code

kernel

the kernel function that will be called on the device

parms

a list containing the function arguments. The number of elements in the list has to match the number of function arguments.

.device

the device that will excute the function. If not specified, all the selected devices will be used.

.globalThreadNum

the number of threads that will be created to excute the kernel. If not specified, the length of the first argument will be used as the thread number

.options

the kernel options

Details

The function .kernel() is the low level API to communicate with openCL device. It provides a way to run the customized code on the device, the source code should be openCL code and the kernel is the kernel function that you want to run on the device.

You can specify with device the code should be run on by specifying the .device argument. By default, if you do not specify any device, the first device in the device list will be used

The argument .globalThreadNum specifys the number of threads that will be used to excute the kernel. The concept is the same as 'global_work_size“ in openCL functions

There are multiple options that you can change in the kernel function. You can call the function kernel.getOption() to obtain the default setting. The most distinguishable feature in this package is probably the auto type function, which can set the type of the kernel arguments as an macro in the openCL code. This feature allows the user to create a type-free code. If the kernelOption$autoType in .options is true(Default), four macros will be defined, they are(X is the position of the function arguments):

autoX: The variable type

gAutoX: Short for global autoX

lAutoX: short for local autoX

autoX_v4: Define a vector of length 4 with the same variable type as the X th function argument

Please refer to the example for the usage

Value

A vector or a matrix

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#The GPU code
code='
kernel void matAdd(gAuto1* A,gAuto2* B,gAuto3* C,gAuto4* size){
uint col_id=get_global_id(0);
uint rowNum=*size;
for(uint i=0;i<rowNum;i++){
C[i+col_id*rowNum]=A[i+col_id*rowNum]+B[i+col_id*rowNum];
}
}
'
#Create data in R
m=100
n=200
A=matrix(runif(m*n),m,n)
B=matrix(runif(m*n),m,n)
#Send the data to GPU
A_dev=gpuMatrix(A,type='double')
B_dev=gpuMatrix(B,type='double')
#Create an empty data matrix in GPU
C_dev=gpuEmptMatrix(row=m,col=n,type='double')

#Get the default options
options=kernel.getOption()
#Run the GPU function with n threads, each thread computes one column addition
.kernel(src = code,kernel='matAdd',parms=list(A_dev,B_dev,C_dev,m),
.globalThreadNum = n,.options = options)

#This is just a patch to fix check error
if(!is.null(C_dev)){
#Retrieve the data
C_dev=download(C_dev)
C=as.matrix(C_dev)
#Check the error
range(C-A-B)
}

gpuMagic documentation built on Nov. 8, 2020, 5:15 p.m.