H2Lib
3.0
Management routines for OpenCL computations via a task scheduler.
Data Structures | |
struct | _ocl |
Structure that contains basic OpenCL objects for arbitrary computations. | |
struct | _task |
Simple representation of a task. | |
struct | _taskgroup |
A collection of tasks for the same callback function. | |
Macros | |
#define | CL_CHECK(res) |
#define | SCHEDULE_OPENCL(cpu_threads, gpu_threads, func, ...) |
Macro that simplifies the use of the task scheduler. | |
Typedefs | |
typedef struct _task | task |
Abbreviation for a struct _task. | |
typedef task * | ptask |
Abbreviation for a pointer to a struct _task. | |
typedef struct _taskgroup | taskgroup |
Abbreviation for a struct _taskgroup. | |
typedef taskgroup * | ptaskgroup |
Abbreviation for a pointer to a struct _taskgroup. | |
Enumerations | |
enum | task_affinity { CPU_ONLY, GPU_ONLY, CPU_FIRST, GPU_FIRST } |
This enum specifies the execution affinity of a taskgroup. | |
Functions | |
void | get_opencl_devices (cl_device_id **devices, cl_uint *ndevices) |
Retrieve an array of available OpenCL devices. | |
void | set_opencl_devices (cl_device_id *devices, cl_uint ndevices, cl_uint queues_per_device) |
Set an array of OpenCL devices to be used for computations. | |
const char * | get_error_string (cl_int error) |
Returns a string corresponding to the error code error. | |
void | setup_kernels (const char *src_str, const uint num_kernels, const char **kernel_names, cl_kernel **kernels) |
Compiles all OpenCL kernels named in the array kernel_names from the source string src_str into the OpenCL kernels kernels. | |
ptask | new_ocltask (void *data, ptask next) |
Create a new task. | |
void | del_ocltask (ptaskgroup tg, ptask t) |
Delete a task. | |
ptaskgroup | new_ocltaskgroup (task_affinity affinity, ptaskgroup next, void(*merge_tasks)(ptaskgroup tg, void **data), void(*cleanup_merge)(void *data), void(*distribute_results)(ptaskgroup tg, void *data), void(*close_taskgroup)(ptaskgroup tg), void(*callback_cpu)(void *data), void(*callback_gpu)(void *data), size_t(*getsize_task)(void *data), void(*split_task)(void *data, void ***split, uint *n), void(*cleanup_task)(void *data), void *data) |
Create a new list of tasks with the same code to execute. | |
void | del_taskgroup (ptaskgroup tg) |
Delete a taskgroup object. | |
void | add_task_taskgroup (ptaskgroup *tg, void *data) |
Adds a new _task to a _taskgroup. | |
void | add_taskgroup_to_scheduler (ptaskgroup tg) |
Enqueues a full taskgroup to the execution queue of the scheduler. | |
void | start_scheduler (uint cpu_threads, uint gpu_threads) |
Starts the task scheduler with cpu_threads threads on the CPU and gpu_threads CPU threads that drive the available GPUs. | |
void | stop_scheduler () |
Stop the task scheduler. | |
Variables | |
struct _ocl | ocl_system |
Global variable that contains basic OpenCL objects for arbitrary computations. | |
#define CL_CHECK | ( | res | ) |
Simple macro that checks OpenCL error codes and prints them to stderr.
#define SCHEDULE_OPENCL | ( | cpu_threads, | |
gpu_threads, | |||
func, | |||
... | |||
) |
Macro that simplifies the use of the task scheduler.
cpu_threads | Number of CPU worker threads. This parameter will be passed directly to start_scheduler. |
gpu_threads | Number of CPU threads that drive the available GPUs. This parameter will be passed directly to start_scheduler. |
func | The function call that shall be handled by the task scheduler. All necessary parameters of func should be passed afterwards. |
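The parameter description suggests a wrapper that starts the scheduler, calls func with the remaining arguments, and stops the scheduler again. The following is only a sketch of that pattern under this assumption, not the actual definition of SCHEDULE_OPENCL. The names SKETCH_SCHEDULE, sketch_start_scheduler, sketch_stop_scheduler and sketch_assemble are hypothetical stand-ins, so the block compiles without H2Lib or an OpenCL runtime.

```c
#include <stdio.h>

/* Hypothetical stand-ins for start_scheduler / stop_scheduler. */
static int sketch_running = 0;      /* 1 while the mock scheduler is active */
static int sketch_calls = 0;        /* how often the wrapped function ran   */
static int sketch_seen_running = 0; /* scheduler state seen by the workload */

static void sketch_start_scheduler(unsigned cpu_threads, unsigned gpu_threads)
{
  printf("start: %u CPU threads, %u GPU threads\n", cpu_threads, gpu_threads);
  sketch_running = 1;
}

static void sketch_stop_scheduler(void)
{
  printf("stop\n");
  sketch_running = 0;
}

/* Assumed shape of the wrapper: start, call func with the remaining
 * arguments, stop. The real macro may differ. */
#define SKETCH_SCHEDULE(cpu_threads, gpu_threads, func, ...) \
  do {                                                        \
    sketch_start_scheduler((cpu_threads), (gpu_threads));     \
    func(__VA_ARGS__);                                        \
    sketch_stop_scheduler();                                  \
  } while (0)

/* Hypothetical workload; records whether the scheduler was active. */
static void sketch_assemble(int rows, int cols)
{
  sketch_seen_running = sketch_running;
  sketch_calls++;
  printf("assembling %d x %d\n", rows, cols);
}
```

A call such as `SKETCH_SCHEDULE(4, 1, sketch_assemble, 8, 8);` then runs the workload while the mock scheduler is active.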
enum task_affinity |
This enum specifies the execution affinity of a taskgroup.
Enumerator | |
---|---|
CPU_ONLY | A taskgroup should only be executed on the CPU. |
GPU_ONLY | A taskgroup should only be executed on the GPU. |
CPU_FIRST | A taskgroup is preferably executed on the CPU, but execution on the GPU might be possible. |
GPU_FIRST | A taskgroup is preferably executed on the GPU, but execution on the CPU might be possible. |
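To make the four affinities concrete, the following sketch shows one way a worker could decide whether it may take a taskgroup and whether it is the preferred kind of worker. The enum is a local, prefixed copy of the enumerators above. The functions sketch_may_run and sketch_is_preferred are hypothetical and are not part of the documented API. How the scheduler actually applies the affinity is not described here.

```c
#include <stdbool.h>

/* Local, prefixed copy of the documented enumerators so the sketch is
 * self-contained and cannot clash with the real header. */
typedef enum {
  SKETCH_CPU_ONLY,
  SKETCH_GPU_ONLY,
  SKETCH_CPU_FIRST,
  SKETCH_GPU_FIRST
} sketch_affinity;

/* May a worker of the given kind execute the taskgroup at all? */
static bool sketch_may_run(sketch_affinity a, bool gpu_worker)
{
  switch (a) {
  case SKETCH_CPU_ONLY:
    return !gpu_worker;
  case SKETCH_GPU_ONLY:
    return gpu_worker;
  case SKETCH_CPU_FIRST:
  case SKETCH_GPU_FIRST:
    return true; /* either kind is allowed, one is merely preferred */
  }
  return false;
}

/* Is the worker the kind that the affinity prefers? */
static bool sketch_is_preferred(sketch_affinity a, bool gpu_worker)
{
  bool prefers_gpu = (a == SKETCH_GPU_ONLY || a == SKETCH_GPU_FIRST);
  return prefers_gpu == gpu_worker;
}
```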
void add_task_taskgroup | ( | ptaskgroup * | tg, |
void * | data | ||
) |
Adds a new _task to a _taskgroup.
The task is enqueued into the task list of the _taskgroup object tg.
tg | The _taskgroup object into which the task is enqueued. |
data | The data to be enqueued. |
void add_taskgroup_to_scheduler | ( | ptaskgroup | tg | ) |
Enqueues a full taskgroup to the execution queue of the scheduler.
tg | _taskgroup to be enqueued. |
void del_ocltask | ( | ptaskgroup | tg, |
ptask | t | ||
) |
Delete a task.
void del_taskgroup | ( | ptaskgroup | tg | ) |
Delete a taskgroup object.
const char* get_error_string | ( | cl_int | error | ) |
Returns a string corresponding to the error code error.
error | An error code returned by some OpenCL runtime function. |
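As an illustration of how an error-string lookup and a checking macro in the spirit of CL_CHECK can work together, here is a minimal sketch. It uses plain int codes and repeats a handful of numeric values from the OpenCL headers so that it compiles without them. The names sketch_error_string, sketch_check and SKETCH_CHECK are hypothetical and do not reflect the library's implementation.

```c
#include <stdio.h>

/* A few OpenCL status codes with the numeric values from the OpenCL
 * headers, repeated under a prefix to keep the sketch self-contained. */
enum {
  SK_CL_SUCCESS = 0,
  SK_CL_DEVICE_NOT_FOUND = -1,
  SK_CL_DEVICE_NOT_AVAILABLE = -2,
  SK_CL_COMPILER_NOT_AVAILABLE = -3,
  SK_CL_OUT_OF_RESOURCES = -5,
  SK_CL_OUT_OF_HOST_MEMORY = -6,
  SK_CL_INVALID_VALUE = -30
};

/* Map an error code to a readable name (only a small subset is covered). */
static const char *sketch_error_string(int error)
{
  switch (error) {
  case SK_CL_SUCCESS:                return "CL_SUCCESS";
  case SK_CL_DEVICE_NOT_FOUND:       return "CL_DEVICE_NOT_FOUND";
  case SK_CL_DEVICE_NOT_AVAILABLE:   return "CL_DEVICE_NOT_AVAILABLE";
  case SK_CL_COMPILER_NOT_AVAILABLE: return "CL_COMPILER_NOT_AVAILABLE";
  case SK_CL_OUT_OF_RESOURCES:       return "CL_OUT_OF_RESOURCES";
  case SK_CL_OUT_OF_HOST_MEMORY:     return "CL_OUT_OF_HOST_MEMORY";
  case SK_CL_INVALID_VALUE:          return "CL_INVALID_VALUE";
  default:                           return "unknown OpenCL error";
  }
}

/* Report a failing code on stderr; returns 1 on success, 0 on error. */
static int sketch_check(int res, const char *file, int line)
{
  if (res != SK_CL_SUCCESS) {
    fprintf(stderr, "%s:%d: OpenCL error %d (%s)\n", file, line, res,
            sketch_error_string(res));
    return 0;
  }
  return 1;
}

/* Checking macro that records the call site of the failing expression. */
#define SKETCH_CHECK(res) sketch_check((res), __FILE__, __LINE__)
```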
void get_opencl_devices | ( | cl_device_id ** | devices, |
cl_uint * | ndevices | ||
) |
Retrieve an array of available OpenCL devices.
devices | Pointer to an array where the results should be stored. |
ndevices | Number of retrieved devices will be stored in this variable. |
ptask new_ocltask | ( | void * | data, |
ptask | next | ||
) |
Create a new task.
A new _task object will be created containing the input / output data of the task specified by data and a pointer to the next element in the task list given by next.
data | Data to be processed by the new task. |
next | Pointer to the next task in the list. |
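The description above amounts to prepending a node to a singly linked list. The following sketch illustrates that idea with a hypothetical struct sketch_task. The field names data and next are assumptions based on the parameter names and are not taken from the definition of struct _task.

```c
#include <stdlib.h>

/* Hypothetical task node: payload plus link to the next task. */
typedef struct sketch_task sketch_task;
struct sketch_task {
  void *data;        /* input / output data of the task */
  sketch_task *next; /* next task in the list, or NULL  */
};

/* Create a node in front of 'next'; returns NULL if allocation fails. */
static sketch_task *sketch_new_task(void *data, sketch_task *next)
{
  sketch_task *t = malloc(sizeof *t);
  if (t == NULL)
    return NULL;
  t->data = data;
  t->next = next;
  return t;
}

/* Count the nodes reachable from t. */
static size_t sketch_count_tasks(const sketch_task *t)
{
  size_t n = 0;
  for (; t != NULL; t = t->next)
    n++;
  return n;
}

/* Free all nodes; the payloads are owned by the caller in this sketch. */
static void sketch_del_tasks(sketch_task *t)
{
  while (t != NULL) {
    sketch_task *next = t->next;
    free(t);
    t = next;
  }
}
```

Because each new node points at the previous head, the most recently created task ends up first in the list.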
ptaskgroup new_ocltaskgroup | ( | task_affinity | affinity, |
ptaskgroup | next, | ||
void(*)(ptaskgroup tg, void **data) | merge_tasks, | ||
void(*)(void *data) | cleanup_merge, | ||
void(*)(ptaskgroup tg, void *data) | distribute_results, | ||
void(*)(ptaskgroup tg) | close_taskgroup, | ||
void(*)(void *data) | callback_cpu, | ||
void(*)(void *data) | callback_gpu, | ||
size_t(*)(void *data) | getsize_task, | ||
void(*)(void *data, void ***split, uint *n) | split_task, | ||
void(*)(void *data) | cleanup_task, | ||
void * | data | ||
) |
Create a new list of tasks with the same code to execute.
Create a new _taskgroup object and set the callback functions in a way that the correct piece of code will be executed on the data provided by the particular tasks.
affinity | Execution affinity of the taskgroup, see task_affinity. |
next | Pointer to the next taskgroup in the list. |
merge_tasks | Callback function that merges the data of the task list into a single data element for processing on the GPU. |
cleanup_merge | Callback function that is called after completion of merge_tasks to clean up its intermediate results. |
distribute_results | Callback function that distributes the results of callback_gpu corresponding to the task list. |
close_taskgroup | This callback function is responsible for safely closing a non-empty but not completely filled taskgroup. |
getsize_task | Calculate the size of the results. |
callback_cpu | Code to be executed on the CPU for the list of tasks. |
callback_gpu | Code to be executed on the GPU for the list of tasks. When callback_gpu is set, the callbacks merge_tasks and distribute_results have to be set correctly as well. |
getsize_task | Returns the size in bytes of a single task belonging to the current taskgroup. |
split_task | This callback function splits a single task into a number of smaller tasks that might fit better into a taskgroup than the original one. |
cleanup_task | Callback to free memory used by a task. |
data | Additional information for a taskgroup. |
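To illustrate the shape of the getsize_task, split_task and cleanup_task callbacks, the sketch below implements them for a hypothetical payload sketch_range that describes a half-open index range of double values. The payload type and the halving strategy are assumptions for illustration only. The sketch writes uint as unsigned int for portability, and it leaves open who owns the original task after a split, since the text above does not say.

```c
#include <stdlib.h>

/* Hypothetical task payload: half-open index range [begin, end) of doubles. */
typedef struct {
  size_t begin;
  size_t end;
} sketch_range;

/* getsize_task-style callback: size in bytes of one task. */
static size_t sketch_getsize(void *data)
{
  const sketch_range *r = data;
  return (r->end - r->begin) * sizeof(double);
}

/* split_task-style callback: cut the range into two halves.
 * On success *split points to a malloc'ed array of *n payload pointers;
 * on allocation failure *split is NULL and *n is 0. */
static void sketch_split(void *data, void ***split, unsigned int *n)
{
  const sketch_range *r = data;
  size_t mid = r->begin + (r->end - r->begin) / 2;
  sketch_range *lo = malloc(sizeof *lo);
  sketch_range *hi = malloc(sizeof *hi);
  void **parts = malloc(2 * sizeof *parts);

  if (lo == NULL || hi == NULL || parts == NULL) {
    free(lo);
    free(hi);
    free(parts);
    *split = NULL;
    *n = 0;
    return;
  }

  lo->begin = r->begin;
  lo->end = mid;
  hi->begin = mid;
  hi->end = r->end;
  parts[0] = lo;
  parts[1] = hi;
  *split = parts;
  *n = 2;
}

/* cleanup_task-style callback: release one payload created by sketch_split. */
static void sketch_cleanup(void *data)
{
  free(data);
}
```

Splitting preserves the total size: the sizes reported for the two halves add up to the size of the original range.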
void set_opencl_devices | ( | cl_device_id * | devices, |
cl_uint | ndevices, | ||
cl_uint | queues_per_device | ||
) |
Set an array of OpenCL devices to be used for computations.
devices | An array of OpenCL device IDs. Its length should match the number of devices returned by get_opencl_devices. Devices that should not be used have to be set to 0. |
ndevices | Number of entries in the array devices. |
queues_per_device | Determines the number of CPU threads that should drive a single GPU. |
void setup_kernels | ( | const char * | src_str, |
const uint | num_kernels, | ||
const char ** | kernel_names, | ||
cl_kernel ** | kernels | ||
) |
Compiles all OpenCL kernels named in the array kernel_names from the source string src_str into the OpenCL kernels kernels.
src_str | Source code string. |
num_kernels | Number of kernels that should be compiled given by kernel_names . |
kernel_names | Array of function names that should be compiled as OpenCL kernels. |
kernels | Resulting array of OpenCL kernels. |
void start_scheduler | ( | uint | cpu_threads, |
uint | gpu_threads | ||
) |
Starts the task scheduler with cpu_threads threads on the CPU and gpu_threads CPU threads that drive the available GPUs.
cpu_threads | Number of CPU threads for task executions. |
gpu_threads | Number of CPU threads that drive the available GPUs for task executions. |
void stop_scheduler | ( | ) |
Stop the task scheduler.