HIP: Heterogeneous-computing Interface for Portability
This section describes the device management functions of the HIP runtime API.
Functions
hipError_t | hipDeviceSynchronize (void) |
Waits on all active streams on current device.
hipError_t | hipDeviceReset (void) |
The state of current device is discarded and updated to a fresh state.
hipError_t | hipSetDevice (int deviceId) |
Set default device to be used for subsequent hip API calls from this thread.
hipError_t | hipGetDevice (int *deviceId) |
Return the default device id for the calling host thread.
hipError_t | hipGetDeviceCount (int *count) |
Return number of compute-capable devices.
hipError_t | hipDeviceGetAttribute (int *pi, hipDeviceAttribute_t attr, int deviceId) |
Query for a specific device attribute.
hipError_t | hipGetDeviceProperties (hipDeviceProp_t *prop, int deviceId) |
Returns device properties.
hipError_t | hipDeviceSetCacheConfig (hipFuncCache_t cacheConfig) |
Set L1/Shared cache partition.
hipError_t | hipDeviceGetCacheConfig (hipFuncCache_t *cacheConfig) |
Get the cache configuration for the current device.
hipError_t | hipDeviceGetLimit (size_t *pValue, enum hipLimit_t limit) |
Get Resource limits of current device.
hipError_t | hipDeviceGetSharedMemConfig (hipSharedMemConfig *pConfig) |
Returns bank width of shared memory for current device.
hipError_t | hipGetDeviceFlags (unsigned int *flags) |
Gets the flags set for current device.
hipError_t | hipDeviceSetSharedMemConfig (hipSharedMemConfig config) |
The bank width of shared memory on current device is set.
hipError_t | hipSetDeviceFlags (unsigned flags) |
The current device behavior is changed according to the flags passed.
hipError_t | hipChooseDevice (int *device, const hipDeviceProp_t *prop) |
Device which matches hipDeviceProp_t is returned.
hipError_t | hipExtGetLinkTypeAndHopCount (int device1, int device2, uint32_t *linktype, uint32_t *hopcount) |
Returns the link type and hop count between two devices.
hipError_t | hipIpcGetMemHandle (hipIpcMemHandle_t *handle, void *devPtr) |
Gets an interprocess memory handle for an existing device memory allocation.
hipError_t | hipIpcOpenMemHandle (void **devPtr, hipIpcMemHandle_t handle, unsigned int flags) |
Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.
hipError_t | hipIpcCloseMemHandle (void *devPtr) |
Close memory mapped with hipIpcOpenMemHandle.
hipError_t | hipIpcGetEventHandle (hipIpcEventHandle_t *handle, hipEvent_t event) |
hipError_t | hipIpcOpenEventHandle (hipEvent_t *event, hipIpcEventHandle_t handle) |
hipError_t hipChooseDevice | ( | int * | device, |
const hipDeviceProp_t * | prop | ||
) |
Returns the device that matches the given hipDeviceProp_t.
[out] | device | ID of the matching device |
[in] | prop | pointer to the device properties to match |
hipError_t hipDeviceGetAttribute | ( | int * | pi, |
hipDeviceAttribute_t | attr, | ||
int | deviceId | ||
) |
Query for a specific device attribute.
[out] | pi | pointer to value to return |
[in] | attr | attribute to query |
[in] | deviceId | which device to query for information |
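A minimal sketch of a single attribute query, assuming the attribute of interest is hipDeviceAttributeMultiprocessorCount and the device is the calling thread's default device. The helper name multiprocessorCountOfCurrentDevice is hypothetical. The `#else` branch holds stand-ins that are not part of HIP; they exist only so the sketch compiles on a machine without the HIP headers.

```cpp
#include <cstdio>

#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#else
// ASSUMPTION: stand-ins, NOT part of HIP. They let the sketch compile without
// ROCm and model a single device (id 0) that reports 64 multiprocessors.
enum hipError_t { hipSuccess = 0, hipErrorInvalidValue = 1 };
enum hipDeviceAttribute_t { hipDeviceAttributeMultiprocessorCount };
inline hipError_t hipGetDevice(int* deviceId) { *deviceId = 0; return hipSuccess; }
inline hipError_t hipDeviceGetAttribute(int* pi, hipDeviceAttribute_t, int deviceId) {
    if (pi == nullptr || deviceId != 0) return hipErrorInvalidValue;
    *pi = 64;
    return hipSuccess;
}
#endif

// Hypothetical helper: returns the multiprocessor count of the calling
// thread's default device, or -1 if either HIP call fails.
int multiprocessorCountOfCurrentDevice() {
    int deviceId = 0;
    if (hipGetDevice(&deviceId) != hipSuccess) return -1;

    int count = 0;
    if (hipDeviceGetAttribute(&count, hipDeviceAttributeMultiprocessorCount,
                              deviceId) != hipSuccess) {
        return -1;
    }
    return count;
}
```

Querying one attribute this way avoids filling a whole hipDeviceProp_t when only a single value is needed.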
hipError_t hipDeviceGetCacheConfig | ( | hipFuncCache_t * | cacheConfig | ) |
Get the cache configuration for the current device.
[out] | cacheConfig |
hipError_t hipDeviceGetLimit | ( | size_t * | pValue, |
enum hipLimit_t | limit | ||
) |
Get Resource limits of current device.
[out] | pValue | |
[in] | limit |
hipError_t hipDeviceGetSharedMemConfig | ( | hipSharedMemConfig * | pConfig | ) |
Returns bank width of shared memory for current device.
[out] | pConfig |
Note: AMD devices and some NVIDIA GPUs do not support shared cache banking, and the hint is ignored on those architectures.
hipError_t hipDeviceReset | ( | void | ) |
The state of the current device is discarded and reset to a fresh state.
Calling this function destroys all streams and events created on the current device, frees all memory allocated on it, and discards any kernels still running. Make sure that no other thread is using the device or any streams, memory, kernels, or events associated with it.
hipError_t hipDeviceSetCacheConfig | ( | hipFuncCache_t | cacheConfig | ) |
Set L1/Shared cache partition.
[in] | cacheConfig |
hipError_t hipDeviceSetSharedMemConfig | ( | hipSharedMemConfig | config | ) |
Sets the bank width of shared memory on the current device.
[in] | config |
Note: AMD devices and some NVIDIA GPUs do not support shared cache banking, and the hint is ignored on those architectures.
hipError_t hipDeviceSynchronize | ( | void | ) |
Waits on all active streams on the current device.
When this function is invoked, the host thread blocks until all commands in all streams associated with the current device have completed. HIP does not yet support multiple blocking modes.
hipError_t hipExtGetLinkTypeAndHopCount | ( | int | device1, |
int | device2, | ||
uint32_t * | linktype, | ||
uint32_t * | hopcount | ||
) |
Returns the link type and hop count between two devices.
[in] | device1 | Ordinal for device1 |
[in] | device2 | Ordinal for device2 |
[out] | linktype | Returns the link type (See hsa_amd_link_info_type_t) between the two devices |
[out] | hopcount | Returns the hop count between the two devices |
Queries and returns the HSA link type and the hop count between the two specified devices.
hipError_t hipGetDevice | ( | int * | deviceId | ) |
Return the default device id for the calling host thread.
[out] | deviceId | *deviceId is written with the default device |
HIP maintains a default device for each thread using thread-local storage. This device is used implicitly for HIP runtime APIs called by this thread. hipGetDevice returns in *deviceId the default device for the calling host thread.
hipError_t hipGetDeviceCount | ( | int * | count | ) |
Return number of compute-capable devices.
[out] | count | Returns number of compute-capable devices. |
Returns in *count the number of devices that have the ability to run compute commands. If there are no such devices, hipGetDeviceCount returns hipErrorNoDevice. If one or more devices are found, hipGetDeviceCount returns hipSuccess.
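The return-code behavior described above can be wrapped as follows. This is a sketch only: usableDeviceCount and isValidDeviceId are hypothetical helper names, and mapping hipErrorNoDevice to 0 and any other error to -1 is a design choice made for illustration. The `#else` branch is a stand-in, not part of HIP, so the sketch compiles without the HIP headers.

```cpp
#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#else
// ASSUMPTION: stand-ins, NOT part of HIP. They model a system with two devices.
enum hipError_t { hipSuccess = 0, hipErrorNoDevice = 100 };
inline hipError_t hipGetDeviceCount(int* count) { *count = 2; return hipSuccess; }
#endif

// Hypothetical helper: number of compute-capable devices.
// hipErrorNoDevice is reported as 0; any other failure as -1.
int usableDeviceCount() {
    int count = 0;
    hipError_t err = hipGetDeviceCount(&count);
    if (err == hipSuccess) return count;
    if (err == hipErrorNoDevice) return 0;
    return -1;
}

// Hypothetical helper: valid device ids are 0 ... (count - 1).
bool isValidDeviceId(int deviceId) {
    return deviceId >= 0 && deviceId < usableDeviceCount();
}
```

Treating hipErrorNoDevice as an empty result lets callers write a plain `for` loop over devices without a separate error branch for CPU-only machines.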
hipError_t hipGetDeviceFlags | ( | unsigned int * | flags | ) |
Gets the flags set for current device.
[out] | flags |
hipError_t hipGetDeviceProperties | ( | hipDeviceProp_t * | prop, |
int | deviceId | ||
) |
Returns device properties.
[out] | prop | written with device properties |
[in] | deviceId | which device to query for information |
Populates *prop with information for the specified device.
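A short sketch that combines hipGetDeviceCount with hipGetDeviceProperties to list every device. The helper name printDeviceSummaries is hypothetical, and the three fields printed (name, totalGlobalMem, multiProcessorCount) are only a small subset of hipDeviceProp_t. The `#else` branch contains stand-ins that are not part of HIP; they only keep the sketch compilable without the HIP headers.

```cpp
#include <cstddef>
#include <cstdio>

#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#else
// ASSUMPTION: stand-ins, NOT part of HIP. They model one device and only the
// three hipDeviceProp_t fields used below.
enum hipError_t { hipSuccess = 0, hipErrorInvalidDevice = 101 };
struct hipDeviceProp_t {
    char name[256];
    std::size_t totalGlobalMem;
    int multiProcessorCount;
};
inline hipError_t hipGetDeviceCount(int* count) { *count = 1; return hipSuccess; }
inline hipError_t hipGetDeviceProperties(hipDeviceProp_t* prop, int deviceId) {
    if (prop == nullptr || deviceId != 0) return hipErrorInvalidDevice;
    std::snprintf(prop->name, sizeof prop->name, "stand-in device");
    prop->totalGlobalMem = std::size_t{8} << 30;
    prop->multiProcessorCount = 64;
    return hipSuccess;
}
#endif

// Hypothetical helper: prints one line per device and returns how many
// devices were described successfully.
int printDeviceSummaries() {
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess) return 0;

    int printed = 0;
    for (int id = 0; id < count; ++id) {
        hipDeviceProp_t prop;
        if (hipGetDeviceProperties(&prop, id) != hipSuccess) continue;
        std::printf("device %d: %s, %zu MiB global memory, %d multiprocessors\n",
                    id, prop.name,
                    static_cast<std::size_t>(prop.totalGlobalMem >> 20),
                    prop.multiProcessorCount);
        ++printed;
    }
    return printed;
}
```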
hipError_t hipIpcCloseMemHandle | ( | void * | devPtr | ) |
Close memory mapped with hipIpcOpenMemHandle.
Unmaps memory returned by hipIpcOpenMemHandle. The original allocation in the exporting process as well as imported mappings in other processes will be unaffected.
Any resources used to enable peer access will be freed if this is the last mapping using them.
devPtr | - Device pointer returned by hipIpcOpenMemHandle |
hipError_t hipIpcGetMemHandle | ( | hipIpcMemHandle_t * | handle, |
void * | devPtr | ||
) |
Gets an interprocess memory handle for an existing device memory allocation.
Takes a pointer to the base of an existing device memory allocation created with hipMalloc and exports it for use in another process. This is a lightweight operation and may be called multiple times on an allocation without adverse effects.
If a region of memory is freed with hipFree and a subsequent call to hipMalloc returns memory with the same device address, hipIpcGetMemHandle will return a unique handle for the new memory.
handle | - Pointer to user allocated hipIpcMemHandle to return the handle in. |
devPtr | - Base pointer to previously allocated device memory |
hipError_t hipIpcOpenMemHandle | ( | void ** | devPtr, |
hipIpcMemHandle_t | handle, | ||
unsigned int | flags | ||
) |
Opens an interprocess memory handle exported from another process and returns a device pointer usable in the local process.
Maps memory exported from another process with hipIpcGetMemHandle into the current device address space. For contexts on different devices hipIpcOpenMemHandle can attempt to enable peer access between the devices as if the user called hipDeviceEnablePeerAccess. This behavior is controlled by the hipIpcMemLazyEnablePeerAccess flag. hipDeviceCanAccessPeer can determine if a mapping is possible.
Contexts that may open hipIpcMemHandles are restricted in the following way. hipIpcMemHandles from each device in a given process may only be opened by one context per device per other process.
Memory returned from hipIpcOpenMemHandle must be freed with hipIpcCloseMemHandle.
Calling hipFree on an exported memory region before calling hipIpcCloseMemHandle in the importing context will result in undefined behavior.
devPtr | - Returned device pointer |
handle | - hipIpcMemHandle to open |
flags | - Flags for this operation. Must be specified as hipIpcMemLazyEnablePeerAccess |
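The export and import steps can be sketched as two helpers, one for each process. This is an illustration under stated assumptions: the names HandleBytes, exportAllocation, importAllocation and releaseImport are hypothetical, the handle is treated as an opaque block of bytes, and the transport that carries those bytes between processes (pipe, socket, shared memory) is left to the application. The `#else` branch holds stand-ins that are not part of HIP and do not perform any real mapping.

```cpp
#include <array>
#include <cstring>

#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#else
// ASSUMPTION: stand-ins, NOT part of HIP. The fake handle simply stores the
// pointer value, so nothing is actually shared between processes.
enum hipError_t { hipSuccess = 0, hipErrorInvalidValue = 1 };
struct hipIpcMemHandle_t { char reserved[64]; };
const unsigned int hipIpcMemLazyEnablePeerAccess = 1;
inline hipError_t hipIpcGetMemHandle(hipIpcMemHandle_t* handle, void* devPtr) {
    if (handle == nullptr || devPtr == nullptr) return hipErrorInvalidValue;
    std::memset(handle, 0, sizeof *handle);
    std::memcpy(handle->reserved, &devPtr, sizeof devPtr);
    return hipSuccess;
}
inline hipError_t hipIpcOpenMemHandle(void** devPtr, hipIpcMemHandle_t handle,
                                      unsigned int flags) {
    if (devPtr == nullptr || flags != hipIpcMemLazyEnablePeerAccess) return hipErrorInvalidValue;
    std::memcpy(devPtr, handle.reserved, sizeof *devPtr);
    return *devPtr != nullptr ? hipSuccess : hipErrorInvalidValue;
}
inline hipError_t hipIpcCloseMemHandle(void* devPtr) {
    return devPtr != nullptr ? hipSuccess : hipErrorInvalidValue;
}
#endif

// Raw bytes of a handle, suitable for sending over any byte-oriented channel.
using HandleBytes = std::array<unsigned char, sizeof(hipIpcMemHandle_t)>;

// Exporting process: devPtr must be the base of a hipMalloc allocation.
bool exportAllocation(void* devPtr, HandleBytes& out) {
    hipIpcMemHandle_t handle;
    if (hipIpcGetMemHandle(&handle, devPtr) != hipSuccess) return false;
    std::memcpy(out.data(), &handle, sizeof handle);
    return true;
}

// Importing process: returns a device pointer valid in this process,
// or nullptr on failure.
void* importAllocation(const HandleBytes& in) {
    hipIpcMemHandle_t handle;
    std::memcpy(&handle, in.data(), sizeof handle);
    void* devPtr = nullptr;
    if (hipIpcOpenMemHandle(&devPtr, handle, hipIpcMemLazyEnablePeerAccess) != hipSuccess) {
        return nullptr;
    }
    return devPtr;
}

// Importing process: unmap with hipIpcCloseMemHandle, never with hipFree.
bool releaseImport(void* devPtr) {
    return hipIpcCloseMemHandle(devPtr) == hipSuccess;
}
```

In a real application the importer should call releaseImport before the exporter calls hipFree on the allocation, in line with the ordering requirement stated above.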
Note: No guarantees are made about the address returned in *devPtr. In particular, multiple processes may not receive the same address for the same handle.
hipError_t hipSetDevice | ( | int | deviceId | ) |
Set default device to be used for subsequent hip API calls from this thread.
[in] | deviceId | Valid device in range 0...(hipGetDeviceCount()-1). |
Sets deviceId as the default device for the calling host thread. Valid device IDs are 0...(hipGetDeviceCount()-1).
Many HIP APIs implicitly use the "default device".
This function may be called from any host thread. Multiple host threads may use the same device. This function does no synchronization with the previous or new device, and has very little runtime overhead. Applications can use hipSetDevice to quickly switch the default device before making a HIP runtime call which uses the default device.
The default device is stored in thread-local storage for each thread. Thread-pool implementations may inherit the default device of the previous thread. A good practice is to always call hipSetDevice at the start of a HIP coding sequence to establish a known default device.
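One way to follow this practice in pooled threads is a small scope guard that sets the default device on entry and restores the previous one on exit. This is a sketch, not part of HIP: the class name ScopedDevice is hypothetical, and restoring the earlier device is a design choice for illustration. The `#else` branch holds stand-ins that are not part of HIP and model two devices with a thread-local default.

```cpp
#if __has_include(<hip/hip_runtime.h>)
#include <hip/hip_runtime.h>
#else
// ASSUMPTION: stand-ins, NOT part of HIP. They model two devices and keep the
// default device in thread-local storage, as the text above describes.
enum hipError_t { hipSuccess = 0, hipErrorInvalidDevice = 101 };
inline int& standInDefaultDevice() { thread_local int id = 0; return id; }
inline hipError_t hipGetDevice(int* deviceId) {
    *deviceId = standInDefaultDevice();
    return hipSuccess;
}
inline hipError_t hipSetDevice(int deviceId) {
    if (deviceId < 0 || deviceId >= 2) return hipErrorInvalidDevice;
    standInDefaultDevice() = deviceId;
    return hipSuccess;
}
#endif

// Hypothetical guard: establishes a known default device for the current
// scope and restores the thread's previous default device afterwards.
class ScopedDevice {
public:
    explicit ScopedDevice(int deviceId) {
        if (hipGetDevice(&previous_) != hipSuccess) previous_ = -1;
        ok_ = (hipSetDevice(deviceId) == hipSuccess);
    }
    ~ScopedDevice() {
        // Restore only if the switch succeeded and the old device is known.
        if (ok_ && previous_ >= 0) (void)hipSetDevice(previous_);
    }
    ScopedDevice(const ScopedDevice&) = delete;
    ScopedDevice& operator=(const ScopedDevice&) = delete;

    // True if hipSetDevice accepted the requested device.
    bool ok() const { return ok_; }

private:
    int previous_ = -1;
    bool ok_ = false;
};
```

A pooled task would typically begin with `ScopedDevice guard(deviceId);` and check `guard.ok()` before issuing any HIP call that relies on the default device.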
hipError_t hipSetDeviceFlags | ( | unsigned | flags | ) |
The current device behavior is changed according to the flags passed.
[in] | flags | The schedule flags impact how HIP waits for the completion of a command running on a device. |
hipDeviceScheduleSpin : The HIP runtime will actively spin in the thread which submitted the work until the command completes. This offers the lowest latency, but will consume a CPU core and may increase power.
hipDeviceScheduleYield : The HIP runtime will yield the CPU to the system so that other tasks can use it. This may increase latency to detect the completion but will consume less power and is friendlier to other tasks in the system.
hipDeviceScheduleBlockingSync : On the ROCm platform, this is a synonym for hipDeviceScheduleYield.
hipDeviceScheduleAuto : Use a heuristic to select between Spin and Yield modes. If the number of HIP contexts is greater than the number of logical processors in the system, use Spin scheduling. Else use Yield scheduling.
hipDeviceMapHost : Allow mapping host memory. On ROCm, this is always allowed and the flag is ignored.
hipDeviceLmemResizeToMax :