HIP: Heterogenous-computing Interface for Portability
HIP RTC Programming Guide

HIP RTC lib

HIP allows you to compile kernels at runtime with its hiprtc* APIs. Kernels can be store as a text string and can be passed on to hiprtc APIs alongside options to guide the compilation.

NOTE:

  • This library can be used on systems without HIP install nor AMD GPU driver installed at all (offline compilation). Therefore it does not depend on any HIP runtime library.
  • But it does depend on COMGr. We may try to statically link COMGr into hipRTC to avoid any ambiguity.
  • Developers can decide to bundle this library with their application.

Example

To use hiprtc functionality, hiprtc header needs to be included first.

{#include}
Kernels can be stored in a string:
```cpp
static constexpr auto kernel {
R"(
extern "C"
__global__ void gpu_kernel(...) {
// Kernel Functionality
}
)"};

Now to compile this kernel, it needs to be associated with hiprtcProgram type, which is done via declaring hiprtcProgram prog; and associating the string of kernel with this program:

hiprtcCreateProgram(&prog, // hiprtc program
kernel, // kernel string
"gpu_kernel.cu", // Name of the file
num_headers, // Number of headers
&header_sources[0], // Header sources
&header_names[0]); // Name of header files
hiprtcResult hiprtcCreateProgram(hiprtcProgram *prog, const char *src, const char *name, int numHeaders, const char **headers, const char **includeNames)
Creates an instance of hiprtcProgram with the given input parameters, and sets the output hiprtcProgr...

hiprtcCreateProgram API also allows you to add headers which can be included in your rtc program. For online compilation, the compiler pre-defines HIP device API functions, HIP specific types and macros for device compilation, but does not include standard C/C++ headers by default. Users can only include header files provided to hiprtcCreateProgram.

After associating the kernel string with hiprtcProgram, you can now compile this program using:

hiprtcCompileProgram(prog, // hiprtcProgram
0, // Number of options
options); // Clang Options [Supported Clang Options](clang_options.md)
hiprtcResult hiprtcCompileProgram(hiprtcProgram prog, int numOptions, const char **options)
Compiles the given runtime compilation program.

hiprtcCompileProgram returns a status value which can be converted to string via hiprtcGetErrorString. If compilation is successful, hiprtcCompileProgram will return HIPRTC_SUCCESS.

If the compilation fails, you can look up the logs via:

size_t logSize;
hiprtcGetProgramLogSize(prog, &logSize);
if (logSize) {
string log(logSize, '\0');
hiprtcGetProgramLog(prog, &log[0]);
// Corrective action with logs
}
hiprtcResult hiprtcGetProgramLogSize(hiprtcProgram prog, size_t *logSizeRet)
Gets the size of log generated by the runtime compilation program instance.
hiprtcResult hiprtcGetProgramLog(hiprtcProgram prog, char *log)
Gets the log generated by the runtime compilation program instance.

If the compilation is successful, you can load the compiled binary in a local variable.

size_t codeSize;
hiprtcGetCodeSize(prog, &codeSize);
vector<char> kernel_binary(codeSize);
hiprtcGetCode(kernel_binary, code.data());
hiprtcResult hiprtcGetCodeSize(hiprtcProgram prog, size_t *codeSizeRet)
Gets the size of compilation binary by the runtime compilation program instance.
hiprtcResult hiprtcGetCode(hiprtcProgram prog, char *code)
Gets the pointer of compilation binary by the runtime compilation program instance.

After loading the binary, hiprtcProgram can be destroyed.

{hiprtcDestroyProgram(&prog);```}
The binary present in ```kernel_binary``` can now be loaded via ```hipModuleLoadData``` API.
```cpp
hipModule_t module;
hipFunction_t kernel;
hipModuleLoadData(&module, kernel_binary.data());
hipModuleGetFunction(&kernel, module, "gpu_kernel");

And now this kernel can be launched via hipModule APIs.

Please have a look at saxpy.cpp and hiprtcGetLoweredName.cpp files for a detailed example.

HIPRTC specific options

HIPRTC provides a few hiprtc specific flags

  • --gpu-architecture : This flag can guide the code object generation for a specific gpu arch. Example: --gpu-architecture=gfx906:sramecc+:xnack-, its equivalent to --offload-arch.
    • This option is compulsory if compilation is done on a system without AMD GPUs supported by HIP runtime.
    • Otherwise, hipRTC will load the hip runtime and gather the current device and its architecture info and use it as option.

Deprecation notice

Currently HIPRTC APIs are separated from HIP APIs and HIPRTC is available as a separate library libhiprtc.so/libhiprtc.dll. But hiprtc symbols are present in libhipamd64.so/libhipamd64.dll in order to support the existing applications. Gradually, these symbols will be removed from HIP library and applications using HIPRTC will be required to explictly link to HIPRTC library.