Improve ROCm support. (#780)

- updates to support ROCm 5.2
- workarounds in tests where NV tools were used unconditionally
- implemented `get_num_blocks()` and `add_memfence()` for AMD GPU
- backported from history some atomics
- added bf16 support
- minor warnings cleanup
- added dockerfile to run on a ROCm enabled machine

Co-authored-by: B1tway <andrew.shukshov@gmail.com>
Co-authored-by: Andrey Shukshov <36711069+B1tway@users.noreply.github.com>
@@ -47,6 +47,8 @@ ir::type *computation_type(ir::type* a_ty, ir::type* b_ty){
   // converted to half
   if(a_ty->is_fp16_ty() || b_ty->is_fp16_ty())
     return type::get_fp16_ty(ctx);
+  if(a_ty->is_bf16_ty() || b_ty->is_bf16_ty())
+    return type::get_bf16_ty(ctx);
   if(!a_ty->is_integer_ty() || !b_ty->is_integer_ty())
     throw_unreachable("augment_types");
   // 4 ) both operands are integer and undergo