[GENERAL] Merged v1.0alpha into master. Added features are:
- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without padding
- Complete overhaul of the LLVM code generation in codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
@@ -68,9 +68,10 @@ unsigned user::get_num_hidden() const {
 value::users_t::iterator user::replace_uses_of_with(value *before, value *after) {
   for(size_t i = 0; i < ops_.size(); i++)
-    if(ops_[i] == before)
+    if(ops_[i] == before){
       ops_[i] = after;
-  after->add_use(this);
+      after->add_use(this);
+    }
   return before->erase_use(this);
 }
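The hunk above moves `after->add_use(this)` inside the `if`, so a use is recorded only when an operand was actually replaced. The sketch below illustrates the difference on a stripped-down model. The `value` and `user` structs here are hypothetical stand-ins, not the project's classes: the user list is assumed to be a `std::set`, the member names `ops` and `users` are invented for illustration, and the return value of `erase_use` is dropped for brevity.

```cpp
#include <cstddef>
#include <set>
#include <vector>

struct user;

// Hypothetical minimal value: records which users reference it.
struct value {
  std::set<user*> users;  // assumption: set-like user list
  void add_use(user* u)   { users.insert(u); }
  void erase_use(user* u) { users.erase(u); }
};

// Hypothetical minimal user: holds a list of operands.
struct user {
  std::vector<value*> ops;

  // Pre-change shape: add_use runs once, whether or not anything matched.
  void replace_unconditional(value* before, value* after) {
    for (std::size_t i = 0; i < ops.size(); i++)
      if (ops[i] == before)
        ops[i] = after;
    after->add_use(this);
    before->erase_use(this);
  }

  // Post-change shape: add_use runs only for operands actually replaced.
  void replace_guarded(value* before, value* after) {
    for (std::size_t i = 0; i < ops.size(); i++)
      if (ops[i] == before) {
        ops[i] = after;
        after->add_use(this);
      }
    before->erase_use(this);
  }
};

// Returns true if `c` ends up listing `u` as a user even though
// `u` never referenced the value being replaced.
bool spurious_use(bool guarded) {
  value a, b, c;
  user u;
  u.ops = {&a};  // u references a only
  a.add_use(&u);
  if (guarded)
    u.replace_guarded(&b, &c);        // b is not an operand of u
  else
    u.replace_unconditional(&b, &c);
  return c.users.count(&u) != 0;
}
```

In this model, `spurious_use(false)` reports a stale entry in `c.users` while `spurious_use(true)` does not, which is the kind of use-list inconsistency the added braces appear intended to prevent.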