[GENERAL] Merged v1.0alpha into master. Added features are:

- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without
  padding
- Complete overhaul of the LLVM code generation in
  codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
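
For readers unfamiliar with the swizzling item above, here is a minimal host-side sketch of one common swizzling scheme (XOR-ing the column index with the row index). It is an illustration under stated assumptions, not the layout this commit generates: the 32-bank model, the 32x32 tile of 4-byte words, and the names `swizzled_index`, `plain_index`, and `banks_hit_by_column` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Assumed memory model: 32 banks, each serving one 4-byte word per cycle.
constexpr std::size_t kBanks = 32;
// Assumed tile shape: 32 x 32 words, stored row-major.
constexpr std::size_t kTile = 32;

// Plain row-major index: every element of a column maps to the same bank.
std::size_t plain_index(std::size_t row, std::size_t col) {
  return row * kTile + col;
}

// XOR swizzle (hypothetical scheme): permute the columns within each row so
// that a fixed logical column lands in a different bank on every row.
std::size_t swizzled_index(std::size_t row, std::size_t col) {
  return row * kTile + (col ^ (row % kTile));
}

// Count the distinct banks touched when kTile threads each read one element
// of the same logical column (one thread per row).
std::size_t banks_hit_by_column(std::size_t col, bool swizzle) {
  std::set<std::size_t> banks;
  for (std::size_t row = 0; row < kTile; ++row) {
    std::size_t idx = swizzle ? swizzled_index(row, col) : plain_index(row, col);
    banks.insert(idx % kBanks);
  }
  return banks.size();
}
```

In this model a column read without swizzling hits a single bank (a 32-way conflict), while the swizzled layout spreads the same read over all 32 banks without the extra storage that a padding column would need.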
Philippe Tillet
2021-01-11 19:20:34 -05:00
parent c0bc7ed8b0
commit 083bbd1e8d
75 changed files with 2688 additions and 4512 deletions

@@ -68,9 +68,10 @@ unsigned user::get_num_hidden() const {
 value::users_t::iterator user::replace_uses_of_with(value *before, value *after) {
   for(size_t i = 0; i < ops_.size(); i++)
-    if(ops_[i] == before)
+    if(ops_[i] == before){
       ops_[i] = after;
-  after->add_use(this);
+      after->add_use(this);
+    }
   return before->erase_use(this);
 }
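
The hunk above adds braces so that `add_use` runs only when an operand is actually replaced. Without them, only the assignment belongs to the `if`, and `add_use` is no longer tied to a match. The following is a minimal self-contained sketch of the braced behavior. `Value` and `User` are hypothetical, simplified stand-ins for the IR classes, and the use list is assumed to be a plain vector with one entry per recorded use; the real container type is not shown in the diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct User;

// Hypothetical stand-in for an IR value that tracks who uses it.
struct Value {
  std::vector<User*> users;  // assumption: one entry per recorded use

  void add_use(User* u) { users.push_back(u); }

  // Drop every recorded use by u.
  void erase_use(User* u) {
    users.erase(std::remove(users.begin(), users.end(), u), users.end());
  }
};

// Hypothetical stand-in for an IR node with operands.
struct User {
  std::vector<Value*> ops;

  void replace_uses_of_with(Value* before, Value* after) {
    for (std::size_t i = 0; i < ops.size(); i++) {
      if (ops[i] == before) {
        // Both statements are guarded by the match, as in the fixed code.
        ops[i] = after;
        after->add_use(this);
      }
    }
    before->erase_use(this);
  }
};

// Operands {a, b, a}: replacing a with c should record two uses on c.
std::size_t uses_after_replacing_twice() {
  Value a, b, c;
  User u;
  u.ops = {&a, &b, &a};
  a.add_use(&u);
  b.add_use(&u);
  a.add_use(&u);
  u.replace_uses_of_with(&a, &c);
  return c.users.size();
}

// Operands {b}: a never appears, so c should gain no uses.
std::size_t uses_after_no_match() {
  Value a, b, c;
  User u;
  u.ops = {&b};
  b.add_use(&u);
  u.replace_uses_of_with(&a, &c);
  return c.users.size();
}
```

In this sketch, `uses_after_replacing_twice()` yields 2 and `uses_after_no_match()` yields 0. With the unbraced form, `add_use` would sit outside the loop and run exactly once in both cases.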