[GENERAL] Merged v1.0alpha into master. Added features are:

- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without
padding
- Complete overhaul of the LLVM code generation in
codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
This commit is contained in:
Philippe Tillet
2021-01-11 19:20:34 -05:00
parent c0bc7ed8b0
commit 083bbd1e8d
75 changed files with 2688 additions and 4512 deletions

View File

@@ -65,7 +65,12 @@ void print(module &mod, std::ostream& os) {
os << get_name(ops[i], cnt++);
os << (i < num_ops - 1?", ":"");
}
os << ";" << std::endl;
os << ";";
// os << " (";
// for(ir::user* usr: inst->get_users())
// os << get_name(usr, cnt++) << ", " ;
// os << " )";
os << std::endl;
}
}
os << "}" << std::endl;