Philippe Tillet
2acaa4d0dd
[LANG] Added support for constexpr (#361)
2021-10-30 00:32:58 -07:00
Philippe Tillet
2c287544cb
[OPS] Faster and cleaner block-sparse implementation (#311)
2021-09-27 18:25:16 -07:00
Philippe Tillet
ec2e7b8f48
[CODEGEN] Fixed nasty bug in coalesce pass (#303)
2021-09-23 17:05:11 -07:00
Philippe Tillet
2849e7a773
[CODEGEN] Now re-coalescing before atomics (#298)
2021-09-22 13:35:53 -07:00
Philippe Tillet
4ff3714d61
[CODEGEN] Various bugfixes and stability improvements in compiler backend (#240)
2021-08-30 11:50:35 -07:00
Philippe Tillet
083bbd1e8d
[GENERAL] Merged v1.0alpha into master. Added features are:
- A100 support via mma.16816
- Thread swizzling for conflict-free shared memory accesses without padding
- Complete overhaul of the LLVM code generation in codegen/selection/generator.cc to remove overengineering
- Added debugging capabilities in the Python binding
- Compilation error for kernels that spill
2021-07-27 12:38:48 -07:00
Philippe Tillet
a8f1b85c5f
[CODEGEN] Removed unnecessary coalescing rematerialization
2021-07-27 12:38:48 -07:00
Philippe Tillet
0516ea96d0
[CODEGEN] Fixed bug that caused missing recoalescing for some transpose operations
2021-07-27 12:38:48 -07:00
Philippe Tillet
6d7cf35123
History prior to this date belonged to the now deprecated ISAAC project, and was deleted to save space
2021-07-27 12:38:38 -07:00