Philippe Tillet
7c09ff80eb
[CORE] Fixed several issues that arose in the development of the
torch-blocksparse package:
* Now using warp shuffle in reductions when possible
* Various bugfixes in layout inference
* Added INFINITY, exponential and select
* Better error messages for unimplemented constructs
2020-03-31 18:57:28 -04:00
Philippe Tillet
7621aeda3f
[CODEGEN][TRANSFORM][PEEPHOLE] Fixed bug in *1 multiplication
2020-02-19 00:18:55 -05:00
Philippe Tillet
de6fdd5625
[general] removed useless files and includes
2019-10-20 19:29:48 -04:00
Philippe Tillet
ed1b2bc563
more work on padding
2019-09-27 22:15:30 -04:00
Philippe Tillet
575dd06be3
[codegen] more progress towards unified dot implementation
2019-09-26 14:01:28 -04:00
Philippe Tillet
f0013f8bf1
[codegen] [allocation] fixed issues in HMMA
2019-09-23 17:54:42 -04:00
Philippe Tillet
7e0af2118c
[codegen] worked around bug seemingly from nvptx/ptxas by simplifying multiplications by 1:
- Generated LLVM-IR looked correct
- Illegal addressing disappeared when running cuda-memcheck
- Illegal addressing disappeared when using nvptx-short-pointer
2019-08-30 16:45:14 -07:00
Philippe Tillet
96b4d5e411
[examples] multiple transposition schemes now supported
2019-08-24 13:08:38 -07:00
Philippe Tillet
732156b942
[general] rename *.cpp -> *.cc
2019-08-23 19:06:39 -07:00