Commit Graph

8 Commits

Author SHA1 Message Date
Philippe Tillet
7c09ff80eb [CORE] Fixed several issues that arose in the development of the
torch-blocksparse package:

* Now using warp shuffle in reductions when possible
* Various bugfixes in layout inference
* Added INFINITY, exponential and select
* Better error messages for unimplemented constructs
2020-03-31 18:57:28 -04:00
Philippe Tillet
fbf2a3f56f [CODEGEN][TRANSFORM] some bug-fixes for FP32 einsum 2020-01-20 12:42:53 -05:00
Philippe Tillet
f278d9741a [GENERAL] Merged einsum feature branch. Various feature, performance
improvements and bugfixes:

* Added preliminary support for extended Einstein summation in PyTriton
* Significant performance improvement on FP32 kernels containing matrix
multiplication
* Added re-coalescing pass for FP16 kernels containing matrix
multiplication
* Various bugfixes
2020-01-20 12:42:48 -05:00
Philippe Tillet
de6fdd5625 [general] removed useless files and includes 2019-10-20 19:29:48 -04:00
Philippe Tillet
323c90e431 ugh 2019-10-11 19:05:54 -04:00
Philippe Tillet
856e7baa04 [test] added tests for copy 2019-09-23 12:07:24 -04:00
Philippe Tillet
001973630e [codegen] cleaned up shared memory and double-buffering logic 2019-09-21 22:21:40 -04:00
Philippe Tillet
43d88154bd [codegen] cleaning-up / formalizing shared-memory passes 2019-09-20 16:01:12 -04:00