triton

Author	SHA1	Message	Date
Philippe Tillet	7c09ff80eb	[CORE] Fixed several issues that arose in the development of the torch-blocksparse package: now using warp shuffle in reductions when possible; various bugfixes in layout inference; added INFINITY, exponential and select; better error messages for unimplemented constructs	2020-03-31 18:57:28 -04:00
Philippe Tillet	7621aeda3f	[CODEGEN][TRANSFORM][PEEPHOLE] Fixed bug in *1 multiplication	2020-02-19 00:18:55 -05:00
Philippe Tillet	de6fdd5625	[general] removed useless files and includes	2019-10-20 19:29:48 -04:00
Philippe Tillet	ed1b2bc563	more work on padding	2019-09-27 22:15:30 -04:00
Philippe Tillet	575dd06be3	[codegen] more progress towards unified dot implementation	2019-09-26 14:01:28 -04:00
Philippe Tillet	f0013f8bf1	[codegen] [allocation] fixed issues in HMMA	2019-09-23 17:54:42 -04:00
Philippe Tillet	7e0af2118c	[codegen] worked around bug seemingly from nvptx/ptxas by simplifying multiplications by 1: generated LLVM-IR looked correct; illegal addressing disappeared when running cuda-memcheck; illegal addressing disappeared when using nvptx-short-pointer	2019-08-30 16:45:14 -07:00
Philippe Tillet	96b4d5e411	[examples] multiple transposition schemes now supported	2019-08-24 13:08:38 -07:00
Philippe Tillet	732156b942	[general] rename .cpp -> .cc	2019-08-23 19:06:39 -07:00