triton

Author	SHA1	Message	Date
Philippe Tillet	bef76b142a	[BACKEND] float division is now approximate by default (#446)	2022-01-29 18:29:29 -08:00
daadaada	59d371c6eb	[BACKEND] Added Int8 mma (#440)	2022-01-27 09:12:44 -08:00
Philippe Tillet	4c97d1ecd7	[FRONTEND] Bunch of fixes here and there (#436)	2022-01-20 10:55:59 -08:00
Philippe Tillet	e0c5709cc8	[FRONTEND] Fixed semantics bug on ptr to bool conversions (#432)	2022-01-17 18:00:03 -08:00
daadaada	94a2e10fe5	[BACKEND] Add bf16 & tf32 mma supports (on A100) (#426)	2022-01-11 10:20:31 -08:00
Madeleine Thompson	0ab9d67bad	uint8, uint16, uint32, and uint64 in kernels (#413) A forthcoming PR will update the RNG to use these types. Also: - Add tests for the `//`, `<<`, and `>>` operators. - Change `TensorWrapper` to unwrap objects when the resulting object would be simpler. - Clean up `throw_unreachable`, since it was triggering compiler warnings.	2022-01-05 15:27:17 -08:00
Philippe Tillet	03f1256f60	[FRONTEND] Added `volatile` flag for load (#407)	2021-12-30 22:33:24 -08:00
Madeleine Thompson	985798f101	add missing bfloat16 repr and improve assertions (#403) - `BF16TyID` was missing a repr implementation. - Throw a better exception on impossible casts. - Add a few assertions. Tested with a debug build. - Add `pointer_dtype.__str__` to aid kernel debugging.	2021-12-23 17:01:17 -08:00
Madeleine Thompson	5cdb948c05	[FRONTEND] signed-integer math fixes and testing (#395) - Promote 16-bit floating-point `/` and `%` to 32-bit; we have to anyway. - Do not force result of integer binary operations to be the LHS type. There used to be a bug in pytorch that did this, which Triton matched, but that bug is fixed now. - When testing signed integer operations, use random numbers from the full range of the type. - Add an optional `seed` argument to `triton.testing.random` so binary operations are not tested with both sides equal when the LHS and RHS have the same type. - Fix a bad `CompilationError` invocation. - Fix a warning suppression that causes tests to fail if you run them with `-W error` on python 3.8.	2021-12-21 09:46:05 -08:00
Philippe Tillet	558555630f	[FRONTEND] Added xor_sum	2021-12-16 17:55:35 -08:00
Philippe Tillet	5ce1b726dc	[CODEGEN] Various bugfixes that make it possible to fuse RNG in a matmul epilogue (#356)	2021-10-24 02:30:46 -07:00
daadaada	858dec8372	[CODEGEN] Add cache modifier to tl.load (#351) * Add cache modifier to tl.load * Add comment to cache_modifier * Remove force_nc_cache * Update test	2021-10-17 22:14:04 -07:00
Philippe Tillet	8fdd7e7ed6	[LANG] Fixed semantics of boolean load/store (#282)	2021-09-13 17:39:06 -07:00
Philippe Tillet	3e395bc84e	[LANG] Fixed semantics of NaN in float comparisons (#281)	2021-09-13 15:06:29 -07:00
Philippe Tillet	768e0ded28	[CODEGEN] Fixed bug in pipelining pass and casting semantics analysis (#257)	2021-09-01 20:58:47 -07:00
Philippe Tillet	4ff3714d61	[CODEGEN] Various bugfixes and stability improvements in compiler backend (#240)	2021-08-30 11:50:35 -07:00
Philippe Tillet	226fde6ea1	[CODEGEN] Now using atomic_rmw code path for atomic_xchg (#222)	2021-08-17 16:33:23 -07:00
Philippe Tillet	298da78058	[CODEGEN/DRIVER] Tweaks for performance optimization (#193)	2021-08-07 16:41:44 -07:00
Philippe Tillet	2824345065	[LANGUAGE] Added cos/sin (#132)	2021-07-27 12:38:49 -07:00
Philippe Tillet	8cea583109	[IR] Preliminary support for BF16 (#129) This PR adds a BF16 data-type, along with FP32 <-> BF16 conversion instructions in the LLVM codegen. Other kinds of ops on bfloat16 are not yet supported.	2021-07-27 12:38:49 -07:00
Philippe Tillet	80c86ecf4a	[LANG] Minor semantic changes (#121) * Now using unordered instead of ordered float (fixes NaN issues) * Bool -> int32 now converts to 1 rather than -1 * Reduce extend arguments to 32-bits if possible	2021-07-27 12:38:49 -07:00
Philippe Tillet	0274429429	[IR] Added IR and Codegen support for atomic_rmw (#120)	2021-07-27 12:38:49 -07:00
Philippe Tillet	59b0ac672a	[LANGUAGE] Added support for bitcast (#119)	2021-07-27 12:38:49 -07:00
Philippe Tillet	6a9810ccf2	[codegen] small bugfix: (#97) * Added fp32 -> fp8 for ConstantFP = 0 * Added some more robust semantic check for atomic_add	2021-07-27 12:38:49 -07:00
Philippe Tillet	2b75158426	[PYTHON] Added atomic_add (#94)	2021-07-27 12:38:49 -07:00
Philippe Tillet	39f4730305	Deprecation of Triton-C and Replacement by decorated Python functions (#86) This PR implements a major overhaul of the frontend for Triton, and replaces Triton-C by a pure Python API in which kernels are defined as @triton.jit decorated functions. The documentation and tutorials have also been updated to accommodate these changes. See documentations for more information on the new API	2021-07-27 12:38:49 -07:00

26 Commits