triton

Author	SHA1	Message	Date
Yan Da	26c59e4718	More on SCF conversion	2022-05-04 21:50:32 +08:00
Yan Da	a96fe07e1c	DotOp conversion	2022-05-04 15:56:24 +08:00
Yan Da	2d281cbc0a	ConstantOp conversion pattern	2022-05-04 15:35:43 +08:00
Yan Da	b9279d2e3b	More progress on TritonGPU conversion	2022-05-04 14:54:31 +08:00
Yan Da	3ad7bee35e	More conversion patterns	2022-05-04 12:50:02 +08:00
Yan Da	5f08e2fdae	More arith patterns	2022-05-02 22:31:29 +08:00
Yan Da	75d32e2442	More on TritonGPU conversion	2022-05-02 21:51:00 +08:00
Yan Da	1428185c9c	More progress on TritonGPUTypeConverter & TritonGPUConversionTarget	2022-05-01 22:06:54 +08:00
Yan Da	4ece9fd1f3	Move dependentDialects from .cpp to .td	2022-05-01 13:06:51 +08:00
Phil Tillet	d9017f8593	add basic template for legalizing arithmetic op	2022-04-30 20:42:25 -07:00
Phil Tillet	2c6a213131	[TRITONGPU] Added template for Triton -> TritonGPU conversion	2022-04-30 16:00:39 -07:00
Yan Da	2239ac1998	more progress on TritonGPU	2022-04-28 18:51:31 +08:00
Yan Da	29859605ee	Remove unused files	2022-04-27 21:20:07 +08:00
Yan Da	edca91bf8f	Update traits (NoSideEffect)	2022-04-27 19:41:07 +08:00
Yan Da	8dfe78f6cf	Add TritonCombineOps	2022-04-27 19:28:21 +08:00
Yan Da	c70f6b666e	Merge previous changes	2022-04-27 14:06:55 +08:00
Yan Da	74585fb970	Add Triton CombineOps	2022-04-27 13:45:56 +08:00
Philippe Tillet	81001d318c	Putting Triton dialect in its own folder	2022-04-26 14:39:27 -07:00
apd10	44d75cf9bb	Bugfix in ptxas path. (#487 ) Bug: "ret" value is destroyed when a failing "ptxas --version" is run, overwriting the previous valid "ret" value. Fix: keep rets only for those runs which are successful. Pick the first one.	2022-04-12 13:07:28 +08:00
Philippe Tillet	9be2d655a3	[DRIVER] LLVM driver fixup (#482 ) Current way of doing things is probably not super thread safe. init is shared between threads and some threads may not call the LLVMInitialize* function.	2022-04-12 13:03:02 +08:00
Yan Da	f1cc67bbc3	triton -> tt	2022-04-10 12:07:19 +08:00
Yan Da	28e96bbfd1	Remove the dependency on TensorDialect	2022-04-08 19:43:09 +08:00
Yan Da	62f7609612	More on type inference & assembly format	2022-04-08 19:37:57 +08:00
Yan Da	13aead4808	Use TableGen to define new types	2022-04-08 16:32:46 +08:00
Yan Da	040a2b6c75	Fix OpBuilder	2022-04-07 20:01:31 +08:00
Yan Da	0d139ec460	Introducing SCF	2022-03-26 17:02:32 +08:00
Yan Da	cf7fc8d642	Update includes	2022-03-24 13:33:54 +08:00
Yan Da	f2ab318614	New python binding	2022-03-22 21:53:22 +08:00
Yan Da	419bbe0f6e	Reverts back to MLIR 14 & updates CMakeLists	2022-03-20 16:41:48 +08:00
Yan Da	a2c31ff434	Init commit	2022-03-17 20:40:55 +08:00
daadaada	539961072c	[FRONTEND] Semantic analysis refactor (#473 ) Moved dispatch.cc to semantic.py Integer signedness now moved from C++ to python Cleaner frontend type Co-authored-by: Phil Tillet <phil@openai.com>	2022-03-16 21:25:30 -07:00
Philippe Tillet	a50a47a85b	[CODEGEN] Reverted some changes from previous PR; fixed vectorization characteristics of mma layout (#469 )	2022-03-04 01:53:31 -08:00
Philippe Tillet	bb5765df5c	[CODEGEN] Now padding shared memory for layout conversion (#468 )	2022-03-03 22:19:05 -08:00
daadaada	d9dd97492f	Use unique_ptr in ir::context_impl (#462 ) Co-authored-by: Philippe Tillet <Phil.Tillet@gmail.com>	2022-02-24 16:07:10 -08:00
Philippe Tillet	98ed7db8c1	[CODEGEN] Improvements and bugfixes (#463 )	2022-02-24 14:56:24 -08:00
Philippe Tillet	69ff52ea1f	[CODEGEN] removed buggy (and mostly useless) optimization in peephole pass (#449 )	2022-02-05 21:37:23 -08:00
TC	137bb67fad	[LANG] Add fp16 to fp8 conversion (#444 )	2022-02-02 20:42:09 -08:00
Philippe Tillet	2922dc141c	Merge branch 'master' into v2.0	2022-01-30 20:25:01 -08:00
Philippe Tillet	807d8a1945	[ALL] Merge master (#447 )	2022-01-30 20:21:20 -08:00
Philippe Tillet	bef76b142a	[BACKEND] float division is now approximate by default (#446 )	2022-01-29 18:29:29 -08:00
daadaada	e68d6a7776	[BACKEND] Making the warp-level tile "more square" to increase data-reuse for tl.dot. (#442 ) * Increase smem data-reuse for some layouts * tweak * Keep the original tiling logic for sm < 80 Co-authored-by: Philippe Tillet <phil@openai.com>	2022-01-27 09:59:54 -08:00
daadaada	59d371c6eb	[BACKEND] Added Int8 mma (#440 )	2022-01-27 09:12:44 -08:00
Philippe Tillet	4c97d1ecd7	[FRONTEND] Bunch of fixes here and there (#436 )	2022-01-20 10:55:59 -08:00
Philippe Tillet	e0c5709cc8	[FRONTEND] Fixed semantics bug on ptr to bool conversions (#432 )	2022-01-17 18:00:03 -08:00
daadaada	94a2e10fe5	[BACKEND] Add bf16 & tf32 mma supports (on A100) (#426 )	2022-01-11 10:20:31 -08:00
Madeleine Thompson	0ab9d67bad	uint8, uint16, uint32, and uint64 in kernels (#413 ) A forthcoming PR will update the RNG to use these types. Also: - Add tests for the `//`, `<<`, and `>>` operators. - Change `TensorWrapper` to unwrap objects when the resulting object would be simpler. - Clean up `throw_unreachable`, since it was triggering compiler warnings.	2022-01-05 15:27:17 -08:00
Philippe Tillet	03f1256f60	[FRONTEND] Added `volatile` flag for load (#407 )	2021-12-30 22:33:24 -08:00
Madeleine Thompson	985798f101	add missing bfloat16 repr and improve assertions (#403 ) - `BF16TyID` was missing a repr implementation. - Throw a better exception on impossible casts. - Add a few assertions. Tested with a debug build. - Add `pointer_dtype.__str__` to aid kernel debugging.	2021-12-23 17:01:17 -08:00
Philippe Tillet	2509124dd0	[DRIVER] Fixed some issue with how ptxas is used (#399 ) Now using tmpnam and properly deleting temporaries when an exception is raised	2021-12-21 14:31:51 -08:00
daadaada	39d4bfed83	[OPS] Add performance model for gemm/gemv (#397 ) Significantly improves the performance of `triton.ops.matmul` in memory-bound settings via the use of many more block configs coupled with a performance model to drive the auto-tuning process.	2021-12-21 09:56:10 -08:00

394 Commits