Commit Graph

459 Commits

Author SHA1 Message Date
Yan Da
c7ad928e60 More progress on WhileOp codegen 2022-04-05 15:55:48 +08:00
Yan Da
0f96da336a codegen for If 2022-04-04 12:58:37 +08:00
Yan Da
9df899b291 Some progress on visit_If 2022-04-03 22:34:46 +08:00
Yan Da
c71c50cd0c ForOp's SSA construction 2022-04-03 19:11:47 +08:00
Yan Da
61413b8a97 More python bindings 2022-04-01 22:22:39 +08:00
Yan Da
9dafa0e2e3 Update triton dependencies 2022-04-01 20:16:07 +08:00
Yan Da
bde103fab0 Replace MlirType with mlir::Type 2022-04-01 18:46:46 +08:00
Yan Da
4ad432f1fc More on scf Ops 2022-03-31 21:42:48 +08:00
Yan Da
2041b67fbf Now vecadd works 2022-03-30 20:21:47 +08:00
Yan Da
e381dc72c5 Use mlir::Block to replace MlirBlock 2022-03-30 16:31:03 +08:00
Yan Da
e95d98a886 bindings for ModuleOp 2022-03-30 13:32:52 +08:00
Yan Da
38e67b4293 Add more Ops 2022-03-28 19:50:23 +08:00
Yan Da
0d139ec460 Introducing SCF 2022-03-26 17:02:32 +08:00
Yan Da
c53f3486e4 create shr 2022-03-26 16:41:49 +08:00
Yan Da
ba16116f96 Let python manage created objects 2022-03-26 16:31:01 +08:00
Yan Da
fed9925bbd Using stable LLVM release 2022-03-26 16:25:18 +08:00
Yan Da
a17fba86b1 Logic Op creation 2022-03-26 16:16:20 +08:00
Yan Da
d5612333c0 More fcmp ops 2022-03-25 14:12:20 +08:00
Yan Da
07881b4d41 Update includes 2022-03-24 13:46:35 +08:00
Yan Da
cf7fc8d642 Update includes 2022-03-24 13:33:54 +08:00
Yan Da
14a71dcb6f Replace MlirOperation with MlirValue 2022-03-23 13:31:14 +08:00
Yan Da
f2ab318614 New python binding 2022-03-22 21:53:22 +08:00
Yan Da
419bbe0f6e Reverts back to MLIR 14 & updates CMakeLists 2022-03-20 16:41:48 +08:00
Yan Da
a2c31ff434 Init commit 2022-03-17 20:40:55 +08:00
daadaada
539961072c [FRONTEND] Semantic analysis refactor (#473)
- Moved dispatch.cc to semantic.py
- Integer signedness is now handled in Python instead of C++ (see the sketch after this entry)
- Cleaner frontend type

Co-authored-by: Phil Tillet <phil@openai.com>
2022-03-16 21:25:30 -07:00
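
A purely illustrative sketch of what "integer signedness handled in Python" can look like: a frontend integer type that carries its own signedness so semantic analysis can branch on it without round-tripping through C++. The class and field names below are assumptions for illustration, not Triton's actual dtype implementation.

```python
# Illustrative only: a minimal Python-side integer type carrying signedness,
# in the spirit of the refactor above. NOT Triton's real frontend type.
from dataclasses import dataclass


@dataclass(frozen=True)
class IntDType:
    bitwidth: int
    signed: bool

    @property
    def name(self) -> str:
        return f"{'int' if self.signed else 'uint'}{self.bitwidth}"


int32 = IntDType(32, signed=True)
uint32 = IntDType(32, signed=False)

# Python-side semantic analysis can now branch on `.signed` directly,
# e.g. to choose signed vs. unsigned division.
print(int32.name, uint32.name)  # int32 uint32
```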
Yongjik Kim
0dd2ec2e3a [FRONTEND] Add an assert in case we get a CPU tensor. (#478) 2022-03-16 14:38:56 -07:00
Philippe Tillet
d4d8eaf6c0 [FRONTEND] improved caching mechanism (#474)
Co-authored-by: Greg Brockman <gdb@gregbrockman.com>
Co-authored-by: Christopher Hesse <christopherhesse@users.noreply.github.com>
2022-03-15 12:20:51 -07:00
Philippe Tillet
98ed7db8c1 [CODEGEN] Improvements and bugfixes (#463) 2022-02-24 14:56:24 -08:00
daadaada
a9dfdcaaa9 [FRONTEND] Make the performance model work for int8, tf32, and fp32 (#456) 2022-02-11 22:34:42 -08:00
Philippe Tillet
9b100302d3 [FRONTEND] Now using pybind11 to release GIL (#458) 2022-02-10 01:57:39 -08:00
Philippe Tillet
7b48340ffd [CI] Some fixes for the build (#451) 2022-02-06 19:11:33 -08:00
Philippe Tillet
5a8a544d10 [OPS][BLOCKSPARSE] Improved robustness, clarity and performance (#450)
* dds layout now internally re-uses the dsd code path for increased code reuse
* at_mask and kp_mask related arguments are now dropped from the softmax API. I couldn't think of any case where they were needed beyond is_causal, and if there is one we should probably find a way to implement it statically so that users don't have to materialize masks.
* fixed a bug in blocksparse matmul that caused trouble when a layout had a full row/col of zeros
* blocksparse softmax no longer modifies any data in-place
* blocksparse softmax now takes an is_dense argument that provides better performance. Passing is_dense=True, is_causal=True is the best way to achieve triangular attention (a usage sketch follows this entry).
* unit tests now cover the backward pass
2022-02-06 18:00:45 -08:00
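
A hypothetical usage sketch of the is_dense/is_causal combination recommended above. The import path, constructor arguments, call signature, and block-sparse input shape are assumptions inferred from the commit message, not a verified triton.ops.blocksparse API reference.

```python
# Hypothetical sketch only: signatures and shapes are assumptions inferred from
# the commit message, not a verified triton.ops.blocksparse API reference.
import torch
from triton.ops.blocksparse import softmax  # assumed import path

block, heads, seq = 16, 2, 128
# Lower-triangular block layout -> causal (triangular) attention pattern.
layout = torch.tril(torch.ones(heads, seq // block, seq // block, dtype=torch.int64))
sparse_softmax = softmax(layout, block)  # assumed constructor arguments

nnz = int(layout.sum())  # number of non-zero blocks in the layout
scores = torch.randn(1, nnz, block, block, device="cuda", dtype=torch.float16)

# Per the commit: no at_mask/kp_mask any more; is_dense=True with is_causal=True
# is described as the best way to get triangular attention.
probs = sparse_softmax(scores, is_causal=True, is_dense=True)
```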
TC
137bb67fad [LANG] Add fp16 to fp8 conversion (#444) 2022-02-02 20:42:09 -08:00
Philippe Tillet
b0d6e2f322 [STYLE] run autopep 2022-01-30 20:27:44 -08:00
Philippe Tillet
2922dc141c Merge branch 'master' into v2.0 2022-01-30 20:25:01 -08:00
Philippe Tillet
807d8a1945 [ALL] Merge master (#447) 2022-01-30 20:21:20 -08:00
Philippe Tillet
bef76b142a [BACKEND] float division is now approximate by default (#446) 2022-01-29 18:29:29 -08:00
Philippe Tillet
bd52e530a0 [OPS][BLOCKSPARSE] Fix padding issue in DSD LUT (#445) 2022-01-28 21:40:30 -08:00
daadaada
59d371c6eb [BACKEND] Added Int8 mma (#440) 2022-01-27 09:12:44 -08:00
Philippe Tillet
ccf9abe0ba [FRONTEND][RANDOM] Improved backward compatibility of RNG (#438)
The unsigned int PR definitely improved our RNG. However, it requires
different floating point arithmetic, which means the results are not
bit-wise identical to what they were before. This commit restores backward
compatibility, but we should change it back to the "right" way later.
2022-01-21 18:05:55 -08:00
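
To illustrate the bit-wise compatibility issue described above, here is a small self-contained example of two mathematically valid ways to map the same 32-bit random word to [0, 1) that yield different floats. Neither formula is claimed to be Triton's actual uint32_to_uniform_float; this only shows why switching formulas breaks bit-wise reproducibility even when both generators are uniform.

```python
# Illustrative only: neither formula is claimed to be Triton's actual
# uint32_to_uniform_float; the point is that swapping between two valid
# uniform mappings changes every result bit-wise.

def unsigned_to_uniform(r: int) -> float:
    # Treat r as a uint32 and scale into [0, 1).
    return r * (1.0 / 2**32)

def signed_to_uniform(r: int) -> float:
    # Reinterpret r as an int32, scale into [-0.5, 0.5), then shift to [0, 1).
    s = r - 2**32 if r >= 2**31 else r
    return s * (1.0 / 2**32) + 0.5

r = 0x9E3779B9  # the same 32-bit word from the generator
print(unsigned_to_uniform(r))  # ~0.618
print(signed_to_uniform(r))    # ~0.118 -- also uniform, but not the same stream
```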
Philippe Tillet
4c97d1ecd7 [FRONTEND] Bunch of fixes here and there (#436) 2022-01-20 10:55:59 -08:00
daadaada
2a944ded53 [TESTS] Added bfloat16 tests (#430) 2022-01-13 23:38:32 -08:00
Philippe Tillet
4c94359199 [FRONTEND] Alignment fix-up (#428) 2022-01-11 23:11:58 -08:00
Philippe Tillet
bbc78f6516 [FRONTEND][RANDOM] Make sure offset dtype is always uint32 before calling uint32_to_uniform_float (#427) 2022-01-11 11:08:49 -08:00
Botao Yu
bf32205edc [OPS][BLOCKSPARSE] Remove unnecessary loop and add cuda bool layout support (#425) 2022-01-11 11:07:16 -08:00
daadaada
94a2e10fe5 [BACKEND] Add bf16 & tf32 mma supports (on A100) (#426) 2022-01-11 10:20:31 -08:00
Madeleine Thompson
efdabe6073 [STYLE] check python with flake8 (#424)
I've been using this locally to find errors without running tests, and now that we're using autopep8, it passes with minimal suppressions. This is also what turned up the issues with the tutorials, which were fixed in #422.
2022-01-07 15:28:36 -08:00
Madeleine Thompson
a70acfec77 [STYLE] add isort and autopep8 config files and check on CI (#423)
Also fixes a few more style issues from the "aggressive" mode of autopep8.
2022-01-07 13:11:34 -08:00
Madeleine Thompson
9801aa7b56 [DOCS] fix tutorials for v2.0 (#422)
- Fix meta-parameter usage on tutorials.
- Install tutorial dependencies on CI.
- Switch from `requirements-test.txt` to `extras_require` for test dependencies, and also use it for tutorial dependencies.
- Make some performance tests deterministic.
2022-01-07 12:34:38 -08:00
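
The `extras_require` switch mentioned above typically looks like the following setup.py sketch; the extra names and dependency lists here are illustrative, not Triton's actual configuration.

```python
# Illustrative setup.py sketch of the extras_require pattern; the extra names
# and dependency lists are assumptions, not Triton's actual configuration.
from setuptools import find_packages, setup

setup(
    name="example-project",
    packages=find_packages(),
    extras_require={
        "tests": ["pytest", "flake8", "autopep8", "isort"],
        "tutorials": ["matplotlib", "pandas"],
    },
)
```

With this in place, CI can install test and tutorial dependencies via something like `pip install -e '.[tests,tutorials]'` instead of `pip install -r requirements-test.txt`.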
Madeleine Thompson
8bf551ae7a [STYLE] run autopep8 and isort (#421)
Run:
```
isort ./python
autopep8 -i --ignore E501,E701,E731 $(find ./python/ -name '*.py')
```
with an `.isort.cfg`, then clean up a few warts. This PR should be a no-op: the idea is that these are all boring whitespace changes, and any config-file changes will land in a separate change to make review easier.
2022-01-06 14:34:17 -08:00