triton

Author	SHA1	Message	Date
Da Yan	0f5c6e619c	[BUILD] Add the missing triton/impl to setup.py (#1042 )	2023-01-09 19:03:45 +00:00
fdrocha	194ba103b1	[BUILD] Fixed error when compiling in systems with multiple versions of python installed (#1019 )	2022-12-29 15:10:34 -08:00
Philippe Tillet	20100a7254	Merge `triton-mlir` branch - Complete rewrite of the backend from scratch (#1004 ) This PR merges the `triton-mlir` branch, in which we have been quietly rewriting the Triton backend from scratch to increase maintainability, stability and ultimately performance. Changes to the runtime are minimal, and this new version aims to remain backward-compatible with the previous commit. The legacy backend is now officially deprecated, but can still be accessed via the `legacy-backend` tag. Co-authored-by: Keren Zhou <kerenzhou@openai.com> Co-authored-by: Yan Chunwei <yanchunwei@outlook.com> Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com> Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com> Co-authored-by: Yan Da <dyanab@connect.ust.hk> Co-authored-by: Jun Yang <yangjunpro@gmail.com> Co-authored-by: Ian Bearman <ianb@microsoft.com> Co-authored-by: Jason Ansel <jansel@jansel.net> Co-authored-by: Qingyi Liu <qingyil@nvidia.com> Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com> Co-authored-by: Chenggang Zhao <lyricz@yeah.net> Co-authored-by: ben-zhang-609 <benzh609@gmail.com> Co-authored-by: dongdongl <dongdongl@nvidia.com>	2022-12-21 01:30:50 -08:00
Yang Hau	8650b4d1cb	[DRIVER] Fix typos (#939 )	2022-12-02 11:13:46 -08:00
Shintaro Iwasaki	3ac929b48b	[BUILD] Download pybind11 in setup.py (#703 ) Based on the discussion in #700, this PR enables downloading pybind11 in `setup.py` without `git submodule` instead of copy-pasting pybind11 code. The downloaded pybind11 will be in `~/.triton/pybind` (like `llvm`).	2022-09-23 15:54:07 -07:00
Philippe Tillet	25e1b36785	Revert "[pybind11] Use git-submodule for pybind11" (#701 ) Reverts openai/triton#699	2022-09-23 12:25:38 -07:00
Shintaro Iwasaki	61d104ab3a	[FRONTEND] Use git-submodule for pybind11 (#699 ) This PR changes the `pybind11` source code management from copy-paste to a package controlled by git-submodule. See the discussion in #694 for details.	2022-09-23 09:55:03 -07:00
Phil Tillet	82956e5d6b	[PACKAGING] Added missing package	2022-09-18 17:34:05 -07:00
Jason Ansel	0a3f3d5f25	[PACKAGING] Include triton/language/libdevice.10.bc in package data (#582 )	2022-07-13 23:45:27 -07:00
Keren Zhou	4912916c11	[FRONTEND] Added support for element-wise function defined in external LLVM bitcode (e.g., libdevice) (#562 )	2022-07-13 15:52:21 -07:00
Keren Zhou	4bf509889b	[BUILD] Change the default build type to Release (#571 )	2022-07-01 12:17:22 -07:00
Madeleine Thompson	8ce2c12e33	[PYTHON] move ephemeral files to homedir (#549 ) This prevents potential conflicts with other users on shared machines.	2022-06-13 19:37:52 -07:00
Philippe Tillet	8876e53206	[BACKEND] Restored reduction bugfixes	2022-06-03 11:38:52 -07:00
Philippe Tillet	a60374a597	Revert "[BACKEND] Various bug fixes; making reductions faster (#533 )". This is a more stable commit that produce bitwise identical code to earlier versions. Using commits after this one may lead to slightly different numerics	2022-06-03 11:36:06 -07:00
Philippe Tillet	3e7500dfe6	[BACKEND] Various bug fixes; making reductions faster (#533 )	2022-05-31 17:14:44 -07:00
Philippe Tillet	2bed6fc850	[LANG] Added support for device functions (#484 )	2022-04-03 20:58:16 -07:00
Philippe Tillet	807d8a1945	[ALL] Merge master (#447 )	2022-01-30 20:21:20 -08:00
daadaada	59d371c6eb	[BACKEND] Added Int8 mma (#440 )	2022-01-27 09:12:44 -08:00
Madeleine Thompson	efdabe6073	[STYLE] check python with flake8 (#424 ) I've been using this locally to find errors without running tests, and now that we're using autopep8, it passes with minimal suppressions. This is also what turned up the issues with the tutorials, which were fixed in #422.	2022-01-07 15:28:36 -08:00
Madeleine Thompson	a70acfec77	[STYLE] add isort and autopep8 config files and check on CI (#423 ) Also a fix a few more style issues from the "aggressive" mode of autopep8.	2022-01-07 13:11:34 -08:00
Madeleine Thompson	9801aa7b56	[DOCS] fix tutorials for v2.0 (#422 ) - Fix meta-parameter usage on tutorials. - Install tutorial dependencies on CI. - Switch from `requirements-test.txt` to `extras_require` for test dependencies, and also use it for tutorial dependencies. - Make some performance tests deterministic.	2022-01-07 12:34:38 -08:00
Madeleine Thompson	8bf551ae7a	[STYLE] run autopep8 and isort (#421 ) Run: ``` isort ./python autopep8 -i --ignore E501,E701,E731 $(find ./python/ -name '*.py') ``` with an `.isort.cfg` and then clean up a few warts. This PR should be a no-op; the idea is that this is all boring whitespace changes, and any config file changes will be in a different change to make it easier to review.	2022-01-06 14:34:17 -08:00
Philippe Tillet	5d54352164	[FRONTEND] Significantly reduce kernel launch time (#367 )	2021-11-04 13:25:24 -07:00
Philippe Tillet	770ea96cca	[PACKAGING] Bumped dev version to 2.0.0	2021-10-29 01:28:17 -07:00
Philippe Tillet	2d6df9b518	[PACKAGING] Bumped dev version to 1.1.2	2021-10-29 01:24:19 -07:00
Philippe Tillet	abbc554838	[VERSION] Bumped version to 1.1.1 (#350 )	2021-10-14 18:09:39 -07:00
Philippe Tillet	44442db96e	[VERSION] Bumped to 1.1 (#313 )	2021-09-28 00:25:42 -07:00
Philippe Tillet	6e5b0b4301	[FRONTEND] Added on-disk cache for compiled kernels (#287 )	2021-09-18 22:48:26 -07:00
Szymon Sidor	8bedcce9be	[LANG] Added seeded random number generation - philox (#261 )	2021-09-02 22:02:40 -07:00
Reid Draper	2322d6df2a	[CI] Update `ptillet` to `openai` (#152 )	2021-07-29 11:39:50 -07:00
Philippe Tillet	4b9df06568	[CI] Bumped dev version to 1.0.1 and fixed permissions in documentation.yml (#149 )	2021-07-28 04:35:14 -07:00
Philippe Tillet	acd5e44611	[GENERAL] Some minor improvements here and there to build systems and docs (#148 )	2021-07-28 01:51:17 -07:00
Philippe Tillet	57c1fd3366	[BUILD] Now downloading LLVM from web if system does not have `llvm-config-11` (#142 )	2021-07-28 01:02:31 -07:00
Philippe Tillet	76c6f24fb6	[CI] Made build-wheels compatible with system LLVM setup (#138 ) This speeds up wheelhouse build time by ~10x	2021-07-27 12:38:49 -07:00
Philippe Tillet	8eb63bcb01	[CI] Various improvements to CI (#137 ) Add clean-up before CI runs. Now using static LLVM-11 libraries from system rather than recompilation. Still no run-time LLVM dependencies	2021-07-27 12:38:49 -07:00
Philippe Tillet	8cea583109	[IR] Preliminary support for BF16 (#129 ) This PR adds a BF16 data-type, along with FP32 <-> BF16 conversion instructions in the LLVM codegen. Other kinds of ops on bfloat16 are not yet supported.	2021-07-27 12:38:49 -07:00
Philippe Tillet	288b4f7f58	[PYTHON] Added frontend to print sass using turingas disasm.py (#109 )	2021-07-27 12:38:49 -07:00
Philippe Tillet	c91dd56a92	[CI] Made setup.py more backwards-compatible (#108 )	2021-07-27 12:38:49 -07:00
Philippe Tillet	147675923e	[triton-ops] Minor build improvements (#106 )	2021-07-27 12:38:49 -07:00
Philippe Tillet	29e33e50b7	[DOCS] Updates and improvements (#87 )	2021-07-27 12:38:49 -07:00
Philippe Tillet	39f4730305	Deprecation of Triton-C and Replacement by decorated Python functions (#86 ) This PR implements a major overhaul of the frontend for Triton, and replaces Triton-C by a pure Python API in which kernels are defined as @triton.jit decorated functions. The documentation and tutorials have also been updated to accommodate these changes. See documentations for more information on the new API	2021-07-27 12:38:49 -07:00
Philippe Tillet	1fdb465b71	[DOCS] Various improvements and typo fixes	2021-07-27 12:38:49 -07:00
Philippe Tillet	b352bc79e3	[CI] Changed triton-nightly to --pre triton (#78 ) The solution proposed in #77 can create namespace conflicts when triton and triton-nightly have both been pip installed. Therefore, this PR is moving nightly releases to pre-releases in the main triton index.	2021-07-27 12:38:49 -07:00
Philippe Tillet	2f80a98776	[BUILD] Added automatic nightly build releases to pip in CI; removed build-time dependence on LLVM and PyTorch (#77 ) Recently there has been more and more report about installation issues: - Installing Triton before upgrading pytorch can create some issues because Triton uses some torch headers - llvm-10-dev not available on some platform; llvm-11-dev not available on e.g. Ubuntu. absence of nightly builds This PR should fix all these issues. Some CMake tricks are used to download and install llvm at build time. Triton Python bindings were modified to remove dependence on pytorch ops. Midnight CI job added to generate binary wheels for all Triton version and update them on pypi's new triton-nightly project. This PR will also make it very easy to use LLVM forks in the future for whatever needs we have.	2021-07-27 12:38:49 -07:00
Philippe Tillet	183878dce5	[DOCS] Added matrix multiplication tutorial	2021-07-27 12:38:49 -07:00
Philippe Tillet	eacbb73968	[PYTHON] CUTLASS wrapper for fair benchmarks (#75 ) Before this commit, the benchmarking infrastructure used heterogeneous protocols between library (e.g., CUTLASS uses a C++ binary that reports mean TFLOPS; torch and triton use python call and report 10th, 50th and 90th quantiles). For the sake of uniformity and fair benchmark practices, this PR adds a python wrapper for auto-tuned CUTLASS matrix multiplication. Benchmarks have been rewritten to use this wrapper with `triton.testing.do_bench` rather than system calls to CUTLASS profiler. Importantly, this also ensures that all the matmuls are done on the same input data which should stabilize clock across providers.	2021-07-27 12:38:49 -07:00
Philippe Tillet	5b83259592	[CODEGEN] Major performance improvements on A100 (#70 ) Improved handling of asynchronous copy, scheduling and synchronization for A100. Now achieving CUTLASS-like performance on large square dense matrix multiplication tasks	2021-07-27 12:38:49 -07:00
Philippe Tillet	ce8aa2a41a	[CI] Added benchmarking to CI script (#65 )	2021-07-27 12:38:49 -07:00
Philippe Tillet	5e3c7f5a60	[PYTHON] Added automated benchmark script (#63 ) This adds a bench functionality to the setup.py that can be used to run the benchmark suite and generates a bunch of csv files (and optionally plots) python setup.py bench python setup.py bench --with-plots python setup.py bench --filter=cross_entropy	2021-07-27 12:38:48 -07:00
Philippe Tillet	7cf358a352	[TUTORIALS] Fixed TYPO in CMakeLists.txt	2021-07-27 12:38:48 -07:00

1 2

76 Commits