triton

Author	SHA1	Message	Date
Shintaro Iwasaki	ff399fbc20	[Build] Support GCC 8.x to build Triton (#1036)	2023-01-06 19:36:14 -08:00
Philippe Tillet	20100a7254	Merge `triton-mlir` branch - Complete rewrite of the backend from scratch (#1004) This PR merges the `triton-mlir` branch, in which we have been quietly rewriting the Triton backend from scratch to increase maintainability, stability and ultimately performance. Changes to the runtime are minimal, and this new version aims to remain backward-compatible with the previous commit. The legacy backend is now officially deprecated, but can still be accessed via the `legacy-backend` tag. Co-authored-by: Keren Zhou <kerenzhou@openai.com> Co-authored-by: Yan Chunwei <yanchunwei@outlook.com> Co-authored-by: goostavz <109190422+goostavz@users.noreply.github.com> Co-authored-by: Shintaro Iwasaki <siwasaki@fb.com> Co-authored-by: Yan Da <dyanab@connect.ust.hk> Co-authored-by: Jun Yang <yangjunpro@gmail.com> Co-authored-by: Ian Bearman <ianb@microsoft.com> Co-authored-by: Jason Ansel <jansel@jansel.net> Co-authored-by: Qingyi Liu <qingyil@nvidia.com> Co-authored-by: ben-zhang-609 <110140741+ben-zhang-609@users.noreply.github.com> Co-authored-by: Chenggang Zhao <lyricz@yeah.net> Co-authored-by: ben-zhang-609 <benzh609@gmail.com> Co-authored-by: dongdongl <dongdongl@nvidia.com>	2022-12-21 01:30:50 -08:00
Shintaro Iwasaki	3ac929b48b	[BUILD] Download pybind11 in setup.py (#703) Based on the discussion in #700, this PR enables downloading pybind11 in `setup.py` without `git submodule` instead of copy-pasting pybind11 code. The downloaded pybind11 will be in `~/.triton/pybind` (like `llvm`).	2022-09-23 15:54:07 -07:00
Philippe Tillet	25e1b36785	Revert "[pybind11] Use git-submodule for pybind11" (#701) Reverts openai/triton#699	2022-09-23 12:25:38 -07:00
Shintaro Iwasaki	61d104ab3a	[FRONTEND] Use git-submodule for pybind11 (#699) This PR changes the `pybind11` source code management from copy-paste to a package controlled by git-submodule. See the discussion in #694 for details.	2022-09-23 09:55:03 -07:00
Keren Zhou	4bf509889b	[BUILD] Change the default build type to Release (#571)	2022-07-01 12:17:22 -07:00
Philippe Tillet	98ed7db8c1	[CODEGEN] Improvements and bugfixes (#463)	2022-02-24 14:56:24 -08:00
Victor	73b04d71b2	Fixes for building on Windows (#382) * make C++ code compatible with Windows + MSVC * added dlfcn-win32 for cross-platform dlopen * fixed building and pip install on Windows * fixed shared library file name under Windows	2021-12-07 14:10:58 -08:00
Philippe Tillet	94c83d30ce	[GENERAL] Removed deprecated driver files and added basic compatibility with rocm (#268) - Removed driver module -- accelerator runtime is handled by pytorch - Added basic support for ROCM based on @micmelesse's PR -- now can execute empty kernel on AMD devices without any compile-time changes - Now only using PREFER_SHARED for kernels when the size of shared memory is greater than 49k. Otherwise there can be poor L1 performance for broadcast tensors	2021-09-09 00:04:28 -07:00
Philippe Tillet	c0bb895d9d	[BUILD] More portable detection of terminfo (#173)	2021-07-31 17:09:49 -07:00
Philippe Tillet	acd5e44611	[GENERAL] Some minor improvements here and there to build systems and docs (#148)	2021-07-28 01:51:17 -07:00
Philippe Tillet	57c1fd3366	[BUILD] Now downloading LLVM from web if system does not have `llvm-config-11` (#142)	2021-07-28 01:02:31 -07:00
Philippe Tillet	76c6f24fb6	[CI] Made build-wheels compatible with system LLVM setup (#138) This speeds up wheelhouse build time by ~10x	2021-07-27 12:38:49 -07:00
Philippe Tillet	8eb63bcb01	[CI] Various improvements to CI (#137) Add clean-up before CI runs. Now using static LLVM-11 libraries from system rather than recompilation. Still no run-time LLVM dependencies	2021-07-27 12:38:49 -07:00
Philippe Tillet	147675923e	[triton-ops] Minor build improvements (#106)	2021-07-27 12:38:49 -07:00
Philippe Tillet	39f4730305	Deprecation of Triton-C and Replacement by decorated Python functions (#86) This PR implements a major overhaul of the frontend for Triton, and replaces Triton-C by a pure Python API in which kernels are defined as `@triton.jit` decorated functions. The documentation and tutorials have also been updated to accommodate these changes. See the documentation for more information on the new API	2021-07-27 12:38:49 -07:00
Philippe Tillet	2f80a98776	[BUILD] Added automatic nightly build releases to pip in CI; removed build-time dependence on LLVM and PyTorch (#77) Recently there have been more and more reports about installation issues: - Installing Triton before upgrading pytorch can create some issues because Triton uses some torch headers - llvm-10-dev not available on some platforms; llvm-11-dev not available on e.g. Ubuntu - absence of nightly builds. This PR should fix all these issues. Some CMake tricks are used to download and install llvm at build time. Triton Python bindings were modified to remove dependence on pytorch ops. Midnight CI job added to generate binary wheels for all Triton versions and update them on pypi's new triton-nightly project. This PR will also make it very easy to use LLVM forks in the future for whatever needs we have.	2021-07-27 12:38:49 -07:00
Philippe Tillet	183878dce5	[DOCS] Added matrix multiplication tutorial	2021-07-27 12:38:49 -07:00
Philippe Tillet	eacbb73968	[PYTHON] CUTLASS wrapper for fair benchmarks (#75) Before this commit, the benchmarking infrastructure used heterogeneous protocols between libraries (e.g., CUTLASS uses a C++ binary that reports mean TFLOPS; torch and triton use python calls and report 10th, 50th and 90th quantiles). For the sake of uniformity and fair benchmark practices, this PR adds a python wrapper for auto-tuned CUTLASS matrix multiplication. Benchmarks have been rewritten to use this wrapper with `triton.testing.do_bench` rather than system calls to CUTLASS profiler. Importantly, this also ensures that all the matmuls are done on the same input data which should stabilize clock across providers.	2021-07-27 12:38:49 -07:00
Philippe Tillet	2a02fabdac	[PYTHON] Some cleaning of the PyBind11 wrappers (#62)	2021-07-27 12:38:48 -07:00
Philippe Tillet	7cf358a352	[TUTORIALS] Fixed TYPO in CMakeLists.txt	2021-07-27 12:38:48 -07:00
Philippe Tillet	269ebc12e5	[PYTHON][TESTS][DOC] Various improvements to the API and code quality: * Simplified `triton.kernel` API to achieve lower latency: > .data_ptr() must now be passed as kernel argument. No more implicit conversion from torch.tensor > compilation options are now constant attributes, i.e., opt.d('VAR') becomes opt.VAR > torch.device must now be passed explicitly to triton.kernel (no longer inferred from torch.tensor arguments) * C++ tests moved to `python/tests/` * C++ tutorial created in `tutorials/` * Python tutorial created in python/tutorials/ * Version changed to 1.0alpha * No longer copying C++ headers into the Python package * added python/triton/ops/ package for pre-written Triton ops	2021-07-27 12:38:48 -07:00
Philippe Tillet	50587bbf4b	[General] LLVM-9 -> LLVM-10	2021-07-27 12:38:48 -07:00
Philippe Tillet	cf80ccc798	[PYTHON] Fixed torch ABI issue	2021-07-27 12:38:48 -07:00
Philippe Tillet	444907589d	[GENERAL] Fixed MacOS compilation issues	2021-07-27 12:38:48 -07:00
Jeff Rasley	7fdf2e378c	fix llvm build inside conda environment (see link for similar issue) https://github.com/tensorflow/tensorflow/issues/12998	2021-07-27 12:38:48 -07:00
Philippe Tillet	acff1b5e05	[RUNTIME] Lower-level interface for executing functions	2021-07-27 12:38:48 -07:00
Philippe Tillet	04a9ea060b	[GENERAL] Added compatibility with pytorch 1.2.0 and powerpc	2021-07-27 12:38:48 -07:00
Philippe Tillet	24586e60aa	[PACKAGING] sdist now generates working .tar.gz file	2021-07-27 12:38:48 -07:00
Philippe Tillet	3304629de9	[CORE] Fixed several issues that arose in the development of the torch-blocksparse package: * Now using warp shuffle in reductions when possible * Various bugfixes in layout inference * Added INFINITY, exponential and select * Better error messages for unimplemented constructs	2021-07-27 12:38:48 -07:00
Philippe Tillet	f08dd0ec58	[CMAKE] target_link_directories -> link_directories	2021-07-27 12:38:48 -07:00
Philippe Tillet	646c49f847	[CMAKE] Fixed issue in LLVM link directory	2021-07-27 12:38:48 -07:00
Philippe Tillet	6d7cf35123	History prior to this date belonged to the now deprecated ISAAC project, and was deleted to save space	2021-07-27 12:38:38 -07:00

33 Commits