triton

Author	SHA1	Message	Date
Philippe Tillet	183878dce5	[DOCS] Added matrix multiplication tutorial	2021-07-27 12:38:49 -07:00
Philippe Tillet	eacbb73968	[PYTHON] CUTLASS wrapper for fair benchmarks (#75). Before this commit, the benchmarking infrastructure used heterogeneous protocols between libraries (e.g., CUTLASS uses a C++ binary that reports mean TFLOPS; torch and triton use Python calls and report 10th, 50th and 90th quantiles). For the sake of uniformity and fair benchmark practices, this PR adds a Python wrapper for auto-tuned CUTLASS matrix multiplication. Benchmarks have been rewritten to use this wrapper with `triton.testing.do_bench` rather than system calls to the CUTLASS profiler. Importantly, this also ensures that all the matmuls are done on the same input data, which should stabilize clocks across providers.	2021-07-27 12:38:49 -07:00
Philippe Tillet	5b83259592	[CODEGEN] Major performance improvements on A100 (#70). Improved handling of asynchronous copy, scheduling and synchronization for A100. Now achieving CUTLASS-like performance on large square dense matrix multiplication tasks.	2021-07-27 12:38:49 -07:00
Philippe Tillet	ce8aa2a41a	[CI] Added benchmarking to CI script (#65)	2021-07-27 12:38:49 -07:00
Philippe Tillet	5e3c7f5a60	[PYTHON] Added automated benchmark script (#63). This adds a bench functionality to the setup.py that can be used to run the benchmark suite and generate CSV files (and optionally plots): `python setup.py bench`; `python setup.py bench --with-plots`; `python setup.py bench --filter=cross_entropy`	2021-07-27 12:38:48 -07:00
Philippe Tillet	7cf358a352	[TUTORIALS] Fixed TYPO in CMakeLists.txt	2021-07-27 12:38:48 -07:00
Philippe Tillet	269ebc12e5	[PYTHON][TESTS][DOC] Various improvements to the API and code quality: * Simplified `triton.kernel` API to achieve lower latency: > `.data_ptr()` must now be passed as kernel argument. No more implicit conversion from torch.tensor > compilation options are now constant attributes, i.e., `opt.d('VAR')` becomes `opt.VAR` > torch.device must now be passed explicitly to `triton.kernel` (no longer inferred from torch.tensor arguments) * C++ tests moved to `python/tests/` * C++ tutorial created in `tutorials/` * Python tutorial created in `python/tutorials/` * Version changed to 1.0alpha * No longer copying C++ headers into the Python package * Added `python/triton/ops/` package for pre-written Triton ops	2021-07-27 12:38:48 -07:00
Philippe Tillet	083bbd1e8d	[GENERAL] Merged v1.0alpha into master. Added features are: - A100 support via mma.16816 - Thread swizzling for conflict-free shared memory accesses without padding - Complete overhaul of the LLVM code generation in codegen/selection/generator.cc to remove overengineering - Added debugging capabilities in the Python binding - Compilation error for kernels that spill	2021-07-27 12:38:48 -07:00
Philippe Tillet	547a99a5d4	[VERSION] 0.2.3 -> 0.3.0	2021-07-27 12:38:48 -07:00
Philippe Tillet	073fddffc1	[PYTHON] Compiling Triton in Release mode now...	2021-07-27 12:38:48 -07:00
Philippe Tillet	50587bbf4b	[General] LLVM-9 -> LLVM-10	2021-07-27 12:38:48 -07:00
Philippe Tillet	cf80ccc798	[PYTHON] Fixed torch ABI issue	2021-07-27 12:38:48 -07:00
Philippe Tillet	049ab989b5	[GENERAL] Various improvements: * Sparse einsum in triton.ops.einsum * Hacky support for fixed-tile-size atomic-add * Various bugfixes in parser	2021-07-27 12:38:48 -07:00
Philippe Tillet	840308ab5d	[CODEGEN] More work on the CPU backend	2021-07-27 12:38:48 -07:00
Philippe Tillet	64eaec016f	[Version] Now version 0.2.3	2021-07-27 12:38:48 -07:00
Philippe Tillet	db4e4b9dbf	[VERSION] Now version 0.2.2	2021-07-27 12:38:48 -07:00
Philippe Tillet	acff1b5e05	[RUNTIME] Lower-level interface for executing functions	2021-07-27 12:38:48 -07:00
Philippe Tillet	46297a949f	[PACKAGING] Now version 0.2.1	2021-07-27 12:38:48 -07:00
Philippe Tillet	c251dc50f3	[PACKAGING] Now version 0.2.0	2021-07-27 12:38:48 -07:00
Philippe Tillet	d85141182d	[PACKAGING] Now version 0.1.3	2021-07-27 12:38:48 -07:00
Philippe Tillet	694bfbddf9	[PACKAGING] Now version 0.1.2	2021-07-27 12:38:48 -07:00
Philippe Tillet	13ff6472e0	[LANG] Fixed undefined behavior in replace_all_uses_with()	2021-07-27 12:38:48 -07:00
Philippe Tillet	24586e60aa	[PACKAGING] sdist now generates working .tar.gz file	2021-07-27 12:38:48 -07:00
Philippe Tillet	769c1180c5	[PACKAGING] Fixed import error	2021-07-27 12:38:48 -07:00
Philippe Tillet	435acbf585	[PACKAGING] Added MANIFEST.in and some symlinks for better packaging	2021-07-27 12:38:48 -07:00
Philippe Tillet	ce4a4728f5	[PACKAGING] Fixed typo in setup.py	2021-07-27 12:38:48 -07:00
Philippe Tillet	3709f564e1	[PACKAGING] Added some more files for packaging	2021-07-27 12:38:48 -07:00
Philippe Tillet	5943baa53f	[GENERAL] Error messages no longer make the terminal color green	2021-07-27 12:38:48 -07:00
Jack Turner	33d7619482	[PYTHON] Add empty string to llvm-config versions in setup.py	2021-07-27 12:38:48 -07:00
Philippe Tillet	01154f24db	[PYTHON][SETUP] Removed obsolete debug print()	2021-07-27 12:38:48 -07:00
Philippe Tillet	3d769b57e2	[PYTHON] Better packaging	2021-07-27 12:38:48 -07:00
Philippe Tillet	6d7cf35123	History prior to this date belonged to the now deprecated ISAAC project, and was deleted to save space	2021-07-27 12:38:38 -07:00

32 Commits