triton

Author	SHA1	Message	Date
Philippe Tillet	eacbb73968	[PYTHON] CUTLASS wrapper for fair benchmarks (#75 ) Before this commit, the benchmarking infrastructure used heterogeneous protocols between library (e.g., CUTLASS uses a C++ binary that reports mean TFLOPS; torch and triton use python call and report 10th, 50th and 90th quantiles). For the sake of uniformity and fair benchmark practices, this PR adds a python wrapper for auto-tuned CUTLASS matrix multiplication. Benchmarks have been rewritten to use this wrapper with `triton.testing.do_bench` rather than system calls to CUTLASS profiler. Importantly, this also ensures that all the matmuls are done on the same input data which should stabilize clock across providers.	2021-07-27 12:38:49 -07:00
Philippe Tillet	2a02fabdac	[PYTHON] Some cleaning of the PyBind11 wrappers (#62 )	2021-07-27 12:38:48 -07:00
Philippe Tillet	7cf358a352	[TUTORIALS] Fixed TYPO in CMakeLists.txt	2021-07-27 12:38:48 -07:00
Philippe Tillet	269ebc12e5	[PYTHON][TESTS][DOC] Various improvement of the API and code quality: * Simplified `triton.kernel` API to achieve lower latency: > .data_ptr() must now be passed as kernel argument. No more implicit conversion from torch.tensor > compilation options are now constant attributes, i.e., opt.d('VAR') becomes opt.VAR > torch.device must now be passed explicitly to triton.kernel (no longer inferred from torch.tensor arguments) * C++ tests moved to `python/tests/` * C++ tutorial created in `tutorials/` * Python tutorial created in python/tutorials/ * Version changed to 1.0alpha * No longer copying C++ headers into the Python package * added python/triton/ops/ package for pre-written Triton ops	2021-07-27 12:38:48 -07:00
Philippe Tillet	50587bbf4b	[General] LLVM-9 -> LLVM-10	2021-07-27 12:38:48 -07:00
Philippe Tillet	cf80ccc798	[PYTHON] Fixed torch ABI issue	2021-07-27 12:38:48 -07:00
Philippe Tillet	444907589d	[GENERAL] Fixed MacOS compilation issues	2021-07-27 12:38:48 -07:00
Jeff Rasley	7fdf2e378c	fix llvm build inside conda environment (see link for similar issue) https://github.com/tensorflow/tensorflow/issues/12998	2021-07-27 12:38:48 -07:00
Philippe Tillet	acff1b5e05	[RUNTIME] Lower-level interface for executing functions	2021-07-27 12:38:48 -07:00
Philippe Tillet	04a9ea060b	[GENERAL] Added compatibility with pytorch 1.2.0 and powerpc	2021-07-27 12:38:48 -07:00
Philippe Tillet	24586e60aa	[PACKAGING] sdist now generates working .tar.gz file	2021-07-27 12:38:48 -07:00
Philippe Tillet	3304629de9	[CORE] Fixed several issues that arose in the development of the torch-blocksparse package: * Now using warp shuffle in reductions when possible * Various bugfixes in layout inference * Added INFINITY, exponential and select * Better error messages for unimplemented constructs	2021-07-27 12:38:48 -07:00
Philippe Tillet	f08dd0ec58	[CMAKE] target_link_directories -> link_directories	2021-07-27 12:38:48 -07:00
Philippe Tillet	646c49f847	[CMAKE] Fixed issue in LLVM link directory	2021-07-27 12:38:48 -07:00
Philippe Tillet	6d7cf35123	History prior to this date belonged to the now deprecated ISAAC project, and was deleted to save space	2021-07-27 12:38:38 -07:00

15 Commits