triton

Author	SHA1	Message	Date
Philippe Tillet	d4d8eaf6c0	[FRONTEND] improved caching mechanism (#474 ) Co-authored-by: Greg Brockman <gdb@gregbrockman.com> Co-authored-by: Christopher Hesse <christopherhesse@users.noreply.github.com>	2022-03-15 12:20:51 -07:00
Philippe Tillet	5a8a544d10	[OPS][BLOCKSPARSE] Improved robustness, clarity and performance (#450 ) * dds layout now internally re-uses dsd code path for increased code * at_mask and kp_mask related things are now dropped from the softmax API. I couldn't think of any case where it was needed beyond is_causal. And if there is any, we should probably find a way to get it implemented statically so that users don't have to materialize masks. * fixed bug in blocksparse matmul that caused trouble when layout had a full row/col of zeros * blocksparse softmax no longer modifies any data in-place * blocksparse softmax now takes an is_dense argument that provides better performance. Passing is_dense=True, is_causal=True is the best way to achieve triangular attention. * unit tests now test backward pass	2022-02-06 18:00:45 -08:00
TC	137bb67fad	[LANG] Add fp16 to fp8 conversion (#444 )	2022-02-02 20:42:09 -08:00
Philippe Tillet	807d8a1945	[ALL] Merge master (#447 )	2022-01-30 20:21:20 -08:00
daadaada	59d371c6eb	[BACKEND] Added Int8 mma (#440 )	2022-01-27 09:12:44 -08:00
Philippe Tillet	4c97d1ecd7	[FRONTEND] Bunch of fixes here and there (#436 )	2022-01-20 10:55:59 -08:00
daadaada	2a944ded53	[TESTS] Added bfloat16 tests (#430 )	2022-01-13 23:38:32 -08:00
daadaada	94a2e10fe5	[BACKEND] Add bf16 & tf32 mma support (on A100) (#426 )	2022-01-11 10:20:31 -08:00
Madeleine Thompson	efdabe6073	[STYLE] check python with flake8 (#424 ) I've been using this locally to find errors without running tests, and now that we're using autopep8, it passes with minimal suppressions. This is also what turned up the issues with the tutorials, which were fixed in #422.	2022-01-07 15:28:36 -08:00
Madeleine Thompson	8bf551ae7a	[STYLE] run autopep8 and isort (#421 ) Run `isort ./python` and `autopep8 -i --ignore E501,E701,E731 $(find ./python/ -name '*.py')` with an `.isort.cfg` and then clean up a few warts. This PR should be a no-op; the idea is that this is all boring whitespace changes, and any config file changes will be in a different change to make it easier to review.	2022-01-06 14:34:17 -08:00
Madeleine Thompson	0ab9d67bad	uint8, uint16, uint32, and uint64 in kernels (#413 ) A forthcoming PR will update the RNG to use these types. Also: - Add tests for the `//`, `<<`, and `>>` operators. - Change `TensorWrapper` to unwrap objects when the resulting object would be simpler. - Clean up `throw_unreachable`, since it was triggering compiler warnings.	2022-01-05 15:27:17 -08:00
Madeleine Thompson	d8db0308cb	[TEST] use numpy for reference results in test_core.py (#409 ) Since numpy supports unsigned integers, and pytorch doesn't, this will make it easier to test unsigned integer support. This adds an explicit requirement for numpy in tests, but we already required scipy, so it was already an implicit dependency.	2022-01-04 13:07:29 -08:00
Madeleine Thompson	985798f101	add missing bfloat16 repr and improve assertions (#403 ) - `BF16TyID` was missing a repr implementation. - Throw a better exception on impossible casts. - Add a few assertions. Tested with a debug build. - Add `pointer_dtype.__str__` to aid kernel debugging.	2021-12-23 17:01:17 -08:00
Philippe Tillet	a425f24d54	[FRONTEND] Better cache hook (#400 ) Added an additional `repr` argument to the cache hook, which represents a human-readable string representation of the signature and argument attributes associated with the compiled binary.	2021-12-21 21:29:47 -08:00
Madeleine Thompson	5cdb948c05	[FRONTEND] signed-integer math fixes and testing (#395 ) - Promote 16-bit floating-point `/` and `%` to 32-bit; we have to anyway. - Do not force result of integer binary operations to be the LHS type. There used to be a bug in pytorch that did this, which Triton matched, but that bug is fixed now. - When testing signed integer operations, use random numbers from the full range of the type. - Add an optional `seed` argument to `triton.testing.random` so binary operations are not tested with both sides equal when the LHS and RHS have the same type. - Fix a bad `CompilationError` invocation. - Fix a warning suppression that causes tests to fail if you run them with `-W error` on python 3.8.	2021-12-21 09:46:05 -08:00
Madeleine Thompson	e575ae3443	[FRONTEND] Minor accumulated style and warning fixes (#388 ) - Fix some whitespace. - Make an undeclared dependency on `pytest` explicit. - Fix deprecated `description-file` use. - `#ifdef` out a deprecated `PyEval_InitThreads` call. - Use a slightly different numpy invocation in `test_random.py` to quiet down overflow warnings in tests. - Fix a deprecated cast in `test_core.py`. - Suppress a warning about `visit_Constant` in Python 3.9+; we can't migrate yet because it'd break Python 3.6 and 3.7. - Use chained exceptions for `CompilationError` rather than rolling our own; it makes the error messages nicer. - Add a `__str__` for `tl.dtype` to make debugging kernels easier; it lets you `print` a dtype to see what type was inferred. - Fix a few bad escapes.	2021-12-10 15:19:20 -08:00
Philippe Tillet	8ec9f037bb	[BACKEND/CODE_GEN] Fixed float32 matmul problem (#380 )	2021-11-30 22:00:56 -08:00
Philippe Tillet	c86ad9c9ab	[FRONTEND] Added default arguments to non-kernel @triton.jit'd function (#379 )	2021-11-29 19:11:26 -08:00
daadaada	1296eb877b	[RUNTIME] Config hook v2.0 (#373 ) * Add pre_hook to triton.Config * Use argument names in triton.heuristics * Update base perf * Remove meta from heuristics	2021-11-21 11:20:59 -08:00
Philippe Tillet	01cc3d4503	[RUNTIME] Restored `do_not_specialize` (#374 )	2021-11-12 15:06:55 -08:00
Philippe Tillet	e66bf76354	[RUNTIME] Bunch of bugfixes (#372 )	2021-11-12 00:55:00 -08:00
Philippe Tillet	2acaa4d0dd	[LANG] Added support for constexpr (#361 )	2021-10-30 00:32:58 -07:00
daadaada	858dec8372	[CODEGEN] Add cache modifier to tl.load (#351 ) * Add cache modifier to tl.load * Add comment to cache_modifier * Remove force_nc_cache * Update test	2021-10-17 22:14:04 -07:00
Stephen McGroarty	c2e6b90ff1	[CODEGEN] Fixes masked load exception (#342 )	2021-10-13 13:31:52 -07:00
Philippe Tillet	c3c0ff0552	[LANGUAGE] Fixed issue with duplicates in large arrays of random uniform numbers (#338 )	2021-10-10 15:22:34 -07:00
daadaada	9e9d781912	[CODEGEN] Pipeline fixup (#336 )	2021-10-10 01:47:11 -07:00
Philippe Tillet	5123db0b7d	[LANG] Various (relatively minor) improvements (#320 )	2021-10-04 18:39:40 -07:00
Philippe Tillet	2c287544cb	[OPS] Faster and cleaner block-sparse implementation (#311 )	2021-09-27 18:25:16 -07:00
Benjamin Lefaudeux	b53f5f3803	[OPS][BLOCKSPARSE] safeguarding a couple more configurations (#292 )	2021-09-20 17:15:31 -07:00
Philippe Tillet	3e395bc84e	[LANG] Fixed semantics of NaN in float comparisons (#281 )	2021-09-13 15:06:29 -07:00


80 Commits