Yan Da
6b4da6f016
Documentation
2022-04-07 16:00:53 +08:00
Yan Da
9cf4107990
Add TensorSizeTrait
2022-04-07 15:18:43 +08:00
Yan Da
39fad2b18a
More progress on WhileOp
2022-04-05 17:55:43 +08:00
Yan Da
d7fbddc7d4
Fix ret::reference issue
2022-04-05 16:09:09 +08:00
Yan Da
c7ad928e60
More progress on WhileOp codegen
2022-04-05 15:55:48 +08:00
Yan Da
9df899b291
Some progress on visit_If
2022-04-03 22:34:46 +08:00
Yan Da
c71c50cd0c
ForOp's SSA construction
2022-04-03 19:11:47 +08:00
Yan Da
61413b8a97
More python bindings
2022-04-01 22:22:39 +08:00
Yan Da
bde103fab0
Replace MlirType with mlir::Type
2022-04-01 18:46:46 +08:00
Yan Da
4ad432f1fc
More on scf Ops
2022-03-31 21:42:48 +08:00
Yan Da
2041b67fbf
Now vecadd works
2022-03-30 20:21:47 +08:00
Yan Da
e381dc72c5
Use mlir::Block to replace MlirBlock
2022-03-30 16:31:03 +08:00
Yan Da
e95d98a886
bindings for ModuleOp
2022-03-30 13:32:52 +08:00
Yan Da
38e67b4293
Add more Ops
2022-03-28 19:50:23 +08:00
Yan Da
0d139ec460
Introducing SCF
2022-03-26 17:02:32 +08:00
Yan Da
c53f3486e4
create shr
2022-03-26 16:41:49 +08:00
Yan Da
ba16116f96
Let python manage created objects
2022-03-26 16:31:01 +08:00
Yan Da
a17fba86b1
Logic Op creation
2022-03-26 16:16:20 +08:00
Yan Da
d5612333c0
More fcmp ops
2022-03-25 14:12:20 +08:00
Yan Da
07881b4d41
Update includes
2022-03-24 13:46:35 +08:00
Yan Da
cf7fc8d642
Update includes
2022-03-24 13:33:54 +08:00
Yan Da
14a71dcb6f
Replace MlirOperation with MlirValue
2022-03-23 13:31:14 +08:00
Yan Da
f2ab318614
New python binding
2022-03-22 21:53:22 +08:00
Yan Da
419bbe0f6e
Reverts back to MLIR 14 & updates CMakeLists
2022-03-20 16:41:48 +08:00
Yan Da
a2c31ff434
Init commit
2022-03-17 20:40:55 +08:00
daadaada
539961072c
[FRONTEND] Semantic analysis refactor ( #473 )
...
Moved dispatch.cc to semantic.py
Integer signedness now moved from C++ to python
Cleaner frontend type
Co-authored-by: Phil Tillet <phil@openai.com>
2022-03-16 21:25:30 -07:00
Philippe Tillet
d4d8eaf6c0
[FRONTEND] improved caching mechanism ( #474 )
...
Co-authored-by: Greg Brockman <gdb@gregbrockman.com>
Co-authored-by: Christopher Hesse <christopherhesse@users.noreply.github.com>
2022-03-15 12:20:51 -07:00
Philippe Tillet
98ed7db8c1
[CODEGEN] Improvements and bugfixes ( #463 )
2022-02-24 14:56:24 -08:00
Philippe Tillet
9b100302d3
[FRONTEND] Now using pybind11 to release GIL ( #458 )
2022-02-10 01:57:39 -08:00
Philippe Tillet
7b48340ffd
[CI] Some fixes for the build ( #451 )
2022-02-06 19:11:33 -08:00
Philippe Tillet
807d8a1945
[ALL] Merge master ( #447 )
2022-01-30 20:21:20 -08:00
Philippe Tillet
bef76b142a
[BACKEND] float division is now approximate by default ( #446 )
2022-01-29 18:29:29 -08:00
Philippe Tillet
4c97d1ecd7
[FRONTEND] Bunch of fixes here and there ( #436 )
2022-01-20 10:55:59 -08:00
Philippe Tillet
4c94359199
[FRONTEND] Alignment fix-up ( #428 )
2022-01-11 23:11:58 -08:00
Madeleine Thompson
0ab9d67bad
uint8, uint16, uint32, and uint64 in kernels ( #413 )
...
A forthcoming PR will update the RNG to use these types.
Also:
- Add tests for the `//`, `<<`, and `>>` operators.
- Change `TensorWrapper` to unwrap objects when the resulting object would be simpler.
- Clean up `throw_unreachable`, since it was triggering compiler warnings.
2022-01-05 15:27:17 -08:00
Philippe Tillet
03f1256f60
[FRONTEND] Added `volatile` flag for load ( #407 )
2021-12-30 22:33:24 -08:00
Madeleine Thompson
985798f101
add missing bfloat16 repr and improve assertions ( #403 )
...
- `BF16TyID` was missing a repr implementation.
- Throw a better exception on impossible casts.
- Add a few assertions. Tested with a debug build.
- Add `pointer_dtype.__str__` to aid kernel debugging.
2021-12-23 17:01:17 -08:00
Philippe Tillet
a425f24d54
[FRONTEND] Better cache hook ( #400 )
...
Added an additional `repr` argument to the cache hook, which represents a human-readable string representation of the signature and argument attributes associated with the compiled binary.
2021-12-21 21:29:47 -08:00
daadaada
39d4bfed83
[OPS] Add performance model for gemm/gemv ( #397 )
...
Significantly improves the performance of `triton.ops.matmul` in memory-bound settings via the use of many more block configs coupled with a performance model to drive the auto-tuning process.
2021-12-21 09:56:10 -08:00
daadaada
4a8953efa3
[FRONTEND] Replace the legacy print call in triton.cc with the SlotTracker-based one. ( #396 )
...
The legacy print call will assign names (e.g., %10) to values, which can be undesirable in some cases.
2021-12-18 18:03:22 -08:00
Philippe Tillet
558555630f
[FRONTEND] Added xor_sum
2021-12-16 17:55:35 -08:00
Philippe Tillet
e31b9b4e66
[RUNTIME] Better support for `None` ( #387 )
...
* regression test fails but it doesn't make sense to me.
2021-12-09 13:21:22 -08:00
Philippe Tillet
f23bf55f15
[RUNTIME] release the gil on launch ( #383 )
2021-12-03 13:01:01 -08:00
Philippe Tillet
c86ad9c9ab
[FRONTEND] Added default arguments to non-kernel @triton.jit'd function ( #379 )
2021-11-29 19:11:26 -08:00
Philippe Tillet
5693b582ea
[RUNTIME] Now using pybind11 to avoid memory leaks ( #377 )
2021-11-21 02:30:22 -08:00
Philippe Tillet
01cc3d4503
[RUNTIME] Restored `do_not_specialize` ( #374 )
2021-11-12 15:06:55 -08:00
Philippe Tillet
5d54352164
[FRONTEND] Significantly reduce kernel launch time ( #367 )
2021-11-04 13:25:24 -07:00
Philippe Tillet
5ce1b726dc
[CODEGEN] Various bugfixes that make it possible to fuse RNG in a matmul epilogue ( #356 )
2021-10-24 02:30:46 -07:00
daadaada
858dec8372
[CODEGEN] Add cache modifier to tl.load ( #351 )
...
* Add cache modifier to tl.load
* Add comment to cache_modifier
* Remove force_nc_cache
* Update test
2021-10-17 22:14:04 -07:00
Philippe Tillet
5123db0b7d
[LANG] Various (relatively minor) improvements ( #320 )
2021-10-04 18:39:40 -07:00