Commit Graph

434 Commits

Author SHA1 Message Date
Keren Zhou
a4f68165cd [FRONTEND] Hot fix for lineno (#481)
Override __reduce__ to make CompilationError pickable and print out error messages
2022-03-22 22:09:49 -07:00
daadaada
539961072c [FRONTEND] Semantic analysis refactor (#473)
Moved dispatch.cc to semantic.py
Integer signedness now moved from C++ to python
Cleaner frontend type

Co-authored-by: Phil Tillet <phil@openai.com>
2022-03-16 21:25:30 -07:00
Yongjik Kim
0dd2ec2e3a [FRONTEND] Add an assert in case we get a CPU tensor. (#478) 2022-03-16 14:38:56 -07:00
Philippe Tillet
d4d8eaf6c0 [FRONTEND] improved caching mechanism (#474)
Co-authored-by: Greg Brockman <gdb@gregbrockman.com>
Co-authored-by: Christopher Hesse <christopherhesse@users.noreply.github.com>
2022-03-15 12:20:51 -07:00
Doğukan Tuna
21f8a0646d [DOCS] Minor README.md (#470)
Added binary distribution for quick installation
2022-03-05 00:50:37 -08:00
Philippe Tillet
a50a47a85b [CODEGEN] Reverted some changes from previous PR; fixed vectorization characteristics of mma layout (#469) 2022-03-04 01:53:31 -08:00
Philippe Tillet
bb5765df5c [CODEGEN] Now padding shared memory for layout conversion (#468) 2022-03-03 22:19:05 -08:00
daadaada
d9dd97492f Use unique_ptr in ir::context_impl (#462)
Co-authored-by: Philippe Tillet <Phil.Tillet@gmail.com>
2022-02-24 16:07:10 -08:00
Philippe Tillet
98ed7db8c1 [CODEGEN] Improvements and bugfixes (#463) 2022-02-24 14:56:24 -08:00
daadaada
a9dfdcaaa9 [FRONTEND] Make the performance model work for int8, tf32, and fp32 (#456) 2022-02-11 22:34:42 -08:00
Philippe Tillet
9b100302d3 [FRONTEND] Now using pybind11 to release GIL (#458) 2022-02-10 01:57:39 -08:00
Philippe Tillet
40093a9878 [DOCS] Multiple versions are now supported (#457) 2022-02-09 01:32:41 -08:00
Philippe Tillet
4941bc7001 [DOCS] Some more fixes (#455) 2022-02-08 16:53:56 -08:00
Philippe Tillet
2fdf0a4fe8 [DOCS] changed build command 2022-02-08 11:45:21 -08:00
Philippe Tillet
077d6c8ff0 [DOCS] re-activated tutorials 2022-02-08 11:42:39 -08:00
Philippe Tillet
822ddcd14b [DOCS] Added versioning (#453) 2022-02-08 11:28:18 -08:00
Philippe Tillet
7b48340ffd [CI] Some fixes for the build (#451) 2022-02-06 19:11:33 -08:00
Philippe Tillet
5a8a544d10 [OPS][BLOCKSPARSE] Improved robustness, clarity and performance (#450)
* dds layout now internally re-uses dsd code path for increased code 
* at_mask and kp_mask related things are now dropped from the softmax API. I couldn't think of any case where it was needed beyond is_causal. And if there is any, we should probably find a way to get it implemented statically so that users don't have to materialize masks.
 * fixed bug in blocksparse matmul that caused troubles when layout had a full row/col of zeros
 * blocksparse softmax now no longer modifies any data in-place
 * blocksparse softmax now takes an is_dense arguments that provides better performance. Passing is_dense=True, is_causal=True is the best way to achieve triangular attention.
  * unit tests now test backward pass
2022-02-06 18:00:45 -08:00
Philippe Tillet
69ff52ea1f [CODEGEN] removed buggy (and mostly useless) optimization in peephole pass (#449) 2022-02-05 21:37:23 -08:00
TC
137bb67fad [LANG] Add fp16 to fp8 conversion (#444) 2022-02-02 20:42:09 -08:00
Philippe Tillet
3b20170fa3 Merge pull request #448 from openai/v2.0
`v2.0` is now merged into `master`
2022-01-30 20:49:08 -08:00
Philippe Tillet
b0d6e2f322 [STYLE] run autopep 2022-01-30 20:27:44 -08:00
Philippe Tillet
2922dc141c Merge branch 'master' into v2.0 2022-01-30 20:25:01 -08:00
Philippe Tillet
807d8a1945 [ALL] Merge master (#447) 2022-01-30 20:21:20 -08:00
Philippe Tillet
bef76b142a [BACKEND] float division is now approximate by default (#446) 2022-01-29 18:29:29 -08:00
Philippe Tillet
bd52e530a0 [OPS][BLOCKSPARSE] Fix padding issue in DSD LUT (#445) 2022-01-28 21:40:30 -08:00
daadaada
e68d6a7776 [BACKEND] Making the warp-level tile "more square" to increase data-reuse for tl.dot. (#442)
* Increase smem data-reuse for some layouts

* tweak

* Keep the original tiling logic for sm < 80

Co-authored-by: Philippe Tillet <phil@openai.com>
2022-01-27 09:59:54 -08:00
daadaada
59d371c6eb [BACKEND] Added Int8 mma (#440) 2022-01-27 09:12:44 -08:00
Benjamin Lefaudeux
3a23c1dd33 [BACKEND] minor, hotfix for gcc compilation (#439) 2022-01-23 14:24:02 -08:00
Philippe Tillet
ccf9abe0ba [FRONTEND][RANDOM] Improved backward compatibility of RNG (#438)
The unsigned int PR definitely improved our RNG. However, it requires
different floating point arithmetics which, means the results are not
bit-wise identical to how they were before. This commit revives backward
compatibility, but we should change it back to the "right" way later.
2022-01-21 18:05:55 -08:00
Philippe Tillet
4c97d1ecd7 [FRONTEND] Bunch of fixes here and there (#436) 2022-01-20 10:55:59 -08:00
Philippe Tillet
e0c5709cc8 [FRONTEND] Fixed semantics bug on ptr to bool conversions (#432) 2022-01-17 18:00:03 -08:00
daadaada
2a944ded53 [TESTS] Added bfloat16 tests (#430) 2022-01-13 23:38:32 -08:00
Philippe Tillet
4c94359199 [FRONTEND] Alignment fix-up (#428) 2022-01-11 23:11:58 -08:00
Philippe Tillet
bbc78f6516 [FRONTEND][RANDOM] Make sure offset dtype is always uint32 before calling uint32_to_uniform_float (#427) 2022-01-11 11:08:49 -08:00
Botao Yu
bf32205edc [OPS][BLOCKSPARSE] Remove unnecessary loop and add cuda bool layout support (#425) 2022-01-11 11:07:16 -08:00
daadaada
94a2e10fe5 [BACKEND] Add bf16 & tf32 mma supports (on A100) (#426) 2022-01-11 10:20:31 -08:00
Madeleine Thompson
efdabe6073 [STYLE] check python with flake8 (#424)
I've been using this locally to find errors without running tests, and now that we're using autopep8, it passes with minimal suppressions. This is also what turned up the issues with the tutorials, which were fixed in #422.
2022-01-07 15:28:36 -08:00
Madeleine Thompson
a70acfec77 [STYLE] add isort and autopep8 config files and check on CI (#423)
Also a fix a few more style issues from the "aggressive" mode of autopep8.
2022-01-07 13:11:34 -08:00
Madeleine Thompson
9801aa7b56 [DOCS] fix tutorials for v2.0 (#422)
- Fix meta-parameter usage on tutorials.
- Install tutorial dependencies on CI.
- Switch from `requirements-test.txt` to `extras_require` for test dependencies, and also use it for tutorial dependencies.
- Make some performance tests deterministic.
2022-01-07 12:34:38 -08:00
Madeleine Thompson
8bf551ae7a [STYLE] run autopep8 and isort (#421)
Run:
```
isort ./python
autopep8 -i --ignore E501,E701,E731 $(find ./python/ -name '*.py')
```
with an `.isort.cfg` and then clean up a few warts. This PR should be a no-op; the idea is that this is all boring whitespace changes, and any config file changes will be in a different change to make it easier to review.
2022-01-06 14:34:17 -08:00
Shantanu
6f7acad48f [CODEGEN] Avoid use of deprecated AST nodes (#418)
Co-authored-by: hauntsaninja <>
2022-01-06 12:04:33 -08:00
Madeleine Thompson
120cda015e [FRONTEND] use unsigned integers to simplify RNG (#417) 2022-01-06 10:49:09 -08:00
Philippe Tillet
001fb757fe [OPS][BLOCKSPARSE] Added .contiguous() in blocksparse inputs when necessary (#420) 2022-01-06 09:56:22 -08:00
Madeleine Thompson
0ab9d67bad uint8, uint16, uint32, and uint64 in kernels (#413)
A forthcoming PR will update the RNG to use these types.

Also:
- Add tests for the `//`, `<<`, and `>>` operators.
- Change `TensorWrapper` to unwrap objects when the resulting object would be simpler.
- Clean up `throw_unreachable`, since it was triggering compiler warnings.
2022-01-05 15:27:17 -08:00
Madeleine Thompson
d8db0308cb [TEST] use numpy for reference results in test_core.py (#409)
Since numpy supports unsigned integers, and pytorch doesn't, this will make it easier to test unsigned integer support.

This adds an explicit requirement for numpy in tests, but we already required scipy, so it was already an implicit dependency.
2022-01-04 13:07:29 -08:00
Philippe Tillet
03f1256f60 [FRONTEND] Added volatile flag for load (#407) 2021-12-30 22:33:24 -08:00
Noah Ziems
3edc2633e9 [TUTORIALS] Fix 01-vector-add.py typo (#406) 2021-12-29 15:09:34 -08:00
Madeleine Thompson
985798f101 add missing bfloat16 repr and improve assertions (#403)
- `BF16TyID` was missing a repr implementation.
- Throw a better exception on impossible casts.
- Add a few assertions. Tested with a debug build.
- Add `pointer_dtype.__str__` to aid kernel debugging.
2021-12-23 17:01:17 -08:00
Philippe Tillet
d8fce83e7a [FRONTEND] Remade exception picklable 2021-12-21 22:14:06 -08:00