Philippe Tillet
29bdf7f546
Code quality: made the backend static
2015-07-30 21:15:37 -07:00
Philippe Tillet
89ee015f7f
General: Bugfixes here and there
2015-07-27 11:37:19 -07:00
Philippe Tillet
a2b533b9a8
Driver: made cl and cu attributes private in Handle<>
2015-07-23 09:40:18 -07:00
Philippe Tillet
cbe930398e
Code quality: Cleaned directory folder, variable names and mingw compliance
2015-07-21 13:29:23 -04:00
Philippe Tillet
f4615446c5
GEMM: More optimizations
2015-07-18 17:23:53 -04:00
Philippe Tillet
6ccf32904a
GEMM: Still optimizing
2015-07-18 16:06:17 -04:00
Philippe Tillet
753a9b1f3e
Benchmarks: now benchmarking all AlexNet sizes
2015-07-14 13:33:23 -04:00
Philippe Tillet
281fa9c7a6
Benchmarks: Now testing AlexNet's size
2015-07-10 16:05:28 -04:00
Philippe Tillet
e7cabf65ac
Tuning: Merged tune branch.
- Much cleaner and more concise source
- Better exception handling
- Checks local minima to see if retuning is needed.
Resolved conflicts:
bench/blas.cpp
include/isaac/backend/templates/mproduct.h
include/isaac/driver/buffer.h
lib/array.cpp
lib/backend/templates/mproduct.cpp
lib/driver/buffer.cpp
python/setup.py
tune/pysrc/autotune.py
tune/pysrc/dataset.py
tune/pysrc/misc_tools.py
2015-06-28 17:53:16 -07:00
Philippe Tillet
b32de3ac76
C++: More clBLAS routines
2015-06-25 08:12:16 -07:00
Philippe Tillet
9f7e34ba5d
C++: Added clBLAS sGEMM ABI (still buggy)
2015-06-24 07:51:27 -07:00
Philippe Tillet
a67476671d
Bench: Removed warnings in bench-blas when no external blas is defined
2015-04-29 16:11:32 -04:00
Philippe Tillet
cf5028d55b
Squashed feature branch:
* Added CUDA support
* Performance improvements
* API improvements
* Added "depth" parameter to GEMM
* Android cross-compilation
2015-04-29 15:52:21 -04:00
Philippe Tillet
5ff16bfcb6
Added cublas sgemm
2015-02-13 04:31:42 -05:00
Philippe Tillet
e453031094
More efficient access pattern in the GEMV kernel
2015-02-11 02:06:16 -05:00
Philippe Tillet
37fc98c532
Fixed bug on matrix-vector products with vectorization
2015-02-10 03:09:41 -05:00
Philippe Tillet
a89f6d88be
Fix bug in operation-specific tuning
2015-02-09 01:58:32 -05:00
Philippe Tillet
a6d7671831
removing C++11 interface
2015-02-08 23:19:38 -05:00
Philippe Tillet
85fb438806
More convenient use of specific runtime tuning
2015-02-08 14:23:38 -05:00
Philippe Tillet
9c68704f09
Now using a list of events instead of a single one
2015-02-08 00:56:24 -05:00
Philippe Tillet
b768e913c9
Now using events to time autotuning
2015-02-06 22:11:03 -05:00
Philippe
7fc2348924
Fixed CUDA benchmark
2015-02-05 23:42:31 -05:00
Philippe Tillet
8f8b01938b
Rearranged benchmarking script
2015-02-05 23:11:16 -05:00
Philippe Tillet
e214927b16
Better control flow through options
2015-02-05 04:43:50 -05:00
Philippe Tillet
bbf2f0188e
Ported to C++11
2015-02-05 04:43:40 -05:00
Philippe Tillet
3a296ae3b7
Added a control flow API
2015-02-03 15:25:01 -05:00
Philippe Tillet
939ce15b45
Cleaner benchmarking code
2015-02-02 00:03:48 -05:00
Philippe Tillet
2afc574724
Implemented simple operation cache
2015-02-01 23:56:05 -05:00
Philippe Tillet
535706f35a
Some renaming; lower overhead in benchmark
2015-02-01 22:28:49 -05:00
Philippe Tillet
3b61842528
Lower overhead in the benchmarking source code
2015-02-01 18:59:27 -05:00
Philippe Tillet
3f1fa822f8
save
2015-02-01 15:58:05 -05:00
Philippe Tillet
d29f1252ad
Clearer array_expression with hopefully lower overhead.
Also removed pyc's
2015-01-31 22:01:48 -05:00
Philippe Tillet
f488274269
Fixed relational operators tests
2015-01-29 16:01:46 -05:00
Philippe Tillet
c7665021d1
reducing overhead; reverted custom CL/ header because CL/cl.hpp was buggy
2015-01-28 23:04:19 -05:00
Philippe Tillet
1246fbe9a8
More portable synchronization in blas-bench
2015-01-28 20:06:41 -05:00
Philippe Tillet
a6c513014f
Silly bugfix in cublas saxpy
2015-01-28 19:54:36 -05:00
Philippe Tillet
0dcf4d3617
Better exception handling, lowered CMake requirement; blas-bench now benchmarks square matrices
2015-01-28 17:08:39 -05:00
Philippe Tillet
c37d8a2a81
Now using system CL include
2015-01-27 16:19:50 -05:00
Philippe Tillet
c12ec4cebd
tentative cuda benchmark integration
x
2015-01-27 15:32:59 -05:00
Philippe Tillet
a96c897cb3
Various fixes
2015-01-27 02:41:27 -05:00
Philippe Tillet
6d7f103ab1
Added GEMM benchmark
2015-01-25 18:19:19 -05:00
Philippe Tillet
da9defac04
No more random order in default initialization
Added linking to MKL/CBLAS
2015-01-24 14:51:48 -05:00
Philippe Tillet
69311b7982
Now ATIDLAS is standalone. Everything dynamic....
2015-01-12 13:24:06 -05:00
Philippe Tillet
8dd492de23
Now using ViennaCL for reference instead of "naive"
2014-11-06 16:15:22 -05:00
Philippe Tillet
146f54b32c
Improved benchmark suite
2014-11-06 07:07:27 -05:00
Philippe Tillet
e576d12d66
Added ViennaCL time
2014-10-30 14:20:14 -04:00
Philippe Tillet
f0a4255daa
Now actually benchmarking axpy
2014-10-30 14:11:50 -04:00
Philippe Tillet
71b4bde3ec
Better benchmarking
2014-10-30 13:04:33 -04:00
Philippe Tillet
83bce648a0
Now using the median for benchmarking
2014-10-29 17:03:24 +01:00
Philippe Tillet
740ba08c5c
Added basic BLAS1 benchmark
2014-10-27 05:35:04 -04:00