Philippe Tillet
1e076c131b
API: clearer interface for transposition
2015-10-01 21:58:59 -04:00
Philippe Tillet
feeb1e9862
Feature: Merged kernel-fusion branch
...
* Fuses multiple AXPY kernels
* Possibility to add thread-wise for loops in AXPY-like kernels
2015-09-30 15:31:41 -04:00
Philippe Tillet
cf2d88a0a2
Binding: now releasing profiles in clblasTeardown()
2015-08-25 19:35:05 -04:00
Philippe Tillet
67a35a62bd
Driver: now loading the backend dynamically on Linux
2015-08-25 17:06:51 -04:00
Philippe Tillet
efdbf5f4a6
Bench: Added LeNet sizes
2015-08-18 16:44:35 -07:00
U-AMR\ptillet
b34c611802
Code quality: Added consistency between int_t and size_t. Fixed warnings for Win64
2015-08-13 16:00:49 -07:00
Philippe Tillet
f7cb4ac960
Code quality: fixed implicit conversions from size_t to int_t
2015-08-13 14:30:11 -07:00
Philippe Tillet
ff4cf94df7
Code quality: significant cleaning of namespaces, etc.
2015-08-12 00:47:58 -07:00
Philippe Tillet
f60b82af25
Kernels: more generic temporary workspace checks
2015-08-10 10:19:50 -07:00
Philippe Tillet
db090d7942
Code quality: Large clean-up of the codebase and especially of the include/ folder
2015-08-06 12:05:12 -07:00
Philippe Tillet
df2d5e7d00
Models: cleaning of the global caching mechanism
2015-08-04 10:06:52 -07:00
Philippe Tillet
a8b8c684e3
Tinkering with python wrapper
2015-08-03 11:13:31 -07:00
Philippe Tillet
d3f82e535f
C interface: now flushing after clBLAS calls
2015-07-30 13:54:41 -07:00
Philippe Tillet
4715723e61
Driver: Fixed issue in ownership handling for BLAS
2015-07-26 21:13:28 -07:00
Philippe Tillet
0ef6654c5f
Code quality: removed dependencies on the C++ OpenCL wrapper
2015-07-26 10:05:16 -07:00
U-AMR\ptillet
8879a867d8
Code quality: fixed compilation errors on MSVC
2015-07-20 18:05:31 -07:00
Philippe Tillet
cd155cb9e3
Code quality: Improved compliance with MSVC
2015-07-21 17:18:50 -04:00
Philippe Tillet
cf2dba43ef
Backend: Many bugfixes in dot() for better shape handling
2015-06-30 17:55:57 -04:00
Philippe Tillet
e7cabf65ac
Tuning: Merged tune branch.
...
- Much cleaner and more concise source
- Better exception handling
- Checks local minima to see if retuning is needed.
Resolved conflicts:
bench/blas.cpp
include/isaac/backend/templates/mproduct.h
include/isaac/driver/buffer.h
lib/array.cpp
lib/backend/templates/mproduct.cpp
lib/driver/buffer.cpp
python/setup.py
tune/pysrc/autotune.py
tune/pysrc/dataset.py
tune/pysrc/misc_tools.py
2015-06-28 17:53:16 -07:00
Philippe Tillet
3525edd54c
BLAS: Added row-major support and tests
2015-06-27 15:22:26 -04:00
Philippe
8f19d2a69c
C++/clBLAS: Bugfix in GEMM
2015-06-27 13:54:26 -04:00
Philippe
4cce9d3efd
C: More clBLAS tests
2015-06-27 11:44:50 -04:00
Philippe Tillet
e6cecc5a09
C: Some fixes in BLAS
2015-06-26 08:08:22 -07:00
Philippe Tillet
b0cd25ac4b
Added C BLAS1 test
2015-06-25 23:12:26 -07:00
Philippe Tillet
b32de3ac76
C++: More clBLAS routines
2015-06-25 08:12:16 -07:00
Philippe Tillet
9f7e34ba5d
C++: Added clBLAS sGEMM ABI (still buggy)
2015-06-24 07:51:27 -07:00