[CODEGEN] Bugfix in membar pass (#124)
The membar pass on current master is buggy in the presence of asynchronous copies. For example, it does not wait for asynchronous copies to complete before recoalescing the accumulator in GEMM, which leads to undefined behavior when the program does not enter the loop. This PR proposes
Committed by Philippe Tillet
Parent: b7b05a560e
Commit: 5a51f3e529
@@ -92,7 +92,9 @@ void add_passes_to_emit_bin(ir::module &ir, driver::device *dev, int num_warps,
liveness.run(ir);
allocation.run(ir);
prefetch_s.run(ir);
// ir::print(ir, std::cout);
barriers.run(ir);
// ir::print(ir, std::cout);
// ir::print(ir, std::cout);
isel.visit(ir, *llvm);
mod = driver::module::create(dev, std::move(llvm));
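
To make the hazard in the description concrete, here is a minimal toy model of it, not the actual membar pass. It assumes a hypothetical straight-line IR (`Op`, `Inst`) and a hypothetical helper `insert_async_waits`; none of these names come from the codebase. The idea is only that any shared-memory access reached while an asynchronous copy may still be in flight needs a wait in front of it, including on the path where the GEMM loop body is never executed.

```cpp
#include <string>
#include <vector>

// Hypothetical instruction kinds for a toy IR (assumption, not the real IR classes).
enum class Op { AsyncCopy, AsyncWait, SharedAccess, Other };

struct Inst {
  Op op;
  std::string name;
};

// Walk a straight-line instruction sequence and insert an AsyncWait before any
// shared-memory access that could overlap an unfinished asynchronous copy.
std::vector<Inst> insert_async_waits(const std::vector<Inst>& in) {
  std::vector<Inst> out;
  bool pending = false;  // true while an async copy may still be in flight
  for (const Inst& i : in) {
    if (i.op == Op::SharedAccess && pending) {
      out.push_back({Op::AsyncWait, "async_wait"});
      pending = false;
    }
    if (i.op == Op::AsyncCopy) pending = true;
    if (i.op == Op::AsyncWait) pending = false;
    out.push_back(i);
  }
  return out;
}

// Models the path where the loop is skipped: the prologue has issued its
// prefetches, and the epilogue touches shared memory right away.
std::vector<Inst> skipped_loop_example() {
  return insert_async_waits({
      {Op::AsyncCopy, "prefetch_a"},
      {Op::AsyncCopy, "prefetch_b"},
      {Op::SharedAccess, "recoalesce_acc"},
  });
}
```

In this sketch a single wait is placed directly before `recoalesce_acc`. A real pass would also have to reason about control flow and about which buffers alias, which this model deliberately ignores.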