Reverts openai/triton#671. It seems that, for some reason, this change caused out-of-memory errors on some of our internal workloads. I'm reverting it so that HEAD can be used in production at OpenAI, and I will dig into the issue asynchronously.