I've tried XLA and observed no measurable benefit for most models, a small benefit for some, and worse performance for others. That was a few months ago, so things may have changed since then. It won't be a huge difference no matter what they do, though, because the vast majority of time is spent in cuDNN and GEMM/GEMV kernels anyway.
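
For anyone who wants to check on their own model rather than take my word for it, here's a minimal config sketch for turning on XLA auto-clustering in TensorFlow 2.x (the `train.py` script name is just a placeholder for whatever you run):

```shell
# Enable XLA auto-clustering for all eligible ops (TF 2.x).
# --tf_xla_auto_jit=2 compiles clusters on both CPU and GPU;
# whether this helps or hurts is model-dependent, so benchmark before and after.
export TF_XLA_FLAGS=--tf_xla_auto_jit=2
python train.py   # placeholder: your usual training/inference script
```

You can also scope compilation to a single function with `tf.function(jit_compile=True)` instead of the global flag, which makes it easier to A/B a specific hot path rather than the whole graph.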