I did some investigating and found that it doesn't start using the GPU unless you give it a lot of input (such as a long prompt).