I got inconsistent results when using Numba. When it worked well, it was way faster than NumPy, but sometimes it was slower. I wasn't able to figure out how to do AOT compilation, so I just went with f2py. If Numba has AOT compilation, I'd definitely use that over f2py, though.
AOT compilation is in the works. Also, you might have been using features that Numba didn't support yet. They just added more NumPy ops, array allocation, and vector ops, so your code might work now.
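For reference, here is a rough sketch of the kind of code this is about: a jitted function that allocates its output array inside the compiled code and uses a NumPy reduction on a slice. The function name moving_average and the test data are made up for illustration, and the ImportError fallback is only an assumption added so the snippet still runs (without any speedup) where Numba isn't installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: fall back to plain Python if Numba is missing.
    def njit(func):
        return func


@njit
def moving_average(x, window):
    # Array allocation inside the compiled function.
    out = np.empty(x.size - window + 1)
    for i in range(out.size):
        # NumPy-style slice and reduction inside the compiled loop.
        out[i] = x[i:i + window].sum() / window
    return out


x = np.arange(10, dtype=np.float64)
result = moving_average(x, 3)
```

Keep in mind that the first call to a jitted function includes compilation time, so time a second call when comparing against NumPy.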