I think that's the point, actually. The fact that a sentence can be grammatically correct but not logically correct, or be both, but still not "make sense," is interesting in itself. It shows how difficult it is to determine whether a sentence is "valid" for speakers of that language.
(Chomsky's classic example of "grammatical but not logical" was "Colorless green ideas sleep furiously.")
Think of it like this: Chomsky was testing the brain's language parser by handing it weird things and seeing how it reacts. Like you might do with a new programming language: what happens if I try to add strings, or divide them? Is zero true? Is the string "nil" true? Is == different from ===? Can I pass a function into a function?
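To make the analogy concrete, here is roughly what that kind of poking looks like in Python. This is only a sketch: `probe` and `results` are names made up for illustration, and the `==` versus `===` question is left out because Python has no `===` operator.

```python
def probe(label, experiment):
    """Run one experiment and report either its result or the error it raised."""
    try:
        return label, repr(experiment())
    except Exception as exc:
        return label, type(exc).__name__


# Each probe hands the interpreter something odd and records how it reacts.
results = dict([
    probe("add strings", lambda: "ab" + "cd"),
    probe("divide strings", lambda: "ab" / "cd"),
    probe("is zero true", lambda: bool(0)),
    probe('is "nil" true', lambda: bool("nil")),
    probe("pass a function into a function", lambda: (lambda f: f(2))(lambda x: x * 3)),
])

for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

None of the answers come from reading a spec; you learn the rules by watching which inputs are accepted and which are rejected.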
The brain's language center is undocumented, so we try throwing potential sentences at it and see what works or doesn't, then try to reverse engineer what it's doing. The buffalo sentence conforms to the rules we know about word order, and can be logically explained, but somehow it fails. Finding out why is part of the reverse engineering process.
Yes... the interesting question is whether it fails because of a "rule" we're not aware of, or because it's simply too complex. The human mind is recursive, but is it simply that it only has a "stack depth" of 3 or 4 and can't parse anything nested more deeply than that?
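A toy version of the "stack depth" idea, as a sketch only: nested square brackets stand in for embedded clauses, and `parse`, `TooDeep`, and `max_depth` are hypothetical names. The limit of 3 is just the guess from above, not a measured figure.

```python
class TooDeep(Exception):
    """Raised when nesting exceeds what the parser is willing to track."""


def parse(text, max_depth):
    """Return the nesting tree of a bracket-only string, or raise if it is too deep."""
    def clause(pos, depth):
        if depth > max_depth:
            raise TooDeep(f"gave up at depth {depth}")
        children = []
        # Each "[" opens an embedded clause that must close before we continue.
        while pos < len(text) and text[pos] == "[":
            child, pos = clause(pos + 1, depth + 1)
            if pos >= len(text) or text[pos] != "]":
                raise ValueError("unbalanced brackets")
            pos += 1
            children.append(child)
        return children, pos

    tree, pos = clause(0, 0)
    if pos != len(text):
        raise ValueError("unbalanced brackets")
    return tree


print(parse("[[[]]]", max_depth=3))   # three levels: parses fine
try:
    parse("[[[[]]]]", max_depth=3)    # four levels: same grammar, but rejected
except TooDeep as exc:
    print("failed:", exc)
```

Both inputs obey exactly the same grammar. The second fails only because of a resource limit, which is the distinction in question: an unknown rule versus a parser that runs out of depth.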