If you read the "Details" section of that very Wikipedia article, you'll see why you can't just inline the accept method. You need both visit and accept in order to simulate double dispatch: accept performs one dispatch, and visit performs the second. I suspect you could further generalize the pattern to k dispatches for some arbitrary but fixed value of k, though it would be very tedious.
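To make the two dispatches concrete, here is a minimal sketch in Python, which also dispatches only on the receiver. The Shape, Circle, Square, Describe, and CornerCount names are hypothetical and exist only for illustration. Python has no overloading, so the visit methods carry the element type in their name; in C++ or Java they would typically be overloads of a single visit.

```python
from abc import ABC, abstractmethod


class Visitor(ABC):
    """One visit_* method per concrete element type."""

    @abstractmethod
    def visit_circle(self, circle): ...

    @abstractmethod
    def visit_square(self, square): ...


class Shape(ABC):
    @abstractmethod
    def accept(self, visitor):
        """First dispatch: chosen by the runtime type of the shape."""


class Circle(Shape):
    def accept(self, visitor):
        # Second dispatch: chosen by the runtime type of the visitor.
        return visitor.visit_circle(self)


class Square(Shape):
    def accept(self, visitor):
        return visitor.visit_square(self)


class Describe(Visitor):
    def visit_circle(self, circle):
        return "describe circle"

    def visit_square(self, square):
        return "describe square"


class CornerCount(Visitor):
    def visit_circle(self, circle):
        return 0

    def visit_square(self, square):
        return 4


# The caller knows neither the concrete shape nor the concrete visitor.
shapes = [Circle(), Square()]
results = [(s.accept(Describe()), s.accept(CornerCount())) for s in shapes]
```

Each accept body has to live in its own class, because that is where the element type is known and turned into the right visit call. That is the step you lose if you try to inline accept at the call site.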
Some algorithms, such as collision detection, are readily described with double dispatch. So in a language that only offers single dispatch, such as C++ or Java, the visitor pattern actually leads to more elegant code for such algorithms.
I'm not saying I particularly like this way of doing things (I'd rather the language had direct support for multiple dispatch, really), but it's certainly not a dumb way of doing things as you seem to think. In fact it's rather clever, and in some cases the best option.
> If you read the "Details" section […], you'll see why you can't just inline the accept method.
Oops.
Still, it makes me feel very uneasy: the accept methods dispatch over behaviours, and the visit methods dispatch over the object. This means highly non-local code.
Collision detection looks like a good example. Okay. Now even with multiple dispatch, I can't suffer the idea of having up to n² methods, for n different classes to dispatch over. In the case of collision detection, I would try very hard to have no more than 3 or 4 types of collision volumes. Just one if I can help it.
Personally, I'm not sure I like multiple dispatch at all. When you need it, you expose yourself to combinatorial explosion. It's only polynomial, but it still scales poorly.
For something like collision detection, you're always going to have n² cases to handle, even with something like pattern matching. The question then becomes whether you have n² methods or n² branches. There's a "combinatorial explosion" regardless; you just can't avoid it for some things. Then it's just a question of how you structure the code.
Compare multimethods with the pattern-matching approach (abusing CL's case expression for clarity):
    (defun collide? (a b)
      (case (list (type-of a) (type-of b))
        ((bus person) ...)
        ((person bus) ...)
        ((bus bus) ...)
        ((person person) ...)))
Besides the potential performance implications of implementing it one way or the other, you also have an open-closed conundrum. Multimethods leave the set of behaviors open, while pattern matching leaves it closed. That is, if a library wants to define a new kind of collidable entity, that's trivial to add with a new method in the library. Not so easy with the pattern-matching approach. For some applications, this doesn't matter; for some it does.
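As a rough illustration of the "open" side, here is a sketch of a dispatch table keyed on pairs of types, in Python. It is not a real multimethod system, just an approximation under stated assumptions: the collides decorator, the collide function, and the Bicycle class are hypothetical names, and lookup is by exact type only.

```python
# Registry keyed on (type, type) pairs; any module can add entries later.
_handlers = {}


def collides(type_a, type_b):
    """Decorator that registers a handler for one ordered pair of types."""
    def register(fn):
        _handlers[(type_a, type_b)] = fn
        return fn
    return register


def collide(a, b):
    # Exact-type lookup only; a real multimethod system would also
    # consider superclasses to find the most specific applicable method.
    handler = _handlers.get((type(a), type(b)))
    if handler is None:
        raise TypeError(
            f"no collision rule for {type(a).__name__}/{type(b).__name__}"
        )
    return handler(a, b)


class Bus:
    pass


class Person:
    pass


@collides(Bus, Person)
def _(bus, person):
    return "bus hits person"


# A separate library can extend the set without editing collide() itself.
class Bicycle:
    pass


@collides(Bicycle, Bus)
def _(bicycle, bus):
    return "bicycle hits bus"
```

With the case-style version, adding Bicycle means editing the one function that owns all the branches. Here the new pair is registered from wherever Bicycle is defined, at the cost of the rules being spread across modules.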
You can avoid the n² problem if every object uses the same type of collision mesh (or bounding box, or bitmap, or whatever). Now the problem is to get from the object to the collision mesh, and that's O(n).
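A minimal sketch of that idea in Python, assuming an axis-aligned bounding box is good enough for every object. The Box, Ship, and Asteroid classes and the bounding_box method are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box: min corner (x0, y0), max corner (x1, y1)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other):
        # Two boxes overlap when their extents overlap on both axes.
        return (self.x0 <= other.x1 and other.x0 <= self.x1 and
                self.y0 <= other.y1 and other.y0 <= self.y1)


class Ship:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def bounding_box(self):
        # Assumed fixed half-size of 1 unit around the ship's position.
        return Box(self.x - 1, self.y - 1, self.x + 1, self.y + 1)


class Asteroid:
    def __init__(self, x, y, radius):
        self.x, self.y, self.radius = x, y, radius

    def bounding_box(self):
        r = self.radius
        return Box(self.x - r, self.y - r, self.x + r, self.y + r)


def collide(a, b):
    # One test for every pair: each class only says how to produce its box.
    return a.bounding_box().overlaps(b.bounding_box())
```

Each new class contributes one bounding_box method, so the per-type work grows linearly, while the pairwise test itself is written once.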
The bigger question is, is it practical at all for everyone to use the same type of collision object?
The answer to the bigger question is: not always. But let's assume that in our case, the answer is yes, and we can use a bounding box for everything. Let's also assume that when a collision is detected between two objects, the results of that collision vary depending on what those objects are. How do we handle that? Well, we're back at the n² problem again.
I once wrote a version of Asteroids in C++ in which every object had a particular bounding shape that was used for the detection. For the logic, I still had n² cases to determine the result of that collision. Looking back, I contend that it would have been cleaner to use the visitor pattern, since all the logic in my game was contained in various game objects except for the collision handling logic. [1]
It's good to be skeptical of design patterns. I've worked at several companies where liberal application of patterns really mucked up the code base. But in some cases, they really can be the best solution -- when used conservatively and carefully.
[1]: Looking back I also would have used an entity system rather than "classic" OOP, but that would not have solved the n² case dilemma.