Teaching Deep Convolutional Neural Networks to Play Go (arxiv.org)
71 points by urish on Dec 15, 2014 | 17 comments


I would be extremely interested in seeing this used as a pruning function for a state-of-the-art MCTS (Monte Carlo Tree Search) Go engine.

As it stands, your generic MCTS algorithm expands a game tree of nodes, and gives more attention to more promising branches, but it still must give attention to other branches to find out if they are promising or not (exploration vs. exploitation).

In the paper, they get the right move (right as defined by what an expert human would do) 44% of the time, but they also say the right move, if not the #1 choice, is often in the top few choices. According to their graph it's in the top 10 choices about 80% of the time, and in the top 30 choices about 98% of the time.

If in MCTS you could prune the branching factor of your tree search down from 300+ to ~30, that could be huge.
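Roughly what I have in mind, as a hand-wavy sketch (policy_net and the legal_move_mask helper are made up for illustration, not anything from the paper): keep only the network's k most probable legal moves as expansion candidates.

    import numpy as np

    def top_k_moves(policy_net, position, k=30):
        # Hypothetical sketch: restrict MCTS expansion to the network's
        # k most probable legal moves instead of all ~361 points.
        probs = policy_net(position)               # assumed: probability vector over board points
        probs = np.where(position.legal_move_mask(), probs, 0.0)
        order = np.argsort(probs)[::-1]            # most probable first
        return [int(m) for m in order[:k] if probs[m] > 0]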

=====

I'd also be interested in seeing it used as the playout function of an MCTS engine.

As it stands, most playout functions use random moves, or random moves plus quick heuristics, to play out thousands (or millions) of games to rate a position. I imagine that if you used this, which can output an entire probability distribution over moves, you could do significantly better than random with far fewer games.
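Something like this, very roughly (again a made-up sketch; policy_net and the position methods is_terminal/legal_move_mask/play/result are assumptions, not the paper's code): sample each playout move from the network's distribution instead of uniformly at random.

    import numpy as np

    def policy_playout(policy_net, position, max_moves=400):
        # Hypothetical sketch: guide the playout by sampling each move
        # from the network's distribution rather than picking uniformly.
        for _ in range(max_moves):
            if position.is_terminal():
                break
            probs = policy_net(position) * position.legal_move_mask()
            probs = probs / probs.sum()            # renormalise over legal moves
            move = np.random.choice(len(probs), p=probs)
            position = position.play(move)
        return position.result()                   # e.g. +1/-1 from the root player's view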


Integration into MCTS should be straightforward, and I would be surprised if the authors hadn't done it already. Normally, no actions are pruned away per se. Instead, the available actions are initialised with a utility value [1] from an external oracle, i.e. the neural net. From then on, the normal MCTS procedure continues until time or memory runs out.

[1] ...and a faked visit count, so the algorithm believes the action was already evaluated n times, resulting in the given mean utility.
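A minimal sketch of that initialisation (my own illustration, not code from any engine): the oracle's mean utility and the faked visit count are folded into the node's starting statistics, and plain UCT does the rest.

    import math

    class Node:
        # Hypothetical MCTS node seeded by an external oracle: it starts
        # as if it had already been visited `prior_visits` times with a
        # mean value of `prior_value`.
        def __init__(self, move, prior_value=0.5, prior_visits=10):
            self.move = move
            self.visits = prior_visits
            self.total_value = prior_value * prior_visits
            self.children = []

        def mean_value(self):
            return self.total_value / self.visits

        def uct_score(self, parent_visits, c=1.4):
            # Standard UCT; the prior only shifts the starting statistics.
            return self.mean_value() + c * math.sqrt(math.log(parent_visits) / self.visits)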


I don't think they've done it yet -- in the conclusion they say:

    The most obvious next step is to integrate a DCNN into a full
    fledged Go playing system. For example, a DCNN could be run on a
    GPU in parallel with a MCTS Go program and be used to provide
    highly quality priors for what the strongest moves to consider
    are. Such a system would both be the first to bring sophisticated
    pattern recognitions abilities to playing Go, and have a strong
    potential ability to surpass current computer Go programs.
I agree that the integration should be exceedingly straightforward. I've written MCTS implementations (though not a Go implementation -- I used it on Connect 4 and other easy-to-code games), and it seems like you'd just plug it into your already-existing bias function.
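To make the "bias function" part concrete, here's one hedged sketch in the progressive-bias style: the selection score is ordinary UCT plus a bonus proportional to the network's probability for the move, decaying as the child accumulates visits. policy_probs and the node attributes (move, visits, total_value, children) are assumptions for illustration, not anything from the paper.

    import math

    def select_child(node, policy_probs, c=1.4, w=1.0):
        # Hypothetical progressive-bias selection: UCT plus a decaying
        # bonus from the network's move probability.
        def score(child):
            exploit = child.total_value / child.visits
            explore = c * math.sqrt(math.log(node.visits) / child.visits)
            bias = w * policy_probs[child.move] / (child.visits + 1)
            return exploit + explore + bias
        return max(node.children, key=score)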

The authors didn't mention my idea about using the probability distribution output from the network to guide the random playouts, which I would also be interested in.


From the paper: " The most obvious next step is to integrate a DCNN into a full fledged Go playing system. For example, a DCNN could be run on a GPU in parallel with a MCTS Go program and be used to provide highly quality priors for what the strongest moves to consider are. Such a system would both be the first to bring sophisticated pattern recognitions abilities to playing Go, and have a strong potential ability to surpass current computer Go programs. "


There's a thread on this paper on the computer go mailing list. It's a quiet list, but features authors of several of the best programs.

http://computer-go.org/pipermail/computer-go/2014-December/0...


In the thread discussing this work at computer-go list, Aja Huang wrote: "Chris Maddison also produced very good (in fact much better) results using a deep convolutional network during his internship at Google. Currently waiting for publication approval, I will post the paper once it is passed."

http://computer-go.org/pipermail/computer-go/2014-December/0...


I wish there was something using DNNs that users could train and use without writing code or connecting to some centralized service. It would help to get the "feel" for how good these algorithms are. Something moderately general-purpose, I mean. However, it seems that mapping them to the problem domain is an art in itself.


Mapping DNNs to a problem domain is an art, but it is easier than the "feature engineering" required to use other forms of machine learning.


For those (like me) wondering how strong Fuego is, I found this: http://www.gokgs.com/graphPage.jsp?user=fuego19


An interesting Wired article on the topic: http://www.wired.com/2014/05/the-world-of-computer-go/


It should be pretty neat to see where this goes. I'm particularly curious what it would take to bridge the gap from 4-5 kyu play to dan-level rankings. I believe there are computer go programs that play around 5-dan level (e.g. Zen19D, CrazyStone), using Monte Carlo, of course. Will the techniques in this article be able to scale up to that?


Happy to see that this paper is freely available for all to download and read!


[deleted]


Specialized pattern recognition in extremely constrained environments is only a hair's width away from higher order intelligence, self-awareness, and a desire for independence.


Welcome to the AI effect. Progress in AI, no matter how large or small, is almost never recognized as progress: https://en.wikipedia.org/wiki/AI_effect

>"It's part of the history of the field of artificial intelligence that every time somebody figured out how to make a computer do something—play good checkers, solve simple but relatively informal problems—there was chorus of critics to say, 'that's not thinking'." AI researcher Rodney Brooks complains "Every time we figure out a piece of it, it stops being magical; we say, Oh, that's just a computation."

>When IBM's chess playing computer Deep Blue succeeded in defeating Garry Kasparov in 1997, people complained that it had only used "brute force methods" and it wasn't real intelligence. Fred Reed writes "A problem that proponents of AI regularly face is this: When we know how a machine does something 'intelligent,' it ceases to be regarded as intelligent. If I beat the world's chess champion, I'd be regarded as highly bright."

Now we have computers beating games without using brute force methods (as well as being good at various other unrelated AI tasks, i.e. "general") and we see the same thing happening. This is an amazing achievement.


Just train it to predict the actions of people with self-awareness and a desire for independence, and presto!


I think you're conflating intelligence with human instinct.


My comment was a sarcastic reply to a now deleted comment about the singularity being near.



