Artificial intelligence game....how does it work?

jay gw

http://www.20q.net/

Can someone explain how this works without using any jargon? It's a game based on 'neural networks' and is amazingly accurate at guessing what object you're thinking of within just a few questions.
 
That's a pretty cool game! :)

I'm afraid my understanding of neural networks isn't at a level where I can explain it very well, i.e. I know something, but far from enough to be non-technical about it.

However, in short, there's a program that takes a bunch of input, hazards a guess at an output, and based on whether the guess was right or wrong, it adjusts its innards to get better at guessing right.
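
If it helps, here's a toy Python sketch of that guess-and-adjust loop — nothing to do with 20Q's real code, just the general idea:

```python
# Toy sketch of "guess, check, adjust the innards" -- not 20Q's actual
# code, just the general idea behind this kind of learner.

def train_step(weights, features, correct_answer, rate=0.1):
    """One round: make a guess, compare to the truth, nudge the weights."""
    score = sum(w * f for w, f in zip(weights, features))
    guess = 1 if score > 0 else 0
    error = correct_answer - guess        # 0 if right, +1/-1 if wrong
    return [w + rate * error * f for w, f in zip(weights, features)]

# Features here stand for answers to two yes/no questions (1 = yes, -1 = no).
weights = [0.0, 0.0]
for _ in range(10):
    weights = train_step(weights, [1, -1], correct_answer=1)
print(weights)  # -> [0.1, -0.1]: adjusted until the guess comes out right
```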
 
AI Program
Q19. I am guessing that it is a bat?

18. Does it move air? Yes.
17. Is it a vegetarian? Sometimes.
16. Can it be placed on your head? Irrelevant.
15. Is it annoying? No.
14. Is it a specific color? No.
Was it used over 100 years ago? Unknown.
13. Does it go inside other things? Yes.
12. Are there many different sorts of it? Yes.
11. Does it dig holes? No.
10. Can it swim? No.
9. Is it a predator? Sometimes.
8. Is it multicolored? No.
7. Does it have claws? Yes.
6. Can you see it in a zoo? Yes.
Does it live in the forest? Unknown.
5. Does it have short fur? Partly.
4. Would you give it as a gift? No.
3. Does it come in different colors? Partly.
2. Is it dangerous? Depends.
1. It is classified as Animal.

It sure does ask a lot of questions. Some I didn't really know how to answer.
 
Yeah, thanks for that Jay. That was pretty cool. I've seen something like it where the program guesses which TV character or superhero you're thinking of. I like this one better.

Oddly, it took about 30 questions to guess I was thinking of a carburetor. I did answer a few questions differently/incorrectly so I don't blame it. Here's something it told me after it finally guessed correctly. Check out the last one.

Uncommon Knowledge about a carburetor
Does it weigh more than a duck? I say Doubtful.
Is it small? I say Yes.
Does it come in a box? I say Yes.
Could you send it in the mail? I say Probably.
Is it heavy? I say No.
Is it normally planted in gardens? I say Yes.
 
I just noticed this:

You were thinking of a bat.
Does it live in the forest? You said Unknown, I say Yes.
Was it used over 100 years ago? You said Unknown, I say Yes.
Is it a specific color? You said No, I say Yes.
Is it a vegetarian? You said Sometimes, I say No.

Contradictions Detected
The opinions of the game are its own and are based on the input of people playing the game. It does not matter if our answers disagree, as over time the game will change its answers to reflect common knowledge. If you feel that the game is in error, the only way to fix it is to play again.

We make it smarter. Unless we're dumb, then we make it dumber. Apparently, the game doesn't think that a bat can be a vegetarian.
 
While I've not seen the innards of this particular implementation, I know enough about computer science and AI to hazard a guess at how this one works.

They claim everything the program knows was learned via interaction with human players. Assuming this is true, they must have implemented a very general-purpose algorithm that runs through a fixed set of questions, recording the responses and giving up after a finite number of guesses (30?). Based on the input of the player, new questions and their answers are imported into the knowledge base.

The actual data structure used to record the questions and their answers is almost certainly a binary tree. Think of the initial questions as the "trunk" that all the possible decision paths to the final answer branch off from.
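
Here's a minimal Python sketch of that tree idea, with invented data — just to show the shape, not 20Q's actual structure:

```python
# Minimal sketch of the binary-tree idea: each node holds a question,
# each leaf holds a guess, and a yes/no answer picks the branch.
# (Invented data, purely for illustration.)

tree = {
    "q": "Is it an animal?",
    "yes": {"q": "Can it fly?", "yes": "a bat", "no": "a dog"},
    "no": "a carburetor",
}

def play(node, answer_fn):
    while isinstance(node, dict):
        node = node["yes"] if answer_fn(node["q"]) else node["no"]
    return node

print(play(tree, lambda q: True))  # answer "yes" to everything -> "a bat"
```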

The "neural network" part comes into play during the questioning and learning phase. Each time the software makes a correct assumption about the final answer that question is weighted more. Incorrect assumptions are weighted less. So with enough players over time the knowledge becomes populated with enough common items to give the illusion of intelligence. But this is really no more than a simulation of deductive reasoning.

There are also some clever tricks one can do to make the software look smarter than it is. For example, keep a tally of the "top 100" answers. I'm sure a small subset of the total answers accounts for a majority of the games. So it would make sense to guess early for these, which can lend some "gee-whiz!" type moments.

My latest game (I won!)
Q23. I am guessing that it is a yeti?
22. Is it worth a lot of money? Yes.
Is it a nocturnal animal? Unknown.
21. Can it have hands? Yes.
20. I guess that it is a griffin? No.
19. Is it tall? Yes.
18. Can you lift it? No.
17. I guess that it is a grizzly bear? No.
16. Is it black? No.
15. Is it green or black? No.
14. Does it live in groups (gregarious)? Maybe.
13. Can you make money by selling it? Yes.
12. Are there many different sorts of it? Probably.
11. Does it purr? No.
10. Does it live in mountains? Yes.
Is it killed for its fur? Unknown.
9. Does it have a pointed snout? Maybe.
8. Is it brown? Sometimes.
7. Does it bring joy to people? No.
6. Is it considered valuable? Probably.
5. Is it usually colorful? No.
Can it be trained to obey commands? Unknown.
4. Can it growl? Yes.
3. Does it have paws? Yes.
2. Can it climb? Probably.
1. It is classified as Animal.
 
The "neural network" part comes into play during the questioning and learning phase. Each time the software makes a correct assumption about the final answer that question is weighted more. Incorrect assumptions are weighted less. So with enough players over time the knowledge becomes populated with enough common items to give the illusion of intelligence. But this is really no more than a simulation of deductive reasoning.

But isn't that what humans do from birth? Humans know nothing at birth, and then make assumptions and when those are rewarded, make them again in similar situations. That's until a new situation arises and then they have to make another guess. The human mind is mechanical like anything else.
 
EvilYeti said:
The actual data structure used to record the questions and their answers is almost certainly a binary tree.

I don't think so. There are 12 different answer alternatives for the questions. Having 29 levels with a branching factor of 12 and one level with a factor of 5, you have a tree with 5 * 12^29 = 98,906,797,416,570,752,637,062,621,429,760 leaves. Of course, the vast majority of that potential space is never visited so it doesn't have to be stored, but I still believe that the underlying data structure is not a decision tree.

I believe that they use some of the standard neural network classifier algorithms. Basically, there is one source node for each question, whose possible values range over the possible answers plus an additional 'not yet asked' value. Then, the output nodes of the network contain all possible answers. Between the input and output layers there are intermediate nodes that connect all input nodes to all output nodes. I won't wager a guess on this internal topology since I'm not an expert on neural networks (I dropped the course since I had already selected too many courses for that spring). The system asks questions until one of the output nodes gets activated. Then it makes the guess. If it was wrong, it will disable that node and continue asking more questions.

The knowledge itself is stored in internal nodes. Each node has a number of inputs that come from the previous nodes and gives its output to the succeeding nodes. A node has some function where its output value depends on its inputs. And this is the place where neural network learning happens.

The network is initialized by teaching it a large set of example objects: for each sample object the answers to the questions are given as well as the desired result. Then, the learning algorithm alters the functions of the intermediate nodes so that the network will give the correct answer. After the initial teaching the net is put online, and after each set of questions the learning algorithm is run again, though it will make smaller changes to the functions because there is the possibility of conflicting data. Each time it encounters a new object, it creates a new output node.
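
To make that concrete, here's a toy forward pass in Python. I'm collapsing the 12 answer values down to +1 yes / -1 no / 0 not asked, and all the layer sizes and weights are made up, since I don't know the real topology:

```python
# Toy forward pass for a layered network: question answers go in,
# hidden nodes mix them, and each output node stands for one
# candidate object. All sizes and weights are invented.
import math
import random

random.seed(0)
N_QUESTIONS, N_HIDDEN, N_OBJECTS = 4, 3, 2
w_in = [[random.uniform(-1, 1) for _ in range(N_QUESTIONS)]
        for _ in range(N_HIDDEN)]
w_out = [[random.uniform(-1, 1) for _ in range(N_HIDDEN)]
         for _ in range(N_OBJECTS)]

def forward(answers):
    """answers: +1 yes, -1 no, 0 not yet asked, one slot per question."""
    hidden = [math.tanh(sum(w * a for w, a in zip(row, answers)))
              for row in w_in]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

scores = forward([1, -1, 0, 1])
print(scores)  # the most strongly activated output node would be the guess
```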
 
I saw a version of this at least five years ago... it's nothing to do with intelligence, obviously. The difference is, the first site I saw guessed only celebrities you were thinking of. I'll see if I can find it...

...nope.
 
LW said:
I don't think so. There are 12 different answer alternatives for the questions. Having 29 levels with a branching factor of 12 and one level with a factor of 5, you have a tree with 5 * 12^29 = 98,906,797,416,570,752,637,062,621,429,760 leaves. Of course, the vast majority of that potential space is never visited so it doesn't have to be stored, but I still believe that the underlying data structure is not a decision tree.

You are probably right. The reason my first thought was a binary tree is that "20 questions" is often used in teaching information and game theory, and that data structure is usually used in the examples, with 20 yes/no questions being enough to identify 2^20 = 1,048,576 unique objects by classification. I made the assumption that the multitude of answers was used to weight the "yes/no" decision and not as branches in and of themselves.

I found a writeup of the technology behind the program, which I think is approachable enough for most folk.

Techie Snapshot of 20Q

The 20Q program is made up of three integral elements: the game engine, the application of user input to the knowledgebase, and the CGI/HTTP interface. It is implemented as a sizeable program written in C.

Fundamentally, the knowledgebase is a matrix of objects by questions. Every element in the matrix records whether the answer to a question for an object is yes or no, along with a weighting of that answer; unknown answers have a weighting of zero.
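
To picture that matrix, here's a toy illustration in Python (mine, not the author's — the objects, questions, and numbers are all invented):

```python
# The knowledgebase as a matrix of objects by questions: positive
# weights lean "yes", negative lean "no", zero means no answer
# recorded yet. (Invented data for illustration.)

questions = ["Is it alive?", "Is it small?", "Does it have wheels?"]
knowledge = {
    "bat":        [ 5,  3, -4],
    "carburetor": [-6,  2,  0],   # 0 = unknown
}
print(knowledge["bat"][0])  # 5: a strong "yes" for "Is it alive?"
```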

Every time a player answers a question, the URL of the answer contains three pieces of information: the current state of their game, the current question index, and their answer. To generate the next page, the game determines which objects are currently probable by comparing the current responses to the knowledgebase. If an element in the knowledgebase is unknown, the algorithm ignores that response for that object. The program then determines the number of yes and no answers for each question for the probable objects; this may include several unknown answers. The program then chooses the question that has the best ratio of yes-to-no answers (a 50/50 split would be ideal). The actual weightings are also worked into this calculation.
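
And a rough sketch of that selection step with the same invented layout (I'm only counting answers here; the writeup says the real program works the actual weightings into the calculation too):

```python
# Pick the unasked question whose yes/no counts over the still-probable
# objects are closest to an even split; unknowns (zeros) are ignored.

knowledge = {
    "bat":        [ 5,  3, -4],
    "carburetor": [-6,  2,  0],
    "bicycle":    [-3, -2,  6],
}

def pick_question(knowledge, candidates, asked):
    n_questions = len(next(iter(knowledge.values())))
    best_q, best_balance = None, float("inf")
    for q in range(n_questions):
        if q in asked:
            continue
        yes = sum(1 for obj in candidates if knowledge[obj][q] > 0)
        no = sum(1 for obj in candidates if knowledge[obj][q] < 0)
        balance = abs(yes - no)   # 0 would be the ideal 50/50 split
        if balance < best_balance:
            best_q, best_balance = q, balance
    return best_q

print(pick_question(knowledge, ["bat", "carburetor", "bicycle"], asked=set()))
# -> 2: "Does it have wheels?" splits the candidates most evenly
```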

Finally, the game compares the probability of the objects and the effectiveness of the question and decides whether to guess at an object or ask another question (this is aided by some prodding at the appropriate time). This information is used to generate a new page, with new links, to reflect the new state.

At the end of a game, when the program knows the object, the program makes a pass over the knowledgebase and updates the answers for that object. If the knowledgebase doesn't have an answer, the value is set. However, if the answer that the player gave matches the answer in the knowledgebase, the weighting is incremented. If, on the other hand, the answer contradicts, the weighting is halved.
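
That update rule is simple enough to sketch directly (again my toy layout, not the real data structures):

```python
# End-of-game update as described: unknown cells get set, agreements
# are incremented, and contradictions are halved.

def update(row, player_answers):
    """row: one object's weights; player_answers: +1 yes, -1 no, 0 skipped."""
    for q, ans in enumerate(player_answers):
        if ans == 0:
            continue                      # player gave no answer
        if row[q] == 0:
            row[q] = ans                  # unknown: just set it
        elif (row[q] > 0) == (ans > 0):
            row[q] += ans                 # agreement: strengthen the weight
        else:
            row[q] = int(row[q] / 2)      # contradiction: halve it

bat = [5, 3, -4, 0]
update(bat, [1, -1, 0, 1])
print(bat)  # -> [6, 1, -4, 1]
```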

That is the heart of the system. There are only a few other things to handle: if the program is stumped, it asks the user for the object they were thinking of and then checks to see if it already knows it. If the program was stumped, or had to ask a lot of questions, or guessed at too many objects, it will ask for a new question. Last, the program looks at objects that are similar to the target object and draws conclusions; the author calls this "spontaneous knowledge".

This description makes Twenty Questions sound more like a database manager, and to a certain extent, that is part of the system. But from another perspective, every question is an input into a 2D neural network, and the objects are the output side. By performing a large number of simple calculations, the game is tolerant of missing knowledge and contradictory input.
 
That explanation is still confusing.

Fundamentally, the knowledgebase is a matrix of objects by questions.

What?

Last, the program looks at objects that are similar to the target object and draws conclusions; the author calls this "spontaneous knowledge".

This part is very interesting. It's able to teach itself.
 
jay gw said:
That explanation is still confusing.
Fundamentally, the knowledgebase is a matrix of objects by questions.
What?
It's a grid. On the top you have all the questions, on the side you have all the objects. In the cells you have values that represent whether the answer should be "yes" or "no" and how strongly it matters. I assume positive values are "yes." If a particular object has .9 for one question, and .1 for another, then the first matters more than the second.
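
For example (numbers invented), you can score each object by how well the player's answers line up with its row, letting the bigger weights matter more:

```python
# Made-up grid values showing why the cells act as weights.

answers = {"Is it alive?": 1, "Is it small?": 1}   # +1 = yes, -1 = no
grid = {
    "bat":        {"Is it alive?": 0.9, "Is it small?": 0.8},
    "carburetor": {"Is it alive?": -0.9, "Is it small?": 0.1},
}
for obj, row in grid.items():
    score = sum(row[q] * a for q, a in answers.items())
    print(obj, round(score, 2))   # bat 1.7, carburetor -0.8
```
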
jay gw said:
Last, the program looks at objects that are similar to the target object and draws conclusions; the author calls this "spontaneous knowledge".
This part is very interesting. It's able to teach itself.
There are three ways to teach a neural network: fixed, supervised, and unsupervised. In fixed networks, you give each value by hand, and they are set in stone. In supervised, you give the network a series of example inputs with their corresponding correct outputs. In unsupervised, you give the network a bunch of example inputs, and it groups them however it sees fit. I guess this particular network uses a mix of supervised and unsupervised.
 
jay gw said:

This part is very interesting. It's able to teach itself.

Not really, it just groups together objects that share many of the same responses. The results, however, are often nonsensical. For example, check out what other things it thinks are like a bicycle:
a muffler (car exhaust), a wheel, a carburetor, a golf ball, a microscope, a sprinkler (lawn sprinkler), a unicycle, an oil filter (car part), a ball bearing, a piston, a turbine, a fuel pump.
Some are close, like wheel and unicycle, but the majority have nothing to do with a bicycle.
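
Here's my guess at how that grouping works, as a toy sketch — treat each object's row of answer weights as a vector and call two objects similar when their rows lean the same way on most questions:

```python
# Toy similarity measure over invented rows of answer weights.

def similarity(a, b):
    """Count the questions where both objects lean the same way."""
    return sum(1 for x, y in zip(a, b) if x * y > 0)

rows = {
    "bicycle":  [ 4, -2,  6,  5],
    "unicycle": [ 3, -1,  5,  4],
    "bat":      [-5,  4, -3, -2],
}
target = rows["bicycle"]
ranked = sorted(((similarity(target, row), name)
                 for name, row in rows.items() if name != "bicycle"),
                reverse=True)
print(ranked)  # -> [(4, 'unicycle'), (0, 'bat')]
```

With noisy crowd-sourced rows, that's exactly how mechanically similar things like golf balls and oil filters end up lumped in with a bicycle.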
 
It's a grid. On the top you have all the questions, on the side you have all the objects. In the cells you have values that represent whether the answer should be "yes" or "no" and how strongly it matters. I assume positive values are "yes." If a particular object has .9 for one question, and .1 for another, then the first matters more than the second.

People use matrices all the time to make decisions. How many objects would this matrix have to have?
 
