First off, after watching the YouTube clip, I wondered for a second whether or not Watson was performing any sort of voice recognition. It turns out that it gets the Jeopardy clues electronically, which is fine by me. Interpreting vocal sounds is an entirely different ball of wax from what Watson is actually accomplishing here. So what is Watson actually accomplishing?

Think about it. Jeopardy clues can be obscure. Here’s a clip of Ken Jennings winning $75,000 on Jeopardy.

The clue Ken answered was:

2 of the 4 Shakespeare plays in which ghosts appear on stage

To answer this correctly, Watson would have to work through some interesting problems. As brought up in the IBM video, deciding which words are important is critical to answering the clue correctly. One could treat each term as equally important and pull up a database on ‘ghosts’, another on ‘Shakespeare plays’, and another on the phrase ‘on stage’. Doing this on Google nets 30.2 million hits, 8.53 million hits, and 104 million hits, respectively. Going through each hit and finding commonality between them would be a computational nightmare. Then again, Watson isn’t connected to the internet. It has to search its own database.
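To make the "treat every term equally" idea concrete, here is a minimal sketch of a keyword lookup over a tiny local index. This is not how Watson works; the documents, the `build_index` and `search_all` names, and the three search terms are all made up for illustration.

```python
from collections import defaultdict

# Toy "database": three invented documents (an assumption for illustration).
DOCS = {
    "doc1": "the ghost of the king appears on stage in a shakespeare play",
    "doc2": "a stage manager tells ghost stories",
    "doc3": "shakespeare plays were performed at the globe",
}

def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_all(index, terms):
    """Return the documents that contain every term, all weighted equally."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

index = build_index(DOCS)
hits = search_all(index, ["ghost", "shakespeare", "stage"])
```

With three short documents the intersection is trivial. With millions of hits per term, and no idea which term matters most, the same approach gets expensive quickly.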

How would you or I solve this problem? Well, the ‘natural’ starting point would be ‘Shakespeare plays’. So we could think up all of the Shakespeare plays we know. Next would be ghosts. Which of these Shakespeare plays have ghosts? Furthermore, the ghosts have to appear on stage.
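The human approach above is really just a chain of filters. Here is a rough sketch of it, assuming someone has already done the hard part and labeled each play by hand. The `Play` record, its two flags, and the four entries are my own toy data, not anything Watson has.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Play:
    title: str
    mentions_ghost: bool   # a ghost is at least talked about
    ghost_on_stage: bool   # a ghost actually appears on stage

# Hand-entered labels, typed in for illustration only (an assumption,
# not the result of any automated reading of the plays).
PLAYS = [
    Play("Hamlet", True, True),
    Play("Macbeth", True, True),
    Play("Romeo and Juliet", True, False),
    Play("Twelfth Night", False, False),
]

def plays_with_ghosts_on_stage(plays):
    known = list(plays)                                       # step 1: plays we know
    with_ghosts = [p for p in known if p.mentions_ghost]      # step 2: any ghost at all
    on_stage = [p for p in with_ghosts if p.ghost_on_stage]   # step 3: ghost on stage
    return [p.title for p in on_stage]

answers = plays_with_ghosts_on_stage(PLAYS)
```

The filtering is the easy bit. Filling in those two boolean columns from raw text is where the real work is.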

Even if Watson had every single word of all of Shakespeare’s plays in its memory, it would have to search through every single play and interpret whether a ghost actually walked on stage or was merely talked about. Then there’s the whole issue of a ‘natural’ starting point. How would Watson know which word is the most important? Moreover, how would it know that the group of words ‘Shakespeare plays’ is important?

And then there’s one more snag thrown into the mix: ‘2 of the 4’. Such an inconspicuous phrase, but deadly nonetheless. ‘2’. ‘of’. ‘the’. ‘4’. The words by themselves hold little meaning, but in order to answer correctly, interpreting this phrase is key. Watson cannot give just the first play with a ghost in it, nor can it give all four; it must give exactly 2 of the 4 plays. Go back to the IBM vid and you’ll see that the first answer Watson gets wrong is a ‘2 of the’ question.
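For a sense of what interpreting ‘2 of the 4’ might involve at the simplest level, here is a sketch that pulls the count out of the clue with a regular expression and trims a ranked answer list to match. The `parse_quantity` and `pick_answers` helpers and the pattern are my own assumptions; a real system would have to cope with far more varied phrasing.

```python
import re

# Matches clues that open with "<N> of the <M> ..." (assumed phrasing).
QUANTITY = re.compile(r"^\s*(\d+)\s+of\s+the\s+(\d+)\s+(.*)$", re.IGNORECASE)

def parse_quantity(clue):
    """Return (answers wanted, size of the full set, rest of the clue)."""
    match = QUANTITY.match(clue)
    if match is None:
        # No quantity phrase found: assume a single answer is wanted.
        return 1, None, clue.strip()
    wanted, total, rest = match.groups()
    return int(wanted), int(total), rest

def pick_answers(clue, ranked_candidates):
    """Keep only as many top-ranked candidates as the clue asks for."""
    wanted, _total, _rest = parse_quantity(clue)
    return ranked_candidates[:wanted]

clue = "2 of the 4 Shakespeare plays in which ghosts appear on stage"
wanted, total, rest = parse_quantity(clue)
```

Here `wanted` is 2 and `total` is 4, so `pick_answers` would return only the top two candidates rather than one or all four.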

But see, this is where some sort of voice recognition system would help. Go back and listen to Alex Trebek say the clue. He puts emphasis on the phrases ‘2 of the 4’ and ‘ghosts’. Watson gets the clue electronically. It does not hear the emphasized phrases. Interpreting the emphasis would mean solving a problem in the domain of speech processing. But still, if Watson had access to such interpretations, then ranking the importance of words and grouping words together would be that much easier.

I’m starting to ramble here, but there are a lot of interesting questions that need to be solved in order for Watson to perform; I guess we’ll see how it does this fall!