Turing Test, Etc.

by Peter Seibel, © 1992

I

Can machines think? If you believe what Madison Avenue tells you, you might think so—these days you can buy a “smart” dishwasher, or a car that “knows” when its lights are on or when the door is ajar and can tell you so in a soothing baritone. Hollywood, of course, loves the idea of thinking machines—from the menacing HAL of 2001: A Space Odyssey to the lovable R2-D2 and C-3PO of the Star Wars series.

The personification of machines probably began when people ascribed to their new, and sometimes balky automobile the personality of the old, and sometimes balky horse it replaced. But a more recent development is that our machinery’s personae have taken on a human character. In the middle of this century, the first electronic digital computers were built and suddenly the question, “can machines think?” was a going concern. In the October, 1950 issue of Mind: A Quarterly Review of Psychology and Philosophy, a British mathematician, Alan Turing, posed the question as a serious inquiry. His answer in the affirmative and his proposal of a simple but powerful test that would answer the question of whether a particular machine is thinking gave birth to Artificial Intelligence (AI)—a brand new field of intellectual inquiry and a whole new way of looking at one of the oldest unanswered questions of the human experience: what is a mind?

I stopped by the Boston Computer Museum a few weeks ago to pick up a stack of computer printouts. I met a museum employee on the fifth floor, to whom I gave $24.00 for the 169 pages of printout in a manila envelope. The employee joked that it felt like some sort of secret deal. I said in my best espionage agent accent, “You gif me ze secret computer information, and I vill gif you ze money.” The envelope had the words “Turing test transcripts” penciled on the front and inside were the results of the first ever public trial of the test Turing proposed forty years ago—the transcripts from eighty conversations between ten human judges, six computers, and two human confederates. I wasn’t particularly optimistic about the chances of any of the computers giving any evidence of thinking but I did let myself dare to hope that there would be some slight glimmer of intelligence displayed in the transcripts.

During World War II Alan Turing led the team that cracked the top-secret German Enigma code, the code the German high command used to communicate to their U-boat fleet the positions of the Allied ships that fleet was sinking almost daily. According to one author, “It is fair to say that we owe much to Alan Turing for the fact that we are not under Nazi rule today.” His life ended in tragedy only nine years after the end of the war when at the age of forty-one he was driven to suicide by the British government’s persecution of his homosexuality—he was “convicted” of homosexuality and sentenced to be given injections of female sex hormones for a year. Yet despite his early and unnecessary death, his legacy as a mathematician, logician, and computer scientist is one of the most impressive of recent history. He founded the field of theoretical computer science and proved one of its most fundamental results—that all computers are essentially equal: given the proper instruction set, any computer can emulate any other. We shall see later how this result also has implications for both mathematics and AI. And, as I mentioned before, Turing was the father of AI. Today computer scientists, psychologists, and philosophers are still debating the possibility of AI in the same terms he laid out in his 1950 Mind article.

The biggest problem with the field of AI in 1950, as it is now, was that nobody could agree on how it is that we are intelligent. That is, how is it that we are conscious, self-aware, rational beings? If we knew the general mechanism that produces intelligence, then the question of whether a machine is intelligent would be fairly straightforward: we would be able to look at the mechanism of a particular machine and say either, “Yes, that mechanism bestows intelligence on that machine” or “No, that machine is still no smarter than my electric can opener.” In the absence of such knowledge some other sort of definition is needed. Turing proposed a “looks like a duck, quacks like a duck, it’s a duck” definition of intelligence. Don’t worry about what’s going on inside of the machine, he said; observe its outward behavior, and if it is indistinguishable from the outward intellectual behavior of a human then it is, by his definition, intelligent.

He called his test the “Imitation game” but a modified version of it is known today simply as “the Turing test.” He proposed a game played initially by three humans—a man, a woman, and a judge of unspecified gender. At the start of the game the judge is in communication with the other two humans by teletype machine but can’t see them and does not know which teletype machine is connected to which gender person. The judge has to determine the gender of the person at the other end by asking questions via the teletype machine. But both of the “players” are trying to convince the judge that they are female. The test of the computer is achieved by replacing the man with a computer. If the judge can’t make the gender distinction any more successfully when a computer is imitating a man imitating a woman then the computer is said to have “passed” the test and to be intelligent. The modern version leaves out the gender question. Instead, the judge has simply to decide whether an entity of unknown origin at the other end of a computer connection is a human or a machine. If a machine is judged to be a human then it passes. (It’s not clear what it means if a human is judged a computer.)

Early in November an advertisement appeared in Boston newspapers calling for people to assist in judging and administering a contest. The advertisement didn’t say what sort of contest it was but applicants were required to be able to speak English and to type. The forty people who responded to the ad were invited to the Cambridge Center for Behavioral Studies for an interview. At the interview they were told that the Boston Computer Museum and the Center were going to conduct a limited version of the Turing test—the first ever trial after forty years of discussion in philosophical journals and around computer-lab coffee machines. They were going to set up ten computer terminals, each connected to either a computer program or a human confederate. But because no present-day program would have any chance in the full Turing test—a free-for-all conversation, no topic too large or too small—the Museum decided to limit the test by restricting each program—and the human confederates—to a specific topic such as Burgundy wines or Shakespeare’s plays. Ten judges would have a fourteen-minute conversation with each terminal and then try to determine whether each terminal was “manned” by a human or by a program. The contest was sponsored by businessman Hugh Loebner, who has offered a $100,000 prize to the author of the first program to pass the full Turing test and who paid the $1,500 prize in this year’s contest. Not a computer scientist himself, he explained his interest in AI to the New York Times: “I’m in favor of 100 percent unemployment. I’ve always wanted computers to do all the work.”

One of the forty people who answered the advertisement was Cynthia Clay, a writer and Shakespeare enthusiast. When the interviewers found out that she was a walking encyclopedia of Shakespeare lore they decided that she might be a perfect confederate. When I spoke to her she said that she didn’t worry about people judging her humanity—to her the test was just a great chance to talk about Shakespeare for three hours.

During the contest nearly all the judges asked what her favorite Shakespeare play is. To keep things interesting for herself she answered differently every time, a maneuver that confused onlookers at the museum. She also warned the judges that it is bad luck to name the Scottish play (Macb**h) by its proper name unless you are working on a production of it, and when asked to interpret Lady Macbeth’s character mentioned all the “s sounds in her speeches which make her sound very sinister. Snaky.” This sort of analysis earned her the rating of “the most human” of all the programs and confederates, but two judges scored her as a computer, probably because she seemed to know more about Shakespeare than is humanly possible. When I read the transcript of the test at home I was pretty sure that she was a human but I couldn’t help hoping that somehow she was a clever program.

The biggest problem with the Turing test, as Turing realized, is that it is only of minor interest, philosophical or otherwise, if there isn’t any reason to believe that a computer might pass it some day. Because of this Turing devoted the bulk of his 1950 article to anticipating the various arguments against the possibility of artificial intelligence. Some of these arguments are of more historical interest than philosophical, such as the “Theological Objection” and the “Argument from Extra-Sensory Perception.” The Theological Objection—“Thinking is a function of man’s immortal soul. God has given an immortal soul to every man and woman, but not to any other animal or to machines. Hence no animal or machine can think.”—is still raised occasionally but almost exclusively by people outside the field. The “Argument from Extra-Sensory Perception” is unheard of today but in his paper Turing treats it quite seriously. He considers this argument—that humans might be able to use E.S.P. to determine whether the entity at the other end of the teletype machine was a human or a computer—“quite a strong one” because “the statistical evidence, at least for telepathy, is overwhelming.”

Other arguments, however, are still cited. One of the most popular today, especially among people outside the field of AI, is what Turing called the “Heads in the Sand Objection.” Turing put it in a somewhat blunter form than it is normally heard but his paraphrase is about right: “The consequences of machines thinking would be too dreadful. Let us hope and believe that they cannot do so.” Other more thoughtful objections that we still see in various forms today are what Turing called “Lady Lovelace’s Objection” and the “Argument from Consciousness.” Lady Lovelace’s objection, named for the woman whose memoir is the primary source of information about one of the first ever computers, Babbage’s Analytical Engine, and based on some of her observations of this early computer, says that a computer can’t do anything new or surprising. Turing responds that he is frequently surprised by computers, such as when one of his programs gives a different answer to some calculation than he expects. This may seem flip but Turing is actually making a fairly subtle philosophical point—just because we know a certain fact doesn’t mean that all the logical consequences spring immediately to mind. In other words, a computer programmer doesn’t necessarily know exactly what his program is going to do. For example, most chess programs are better chess players than the programmers who wrote them. If having written a program gave the programmer absolute knowledge of what it would do, the programmer would always be able to predict his program’s next move and should be able to beat the computer.

The other objection, which is perhaps the favorite today, is the “Argument from Consciousness” which Turing takes from the Lister Oration of 1949 by Professor Jefferson:

Not until a machine can write a sonnet or compose a concerto because of thoughts and emotions felt, and not by chance fall of symbols, could we agree that machine equals brain—that is, not only write it but know that it had written it. No mechanism could feel (and not merely artificially signal, an easy contrivance) pleasure at its successes, grief when its valves fuse, be warmed by flattery, be made miserable by its mistakes, be charmed by sex, be angry or depressed when it cannot get what it wants.

The problem with this argument is that it is hard to imagine a way to know what, if anything, the computer is “feeling” unless one is the computer. Turing proposes that Professor Jefferson would rather “accept the imitation game as a test” than “adopt the extreme and solipsist point of view” that we cannot know that any entity exists besides ourselves. We shall see later, however, that even if Professor Jefferson would be willing to accept the test, there are other philosophers who are far less tractable.

Rose-Ann San Martino answered the same advertisement as Cynthia Clay and went for an interview. At the interview she had her first experience with a human-simulating computer program. The program was ELIZA, a rather simple program written nearly thirty years ago by an M.I.T. computer scientist, Joseph Weizenbaum, who is now on the committee running the test. ELIZA imitates a practitioner of Rogerian therapy—a style of therapy in which the therapist helps his clients find their own solution to their problems by simply echoing or restating what the clients say. When ELIZA was first introduced it created quite a stir because its responses seemed so human. In the AI world there is an anecdote about the vice-president of a computer company who, not realizing he was connected to ELIZA, thought he was talking via teletype to a person whom he threatened to fire. Even Weizenbaum’s secretary, who knew it was a program, once asked Weizenbaum to leave the room so she could talk privately with the program. The interviewers had Rose-Ann and the other prospective judges talk with the program for a few minutes to make sure they were comfortable having a conversation through a computer and to see what sorts of questions they would ask. At the end of the interview the interviewers asked Rose-Ann and the other applicants whether they thought the program, which was running on a laptop computer with no wires connecting it to anything, was a human or a computer. Over half of the applicants said it was a human.

Rose-Ann left the interview unsure whether she had been talking to a program or a person and still pretty much in the dark about how she might be able to tell a human from a clever computer program. At the test she assumed the entities at the other end of the terminals were “human until proven computer” and just tried to have interesting conversations. Her first conversational partner was a lucky choice—Cynthia Clay, the Shakespeare expert who wanted nothing more than to have a nice chat about Shakespeare. But because Cynthia was her first partner Rose-Ann couldn’t really tell if she was a human or a computer since she didn’t really have any idea how well computers would be able to carry on a conversation.

Later, after having conversations like the following with a program that was supposed to know “things a seven-year-old girl might know” it was easier to tell:

    Rose-Ann:  Hi are you in the second grade?

    Terminal:  Hello I'm Lori

    Rose-Ann:  Hi Lori what are you learning in school?

    Terminal:  As I said -- Hello I'm Lori

    Rose-Ann:  Lori do you go to school?

    Terminal:  I don't know.

    Rose-Ann:  Ok tell me abouit math.

    Terminal:  Acts are the way we make changes.

    Rose-Ann:  OK whta changes are you making?

    Terminal:  I don't know.

    Rose-Ann:  Tell me about learning to read

    Terminal:  Acts are the way we make changes.

    Rose-Ann:  Lori you are not maiking any sense.
               Do you have any friends?

    Terminal:  That isn't very good.

This program was the most irritating according to Rose-Ann: “It was either a really obnoxious person or a really bad program.” But other programs, even if they were less irritating, didn’t make much more sense. Rose-Ann said that the two biggest giveaways that a terminal was a program were that it repeated its answers and that it simply didn’t answer the questions.

II

Today, almost all discussion of the Turing test and the possibility of AI focuses on a 1980 article written by a philosopher from the University of California at Berkeley, John R. Searle. The article, “Minds, brains, and programs,” was published in Behavioral and Brain Sciences, whose editors solicited responses to the article from professionals in the fields of psychology, philosophy, and AI. They received twenty-seven responses including one which started “This religious diatribe against AI, masquerading as a serious scientific argument, is one of the wrongest, most infuriating articles I have ever read in my life.”

In his article Searle attacks a branch of AI known as “strong” AI. The central thesis of strong AI, also known as the computational hypothesis, is quite simple—if we give a computer the appropriate program it will be intelligent, conscious, self-aware, feel emotions, etc. In the words of one strong AIer, our brains are “machines made of meat,” and our “mind” is simply the right computer program. (Weak AI, on the other hand, only claims that computers can be useful in understanding our own intelligence.) Because Turing proved that all computers are equivalent, if our brain is simply a computer running a program then, in theory, it would be possible to discover the program and translate it onto an electronic computer. Searle claims that intelligence is not simply a matter of computation or symbol manipulation. His argument is known as the Chinese Room argument, after the thought experiment he proposes in the article.

Searle’s Chinese Room thought experiment runs basically like this: Imagine a man—a monolingual native English speaker—locked in a room. The door has a slot in it just big enough to pass index cards in and out of the room. In the room are stacks of index cards with Chinese characters printed on them and a large rule-book, written in English. The rule-book is a set of instructions for correlating different strings of Chinese characters. The rule-book gives pictures of the characters but no definitions. It might say, for example, “If you see the following sequence of characters, 你好吗？ send out this sequence of characters, 很好，谢谢.” If the man read Chinese he would know that the first set means “How are you?” and the second “I’m fine, thanks.” But the rule-book doesn’t tell him this and he doesn’t need to know it to follow the rule—all he has to do is recognize the shapes of the characters. In the room next door there is also a slot in the door and cards with Chinese characters on them but no rule-book. Instead there is a native speaker of Chinese. Outside in the hall are a bunch of other native Chinese speakers. They start to slip index cards with Chinese characters on them into the two rooms. The Chinese speaker, who has been told he is going to have a conversation via index cards, finds the cards with the characters he needs to formulate his response and sends them out through the slot. The English speaker, who has been told to follow the rules in the rule-book, looks up the incoming characters and follows the instructions in the rule-book to formulate his response, which he then sends out through his slot. The crucial assumption in this thought experiment is that the English speaker can pass this Chinese Turing test. That is, after an extended conversation, the people in the hall cannot determine which room contains the native Chinese speaker.
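
The kind of rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names RULE_BOOK and man_in_room are invented here, the book holds just the single rule quoted in the text, and returning nothing for an unlisted card is a guess, since the thought experiment does not say what the man does in that case.

```python
# Hypothetical one-rule rule-book: incoming card -> outgoing card.
# The keys are matched purely by shape; no meanings are stored anywhere.
RULE_BOOK = {
    "你好吗？": "很好，谢谢",
}


def man_in_room(incoming_card):
    """Return the card the rule-book prescribes, or None if no rule matches.

    Returning None for an unlisted card is an assumption of this sketch.
    """
    return RULE_BOOK.get(incoming_card)


print(man_in_room("你好吗？"))
```

Nothing in the lookup depends on what either string means, which is exactly the feature of the room that Searle's argument leans on.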

But, says Searle, isn’t it clear that the English speaker doesn’t understand anything? He looks up the symbols by their shape and follows rules that don’t tell him anything about what the characters mean. If you were to let him out of the room and ask him questions in English about the conversation he had just participated in, he wouldn’t be able to tell you anything about it. Searle concludes that no computer program could ever understand anything because programs do exactly what the English speaker was doing: they follow rules to manipulate meaningless symbols. (That is, meaningless to the computer—the words on a computer screen have meaning to any literate English speaker. But to the computer they are just meaningless strings of zeros and ones.)

Ultimately Searle is trying to prove that it is impossible to write an artificially intelligent computer program but, as he points out, “precisely one of the points at issue is the adequacy of the Turing test.” His thought experiment—or Gedankenexperiment as he puts it—is logically flawed because it assumes at the outset that there is a rule-book (i.e. an unintelligent entity) that would allow the English speaker to pass the native-Chinese-speaker Turing test. Yet the argument has an intuitive force—if we imagine we could write a computer program that could carry on conversations, certainly we could translate the program into a rule book, and obviously neither the English speaker nor the rule book understands anything. One response to this argument, popular with many of Searle’s opponents, is to “bite the bullet” and claim that, nevertheless, someone or something is understanding the conversation. The best attempt at biting the bullet that anyone has come up with so far is the “systems reply,” which Searle credits to computer scientists at Berkeley. The systems reply says, “Yes, you are right that the man in the room doesn’t understand the Chinese, but he is only part of the whole system of the room the way an individual neuron is only part of the system of our brain. Of course he doesn’t understand the conversation any more than a neuron understands anything, but the system—the room with all the cards, the rule-book, and the man—does understand.” Searle responds to the systems reply: fine, let the man memorize the rule-book and do all of the manipulations in his head. Now the man is the system and he still doesn’t understand. At this point, proponents of the systems reply pause to swallow the first bit of bullet they bit off and prepare to take another bite. They argue that Searle is making a mistake when he assumes that the man in the room would know if his mental processes were being used to understand Chinese. If it were really possible for the man to memorize all the rules and do all the manipulations in his head, then there could be a “passenger” personality, an entity that is using the man’s brain but is not the man, and who exists only because the man is doing the calculations prescribed by the rule-book. It is not clear what happens if the man stops thinking about the rule-book. Does the “passenger” personality go in and out of existence depending on whether the program is “running,” that is, whether the man is applying the rules in the book to Chinese characters?

At this point it is fair to observe that all this bullet biting is leading down some pretty strange paths. Maybe it would be easier to attack Searle’s argument from a different approach. Maybe the problem is in his initial assumption that a rule-book exists that would enable the man to pass the Turing test. There are certainly grounds for denying Searle’s right to make this assumption. For example, assuming the rules are of the sort described earlier (if you receive this string of characters send out that string of characters), a book with enough rules to carry on even a short conversation would be so large that it wouldn’t fit in the universe, even if every rule were inscribed on a single atom of the book.
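
A rough back-of-the-envelope calculation suggests why. The figures below are illustrative assumptions, not measurements and not numbers taken from Searle: a working set of 3,000 characters, cards of exactly 20 characters, and the commonly cited order-of-magnitude estimate of 10^80 atoms in the observable universe. The sketch also assumes that a sensible reply depends on everything said so far, so a pure lookup book would need one rule for every possible conversation history.

```python
# Illustrative assumptions only; none of these figures comes from the article.
CHARACTERS = 3_000            # assumed number of distinct characters in use
CARD_LENGTH = 20              # assumed number of characters on each card
ATOMS_IN_UNIVERSE = 10 ** 80  # commonly cited order-of-magnitude estimate


def rules_needed(cards_so_far):
    """Count the possible histories made of `cards_so_far` incoming cards.

    A pure "if you receive this, send that" book needs one rule per history.
    """
    return CHARACTERS ** (CARD_LENGTH * cards_so_far)


for cards in (1, 2, 3):
    needed = rules_needed(cards)
    verdict = "more" if needed > ATOMS_IN_UNIVERSE else "fewer"
    print(f"{cards} card(s): a {len(str(needed))}-digit number of rules, "
          f"{verdict} than the assumed number of atoms")
```

Under these assumptions a book covering a single incoming card would still fit, but by the second card the number of rules already exceeds the number of atoms. Different assumptions shift the crossover point without changing the explosive growth.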

But assume that the rule-book authors used a more clever scheme than this—there are still reasons to believe that Turing’s test is a powerful and subtle measure of intelligence that would not easily be fooled by a simple rule book. The best reason is that the Turing test is based on language and language is a powerful and subtle tool of human intelligence.

Certainly language is one of the most powerful means we have of getting inside another person’s mind. Consider this article. Hopefully it seems to you that it is the product of a thinking mind. Perhaps you can even detect something about the personality of that mind. The intuitive appeal of Searle’s argument is that we all know what a rule book is—simple, mechanical. But it’s hard to imagine a simple mechanical process that could generate this essay. Even writing a program that will summarize an already-written story in different words is quite difficult: one fairly successful program written in the late 70s, FRUMP, which read and summarized newspaper stories was given a story that started with the sentence, “The death of the Pope shook the world.” FRUMP summarized, “There was an earthquake in Italy. One person died.” Any rule book that would deal with language is going to have to be a lot more complex than what we ordinarily think of when we think of a rule book.

One philosopher, Robert French, thinks that language is too powerful a tool and thus that the Turing test is too powerful—it isn’t a test for intelligence, but for intelligence and a human experience. He points out that language enables the Turing test not only to detect intelligence, but also to probe the extremely complicated and convoluted networks of associated ideas that “are the product of a lifetime of interaction with the world which necessarily involves human sense organs.” He gives an example of a being exactly like a human, except with its eyes on its kneecaps instead of on its face. He claims that even this difference in perspective—which clearly doesn’t have anything to do with intelligence—would cause a big enough difference in ideas about the world (for example, long pants and crawling would have totally different meanings) that it could easily be detected by an astute judge in a Turing test.

None of this, however, proves anything either way about the validity of the Turing test. The best either side can do is to provide examples that get at our fundamental intuitions about thinking and language. But unfortunately arguments of this sort might turn out to be irrelevant. As we shall see later, it might be impossible to argue for the possibility of AI without adopting some form of the “bite-the-bullet” argument.

The winner of the BCM Turing test, “Whimsical Conversation” was a modified version of the commercially available program PC Therapist. The program was not written as an AI project, but simply to help people “let off steam, complain, or say anything” according to the programmer’s bio sheet provided by the museum. It has a collection of thousands of quotations such as, “I just got wonderful news from my real estate agent in florida—they found land on my property,” and “A liar should have a good memory.” It also has a few rules for transforming what the judge says and spitting it back. For example, if a judge says “you [something]” then “Whimsical” is liable to spit it back changing the “you” to an “I” and even changing the verb form of the verb “to be” and the case of the object, and prefacing it with a short question. For example, “you’re really trying to let me have it,” comes back as, “Does it please you to believe I am really trying to let you have it?” Of course this doesn’t always work. When one of the judges asked, “Can’t you once just answer me a direct question with a direct answer?” the program replied “You like to think I once just answer you, don’t you?” After this response the judge warned the program that it had better “get in the habit of answering people directly” or it might “end up in the corner at Lechmeres,” a local appliance store.
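
The echoing trick described above is easy to approximate. The following Python sketch is not the PC Therapist code, which is not shown anywhere in this article. It is a guess at the simplest rule that reproduces the first exchange quoted above, and the names SWAPS, reflect, and respond are invented for illustration. The fallback reply reuses one of the stock quotations mentioned above.

```python
# Hypothetical pronoun swaps; a real program would need many more.
SWAPS = {
    "you're": "I am",
    "you": "I",
    "me": "you",
    "my": "your",
    "your": "my",
    "i": "you",
    "i'm": "you are",
}


def reflect(sentence):
    """Swap first- and second-person words, one word at a time."""
    words = sentence.strip().rstrip(".!?").split()
    return " ".join(SWAPS.get(word.lower(), word) for word in words)


def respond(sentence):
    """Echo a "you ..." sentence back as a question, else fall back on a quotation."""
    words = sentence.split()
    if words and words[0].lower() in ("you", "you're"):
        return "Does it please you to believe " + reflect(sentence) + "?"
    return "A liar should have a good memory."


print(respond("you're really trying to let me have it"))
# prints: Does it please you to believe I am really trying to let you have it?
```

A word-for-word swap like this knows nothing about grammar, which is why such tricks produce replies as mangled as the second one quoted above.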

The programmer even programmed “Whimsical” to make typing errors, such as typing the wrong letter and then backing up to correct it, all of which was shown on the judges’ terminals, which received each message keystroke by keystroke. Several programs tried this tactic to add a bit of a human touch but some were more successful than others—one program made a typo, backed up to fix it, and then typed the correction ten times faster than any human could.

The problem with these kinds of tricks—the typing foibles and the echoing of the judges’ sentences—is that they don’t seem to reflect any sort of understanding by the computer but rather the cleverness of the programmer. But for some people these tricks were good enough: Rose-Ann said that she really felt “Whimsical” had a personality and that she felt “a sort of sadness” when she found out, after the contest, that it wasn’t a person.

Opponents of Searle might like to avoid the strangeness of the bullet-biting arguments by arguing that his rule-book could never exist. But this might not be possible because of Turing’s result mentioned earlier—that all computers are fundamentally equivalent. To understand what this has to do with Searle and with the Turing test, it is necessary to understand in a little greater detail what Turing showed.

Turing’s demonstration of the equivalence of all computers originally came out of his attempt to answer a question posed by the mathematician David Hilbert in 1900. The question was, does there exist an algorithm that will determine the truth or falsehood of any mathematical statement?1 Loosely defined, an algorithm is a mechanical procedure with a finite number of steps that will always solve a certain problem, given enough time. We use algorithms all the time—when you deal a hand of bridge you use an algorithm. The problem you are trying to solve is to evenly divide the deck among four players. The algorithm can be written in three steps: Step 1: give a card to the person to your left; Step 2: give a card to the person to the left of the person you last gave a card to; Step 3: if there are still cards in your hand go back to Step 2 and continue, otherwise stop. If you follow this procedure you will automatically distribute thirteen cards to each player. This might seem simple and uninteresting, but imagine how you might have to deal a hand of bridge if you were not allowed to use an algorithm. One way would be to divide the deck by feel into what seem like four even piles and give one pile to each player. Then have the players count their piles. If not everyone has thirteen cards, collect the cards and try again. If you are unable to evenly divide the deck by feel you might never get to play bridge. Clearly the first method is less frustrating, though the second could conceivably be faster (say, if you divided right the first time). But even the second method is not algorithm-free—it cheats by allowing the players to use an algorithm (counting) to find out whether you have succeeded in dealing the deck properly. In mathematics, coming up with a possible theorem is analogous to splitting the deck. The analog to the counting algorithm would be the algorithm Hilbert wanted to find.
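
The three dealing steps translate almost line for line into a short Python sketch. The function name deal and the choice to number the seats 0 to 3, starting with the player on the dealer's left, are conventions made up for this illustration.

```python
def deal(deck, players=4):
    """Deal every card in `deck`, one at a time, around the table."""
    cards = list(deck)                 # copy so the caller's deck is untouched
    hands = [[] for _ in range(players)]
    seat = 0                           # Step 1: start with the player to the left
    while cards:                       # Step 3: keep going while cards remain
        hands[seat].append(cards.pop(0))
        seat = (seat + 1) % players    # Step 2: move one seat further left
    return hands


hands = deal(range(52))
print([len(hand) for hand in hands])
# prints: [13, 13, 13, 13]
```

The procedure never counts anything, yet it is guaranteed to stop after a finite number of steps with the deck evenly divided, which is what makes it an algorithm in the sense defined above.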

To answer this question Turing had to formally define an algorithm. Because an algorithm is a mechanical process he proposed to define an algorithm as a set of operations that can be performed by a certain type of machine. The machine he proposed, now called a Turing machine, is really not a machine at all, but an abstraction.

Picture an infinitely long cash register tape with lines dividing it into squares like this: … [][][][][][][][] … Now imagine a machine that can move the tape either left or right one square at a time and which has a window through which it can see one, and only one, square. On each square of the tape there is either a 0 or a 1, and the machine can read the character in the window and then change it or leave it the same. A program for this sort of machine is a list of “states.” Each state is a function which takes as its input the character in the window and returns the character to put on that square, the next state, and what direction to move the tape (or to stop). For example, the state A could be stated in English as: if there is a 1 in the window then leave the 1 unchanged, leave the state set to A, and move the tape left; if there is a 0 in the window then change it to a 1, set the state to B, and move the tape left. It is fairly simple to demonstrate a three-state Turing machine that will add two numbers, each represented as a string of 1’s, putting the answer in the same notation 2. Programs with larger numbers of states can perform extremely complicated operations. What Turing showed was that it was quite reasonable to define an algorithm as any operation that can be programmed on a Turing machine in a finite number of states. Under this definition the answer to Hilbert’s question is no, there is no algorithm which will determine the truth or falsehood of any mathematical theorem.
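
One possible three-state adding machine can be sketched in Python. It is offered as an illustration under stated assumptions and is not necessarily the machine the footnote has in mind. The sketch moves a read head across a fixed tape, which is equivalent to moving the tape under a fixed window in the opposite direction. It treats 0 as a blank square, separates the two numbers with a single 0, and uses the invented label "HALT" for stopping. Its state A is the state A described in the text.

```python
from collections import defaultdict

# (state, symbol read) -> (symbol to write, head movement, next state)
# Head movement +1 is one square right (the tape moving left), -1 is one square left.
ADD_UNARY = {
    ("A", 1): (1, +1, "A"),     # skip over the first number
    ("A", 0): (1, +1, "B"),     # fill the gap between the two numbers
    ("B", 1): (1, +1, "B"),     # skip over the second number
    ("B", 0): (0, -1, "C"),     # ran off the end, so step back
    ("C", 1): (0, 0, "HALT"),   # erase one surplus 1 and stop
}


def run(program, symbols, state="A", max_steps=10_000):
    """Run `program` on a tape holding `symbols`; return the final tape as a list."""
    tape = defaultdict(int, enumerate(symbols))   # unwritten squares read as 0
    head, steps = 0, 0
    while state != "HALT":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        write, move, state = program[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    return [tape[i] for i in range(min(tape), max(tape) + 1)]


def add(m, n):
    """Add two numbers written as strings of 1's separated by a single 0."""
    return sum(run(ADD_UNARY, [1] * m + [0] + [1] * n))


print(add(3, 2))
# prints: 5
```

The whole program is nothing but the table ADD_UNARY. Because that table is itself just data, it could in turn be written out as symbols on a tape.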

But the result that we are more interested in is one he proved as a part of his answer to Hilbert’s question. He showed that the program, that is, the set of states, for any Turing machine could be encoded as a string of 0’s and 1’s and fed into a “universal” Turing machine that would then effectively be that particular Turing machine. By definition, all computer programs are algorithms, which is just another way of stating Searle’s claim that computers can only manipulate symbols. This means that any computer program that can be run on any computer could be encoded on a strip of paper tape and run on a universal Turing machine. This fact has serious implications for the strong AI claim that the right computer program will give a computer intelligent self-awareness. That claim, with Turing’s result in mind, implies that a Turing machine, slowly chunking a paper strip back and forth and changing 0’s to 1’s and 1’s to 0’s, could be a self-aware entity. In other words, if we want to claim that thinking is computation then we have no choice but to bite the bullet and admit that in theory the Chinese room could be a thinking entity.3 This seems an awfully strange idea, but on the other hand the idea that our own consciousness is simply the result of a combination of electrical currents in our brain is pretty strange itself.
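To make the encoding step concrete, here is one way a machine’s set of states could be written down as a string of 0’s and 1’s and read back again. This is a sketch, not Turing’s own scheme: the eight-character layout for each rule and the names PROGRAM, encode, and decode are assumptions chosen for illustration, and the program encoded is the three-state adder from footnote 2.

```python
STATES = ["A", "B", "C"]
MOVES = ["left", "right", "stop"]

# (state, character read) -> (character to write, next state, tape move)
PROGRAM = {
    ("A", 1): (1, "A", "left"),
    ("A", 0): (1, "B", "left"),
    ("B", 1): (1, "B", "left"),
    ("B", 0): (0, "C", "right"),
    ("C", 1): (0, "C", "stop"),
}

def encode(program):
    """Write each rule as 8 characters: state (2), read (1), write (1), next state (2), move (2)."""
    bits = ""
    for (state, read), (write, nxt, move) in sorted(program.items()):
        bits += (f"{STATES.index(state):02b}{read}{write}"
                 f"{STATES.index(nxt):02b}{MOVES.index(move):02b}")
    return bits

def decode(bits):
    """Rebuild the table of rules from the string produced by encode."""
    program = {}
    for i in range(0, len(bits), 8):
        rule = bits[i:i + 8]
        state = STATES[int(rule[0:2], 2)]
        read, write = int(rule[2]), int(rule[3])
        nxt = STATES[int(rule[4:6], 2)]
        move = MOVES[int(rule[6:8], 2)]
        program[(state, read)] = (write, nxt, move)
    return program

tape = encode(PROGRAM)
print(len(tape))                  # 40
print(decode(tape) == PROGRAM)    # True
```

Once the rules are nothing but characters on a tape, a second machine can read them and act them out, which is all that the universal machine does.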

At the end of the BCM Turing test, the confederates got to come out of their room and meet the judges. Cynthia Clay laughed with the two judges who had thought she was a computer, while reporters tried to get the opinions of the various computer experts in attendance. Many experts expressed some surprise or even disappointment at how easily the judges were fooled. Dr. Weizenbaum, the programmer who wrote ELIZA thirty years ago, told the Wall Street Journal he was “disturbed” by the ease with which some of the judges were fooled. Ned Block, a professor at M.I.T., told reporters that the Turing test was not a good measure of machine intelligence because “it’s too easy to fool people.” Yet all the experts claimed they could tell almost immediately which terminals were being run by humans and which by computers. Reading the transcripts myself, I was usually able to tell within a page or two which was which. Daniel Dennett, a philosophy professor from Tufts and the chairman of the Loebner prize fund committee, reminded people that this test was only a warm-up for the true, unrestricted Turing test. Dr. Robert Epstein, who directed the test, is already looking forward to future tests, but thinks it will be a while before there are any programs ready to try the full Turing test. But if and when the day comes that a computer passes the full test, both Dennett and Epstein are absolutely ready to grant the computer full status as a self-aware, conscious entity. “And,” says Epstein, “if anyone doubts it they would have to argue with the computer.”

1 Mathematical statements, also known as “theorems,” are the entire product of a mathematician’s work. A simple theorem is: if you add an even number and an odd number you get an odd number. Mathematics is simply a collection of theorems, and mathematicians are constantly trying to find more true statements to add to their collection. The difficulty lies not, however, in finding the statements, nor even, necessarily, in knowing whether or not they are true. The difficulty is in proving or disproving them.

2 If we represent the state I just gave as an example in the notation A:(1 → 1,A,left; 0 → 1,B,left) then the other two states are B:(1 → 1,B,left; 0 → 0,C,right) and C:(1 → 0,stop). There is no need to say what to do if the computer reads a “0” in state C because this cannot happen. To use this program to solve a sample problem, “3+4,” by hand, first notate the problem as the string “11101111.” Then put your pencil (the window) on the leftmost “1” and your mind in state A. Change the character and move the paper under your pencil (in that order) as indicated by the state you are in. Then execute the next command based on what state you switch to. You should end up with the string “1111111” to the left of your pencil, which is on the “0.” This string, whatever is to the left of your pencil, is the answer. This program can add any two numbers separated by a “0” and could be modified, by adding a few extra states, to add any finite number of numbers separated by “0’s.”

3 This might not be absolutely true. It would fail if someone could show that thinking was not strictly computational, that is, that some non-mathematical difference between two computers could make the difference between thinking and not thinking, even when the computers were running the same program. Strong AIers, however, want to avoid this argument if they possibly can because it allows someone like Searle to say that the only “computer” that can run a program and achieve thought is the human brain.