Sunday, June 15, 2014

The Turing Test for A.I.

The Turing test is a test of a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. In the original illustrative example, a human judge engages in natural language conversations with a human and with a machine designed to generate performance indistinguishable from that of a human being. All participants are separated from one another. If the judge cannot reliably tell the machine from the human, the machine is said to have passed the test. The test does not check the ability to give correct answers to questions; it checks how closely the answers resemble those a human would typically give. The conversation is limited to a text-only channel, such as a computer keyboard and screen, so that the result does not depend on the machine's ability to render words into audio.

The test was introduced by Alan Turing in his 1950 paper "Computing Machinery and Intelligence," which opens with the words: "I propose to consider the question, 'Can machines think?'" Because "thinking" is difficult to define, Turing chose to "replace the question by another, which is closely related to it and is expressed in relatively unambiguous words." Turing's new question was: "Are there imaginable digital computers which would do well in the imitation game?" This question, Turing believed, is one that can actually be answered. In the remainder of the paper, he argued against all the major objections to the proposition that "machines can think."

In the years since 1950, the test has proven to be both highly influential and widely criticized, and it is an essential concept in the philosophy of artificial intelligence.

Alan Turing

Researchers in the United Kingdom had been exploring "machine intelligence" for up to ten years prior to the founding of the field of AI research in 1956. It was a common topic among the members of the Ratio Club, an informal group of British cybernetics and electronics researchers that included Alan Turing, after whom the test is named.

Turing, in particular, had been tackling the notion of machine intelligence since at least 1941 and one of the earliest-known mentions of "computer intelligence" was made by him in 1947.  In Turing's report, "Intelligent Machinery", he investigated "the question of whether or not it is possible for machinery to show intelligent behaviour" and, as part of that investigation, proposed what may be considered the forerunner to his later tests:

It is not difficult to devise a paper machine which will play a not very bad game of chess. Now get three men as subjects for the experiment. A, B and C. A and C are to be rather poor chess players, B is the operator who works the paper machine. ... Two rooms are used with some arrangement for communicating moves, and a game is played between C and either A or the paper machine. C may find it quite difficult to tell which he is playing.

"Computing Machinery and Intelligence" (1950) was the first published paper by Turing to focus exclusively on machine intelligence. Turing begins the 1950 paper with the words, "I propose to consider the question 'Can machines think?'" As he highlights, the traditional approach to such a question is to start with definitions, defining both the terms "machine" and "think". Turing chooses not to do so; instead he replaces the question with a new one, "which is closely related to it and is expressed in relatively unambiguous words." In essence he proposes to change the question from "Can machines think?" to "Can machines do what we (as thinking entities) can do?" The advantage of the new question, Turing argues, is that it draws "a fairly sharp line between the physical and intellectual capacities of a man."

ELIZA and PARRY

In 1966, Joseph Weizenbaum created a program which appeared to pass the Turing test. The program, known as ELIZA, worked by examining a user's typed comments for keywords. If a keyword was found, a rule that transformed the user's comments was applied, and the resulting sentence was returned. If no keyword was found, ELIZA responded either with a generic riposte or by repeating one of the earlier comments. In addition, Weizenbaum developed ELIZA to replicate the behaviour of a Rogerian psychotherapist, allowing ELIZA to be "free to assume the pose of knowing almost nothing of the real world." With these techniques, Weizenbaum's program was able to fool some people into believing that they were talking to a real person, with some subjects being "very hard to convince that ELIZA [...] is not human." Thus, ELIZA is claimed by some to be one of the programs (perhaps the first) able to pass the Turing test, even though this view is highly contentious.
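The keyword-and-rule mechanism described above can be illustrated with a minimal Python sketch. It is not Weizenbaum's code or script: the rules, the pronoun table, and the names RULES, REFLECTIONS, reflect and respond are all hypothetical, chosen only to show keyword matching, a transformation of the user's comment, and the two fallbacks (a generic riposte, or repeating an earlier comment).

```python
import re

# Hypothetical keyword rules (pattern, response template); not ELIZA's real script.
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Swap first- and second-person words so an echoed fragment reads naturally.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "I", "your": "my"}

GENERIC_RIPOSTE = "Please go on."


def reflect(fragment):
    """Drop trailing punctuation and swap pronouns word by word."""
    words = fragment.rstrip(".!?").split()
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in words)


def respond(comment, history):
    """Return a reply to `comment`; `history` collects comments that matched a rule."""
    for pattern, template in RULES:
        match = pattern.search(comment)
        if match:
            history.append(comment)
            # Keyword found: apply the rule that transforms the user's comment.
            return template.format(reflect(match.group(1)))
    if history:
        # No keyword: repeat one of the earlier comments back to the user.
        return "Earlier you said: " + reflect(history[-1])
    # No keyword and nothing to repeat: fall back to a generic riposte.
    return GENERIC_RIPOSTE


if __name__ == "__main__":
    history = []
    for line in ["Hello", "I am sad about my job.", "The weather is nice."]:
        print(">", line)
        print(respond(line, history))
```

Even this toy version never needs any knowledge of the world, which is the point of the Rogerian framing: every reply is built from the user's own words or from a stock phrase.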

In 1972, Kenneth Colby created PARRY, a program described as "ELIZA with attitude". It attempted to model the behaviour of a paranoid schizophrenic, using a similar (if more advanced) approach to that employed by Weizenbaum. In order to validate the work, PARRY was tested in the early 1970s using a variation of the Turing test. A group of experienced psychiatrists analysed a combination of real patients and computers running PARRY through teleprinters. Another group of 33 psychiatrists was shown transcripts of the conversations. The two groups were then asked to identify which of the "patients" were human and which were computer programs. The psychiatrists were able to make the correct identification only 48 percent of the time, a figure consistent with random guessing.
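To see why a 48 percent hit rate is consistent with random guessing, one can compute an exact two-sided binomial p-value against a fair coin. The sketch below is illustrative only: the text does not say how many identifications were made, so the figure of 100 judgments (48 of them correct) is a hypothetical assumption, and the function name p_value_vs_coin_flip is made up for this example.

```python
from math import comb


def p_value_vs_coin_flip(correct, total):
    """Exact two-sided binomial p-value for `correct` hits in `total` fair guesses.

    With success probability 0.5, every outcome i has probability
    comb(total, i) / 2**total, so the outcomes at least as unlikely as the
    observed one are those whose binomial coefficient is no larger.
    """
    observed = comb(total, correct)
    extreme = sum(comb(total, i) for i in range(total + 1)
                  if comb(total, i) <= observed)
    return extreme / 2 ** total


if __name__ == "__main__":
    # Hypothetical sample size: the source gives only the 48 percent rate.
    print(p_value_vs_coin_flip(correct=48, total=100))
```

Under that assumed sample size the p-value is far above any conventional significance threshold, meaning that judges who simply flipped a coin would produce a result at least this far from 50 percent most of the time.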

In the 21st century, versions of these programs (now known as "chatterbots") continue to fool people. "CyberLover", a malware program, preys on Internet users by convincing them to "reveal information about their identities or to lead them to visit a web site that will deliver malicious content to their computers". The program has emerged as a "Valentine-risk", flirting with people "seeking relationships online in order to collect their personal data".

Some human behavior is unintelligent

The Turing test requires that the machine be able to execute all human behaviors, regardless of whether they are intelligent. It even tests for behaviors that we may not consider intelligent at all, such as the susceptibility to insults, the temptation to lie or, simply, a high frequency of typing mistakes. If a machine cannot imitate these unintelligent behaviors in detail it fails the test.

This objection was raised by The Economist in an article entitled "Artificial Stupidity", published in 1992, shortly after the first Loebner Prize competition. The article noted that the first Loebner winner's victory was due, at least in part, to its ability to "imitate human typing errors." Turing himself had suggested that programs add errors into their output, so as to be better "players" of the game.

Some intelligent behavior is inhuman

The Turing test does not test for highly intelligent behaviors, such as the ability to solve difficult problems or come up with original insights. In fact, it specifically requires deception on the part of the machine: if the machine is more intelligent than a human being it must deliberately avoid appearing too intelligent. If it were to solve a computational problem that is practically impossible for a human to solve, then the interrogator would know the program is not human, and the machine would fail the test.

Because it cannot measure intelligence that is beyond the ability of humans, the test cannot be used to build or evaluate systems that are more intelligent than humans. For this reason, several alternative tests that would be able to evaluate superintelligent systems have been proposed.
