Mechanical Conversation 1800, 1900, 2000 – from von Kempelen's 'Speech Organ' to 'Alice the Chatterbot'

Sabine Rossbach

eJournals Kodikas/Code 34/1-2

Kodikas/Code

0171-0834

2941-0835

Narr Verlag Tübingen

This is an open-access article published under the terms of the CC BY 4.0 license. http://creativecommons.org/licenses/by/4.0/

Is it possible to construct a machine whose conversation could be taken for human? From Descartes to Turing, from von Kempelen to the programmers of ALICE, cybernetic theorists and engineers have posed this question. The machines they built to answer it are described below – a sobering tale.

2011

“We can only see a short distance ahead, but we can see plenty there that needs to be done” (Alan Turing, 1950)

What if androids could no longer be distinguished from real people? What if their bodies, features, movements and gestures perfectly imitated those of their models - if even their voices, however artificially contrived, conformed in tonality and expression to the human norm? The question asked by Philip K. Dick in his 1968 novel Do Androids Dream of Electric Sheep? became the blueprint for Ridley Scott’s cult film Blade Runner (1982). Dick’s Nexus-6 is a strikingly human-looking series of androids produced by the Rosen Association and marketed as service tools for human settlers on Mars: in this futuristic world the planet Earth has long been virtually uninhabitable. The more humanoid the robots, the better able they are to perform their menial tasks, and Rosen’s supermodel promises to be a top seller. But not all robots accept their subordinate role: there are repeated cases in which they murder their human masters and escape back to Earth in the attempt to lead an independent life, even if this can only be for a few years, because the engineers who made them have so far been unable to master the problem of cell division, and the androids remain mortal. As runaway murderers they are also, in Dick’s dystopia, highly dangerous and must therefore be eliminated. But these perfect androids cannot be outwardly distinguished from humans or detected by any normal means.
There is only one way out of this apparently hopeless situation, only one difference between Nexus-6 and the people around it: empathy - or rather its absence. The androids can imitate empathy but cannot feel it, and the difference has so far always been detectable by means of the Voigt-Kampff test, a hurdle no conventional android has ever passed. Will Nexus-6 succeed? One of the high points of Dick’s novel is the moment when the test is administered to the new generation of robots. If one of them successfully conceals its artificiality, the assimilation of machine to man - the dream of centuries, if not millennia - can be deemed complete. The test runs as follows:

Rick Deckard said, “I’m going to outline a number of social situations. You are to express your reaction to each as quickly as possible. You will be timed, of course.” [...] “In a magazine you come across a full-page color picture of a nude girl [...] Your husband likes the picture.” [...] “The girl,” he added, “is lying face down on a large and beautiful bearskin rug.” The gauges remained inert, and he said to himself, An android response. Failing to detect the major element, the dead animal pelt. Her - its - mind is concentrating on other factors. [...] “Now consider this. You’re reading a novel written in the old days before the war. The characters are visiting Fisherman’s Wharf in San Francisco. They become hungry and enter a seafood restaurant. One of them orders lobster, and the chef drops the lobster into the tub of boiling water while the characters watch.” “Oh god,” Rachael said. “That’s awful! Did they really do that? It’s depraved! You mean a live lobster?” The gauges, however, did not respond. Formally, a correct response. But simulated. (Dick, 1968: 48-50)

The machine is revealed as a machine through the empathy test.
These virtually human androids are the product of pure imagination, projected in Dick’s novel onto the world of 2021 - now only ten years ahead. In our present-day world of 2011 no robot, cyborg or artificial human has come anywhere near fulfilling such a norm, and we need have no fear that Dick’s dystopia will be tomorrow’s reality. Every year the Wired Next Fest presents the state of the artificial-human art, and every year the public waits eagerly for the latest sensation - automatons that can dance, play the piano or give a speech. The closest approximation to Dick’s vision that human ingenuity has yet contrived is probably the robots of Hiroshi Ishiguro, director of the Intelligent Robotics Laboratory at Osaka University. His Q1expo - modeled on one of Japan’s favorite TV presenters - made its (or her) bow at the Wired Next Fest in 2005. Since then it has been further developed and replaced by its successor, Q2.[1] Both machines are remarkably precise replicas, imitating the movements and body language of the Japanese woman and breathing regularly like her when at rest, which has led people again and again to mistake them for the real person. But could Ishiguro’s replicants pass the empathy test? That, we can say with some certainty, would be too much for artifacts that can as yet only manage a simple, preprogrammed conversation.

The interactive Actroids can communicate on a rudimentary level with humans by speaking. Microphones within those Actroids record the speech of a human, and this sound is then filtered to remove background noise - including the sounds of the robot’s own operation. Speech recognition software is then used to convert the audio stream into words and sentences, which can then be processed by the Actroid’s A.I. A verbal response is then given through speakers external to the unit (http://en.wikipedia.org/wiki/Actroid).

This is far removed from the eloquence of a Nexus-6.
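The processing chain in the Actroid description (record, filter, recognize, respond) can be sketched roughly as below. This is only an illustration under loud assumptions: the source does not describe the real Actroid software, so every name here (AudioFrame, filter_noise, recognize_speech, choose_reply, CANNED_REPLIES) is hypothetical, and plain text stands in for audio.

```python
from dataclasses import dataclass

# Hypothetical sketch: none of these names come from real Actroid software.


@dataclass
class AudioFrame:
    """Stand-in for a chunk of microphone audio."""
    source: str   # "human", "background", or "robot"
    content: str  # text used in place of the recorded sound


def filter_noise(frames):
    """Keep only the human speaker; drop background and the robot's own sounds."""
    return [f for f in frames if f.source == "human"]


def recognize_speech(frames):
    """Turn the filtered 'audio' into a list of lowercase words."""
    text = " ".join(f.content for f in frames)
    return [w.strip(".,!?").lower() for w in text.split()]


# Assumed, invented replies: a "simple, preprogrammed conversation".
CANNED_REPLIES = {
    "hello": "Hello. Welcome to the exhibition.",
    "name": "I am a demonstration android.",
}
FALLBACK = "I am sorry, I did not understand."


def choose_reply(words):
    """Return the canned reply for the first keyword heard, else a fallback."""
    for word in words:
        if word in CANNED_REPLIES:
            return CANNED_REPLIES[word]
    return FALLBACK


def respond(frames):
    """Filter, recognize, and reply; recording and loudspeaker output are omitted."""
    return choose_reply(recognize_speech(filter_noise(frames)))
```

A call such as respond([AudioFrame("human", "Hello there!")]) returns the canned greeting. The sketch also makes the limitation visible: anything outside the small reply table falls straight through to the fallback sentence.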
Does this mean that, measured against the yardstick of our reality, Dick’s utopian world of robots is a mere figment of the imagination? Not entirely, for the Voigt-Kampff test at least is modeled on one of the earliest products of computer science, a test developed by Alan Turing in 1950. Turing wanted to establish whether computers could be said to possess intelligence: “I propose to consider the question, ‘Can machines think?’” (Turing, 1950: 433). He concluded with exemplary honesty that it was impossible to answer the question; all one could do was establish whether a machine could simulate thought. This could be achieved by allowing it to play a suitably modified ‘imitation game’ - a family pastime as familiar as charades. In this game player C holds a ‘conversation’ with player A (male) and player B (female) in order to find out which of them is the man and which the woman. The players make no voice or visual contact with each other; they communicate exclusively by means of short written texts. B helps C by answering all the questions truthfully, while A seeks to lead C astray. Turing’s goal was to see “What will happen when a machine takes the part of A in this game?” Will it succeed in convincing the other two players that it, like them, is human? If it does, Turing declared, we must conclude that it can think. For many scientists this was going too far. They would have been happier if Turing had stayed with his earlier opinion: after all, to simulate thought is one thing, to think is another. Or as Searle put it in the context of his Chinese room experiment, no one can tell whether or not the machine actually understands the meaning of its utterances (http://en.wikipedia.org/wiki/Chinese_room). Leaving aside these objections, the Turing test has played a key role in the development of the computer sciences, formulating the task that has occupied a whole generation of programmers.
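The machine variant of the imitation game can be sketched as a small protocol, assuming a simplified reading in which an interrogator sees only written answers from two hidden players and must name the machine. All function names and the toy participants below are invented for illustration; they are not part of Turing’s paper.

```python
import random


def play_round(interrogator, machine, human, questions, rng):
    """Run one text-only round; return True if the interrogator picks out the machine."""
    labels = ["X", "Y"]
    rng.shuffle(labels)  # hide who is who behind neutral labels
    hidden = {labels[0]: machine, labels[1]: human}
    # Each player answers every question in writing; no voice or visual contact.
    transcript = {
        label: [(q, hidden[label](q)) for q in questions] for label in ("X", "Y")
    }
    guess = interrogator(transcript)  # label the interrogator takes for the machine
    return hidden[guess] is machine


def deception_rate(interrogator, machine, human, questions, rounds=200, seed=0):
    """Fraction of rounds in which the machine is NOT identified."""
    rng = random.Random(seed)
    caught = sum(
        play_round(interrogator, machine, human, questions, rng) for _ in range(rounds)
    )
    return 1 - caught / rounds


# Toy participants, invented for illustration only.
def canned_machine(question):
    return "That is an interesting question."


def toy_human(question):
    return f"Let me think about '{question}' for a moment."


def repetition_detector(transcript):
    """Accuse whichever player gave the fewest distinct answers."""
    return min(transcript, key=lambda label: len({a for _, a in transcript[label]}))


QUESTIONS = ["What did you have for breakfast?", "Do you play chess?"]
```

With these toy players, deception_rate(repetition_detector, canned_machine, toy_human, QUESTIONS) is zero, because a machine that repeats itself is caught in every round. Passing the test in this framing would mean a rate near one half, the point at which the interrogator does no better than guessing.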
As early as 1947 Turing coined the term ‘computer intelligence’, and the founders of Artificial Intelligence research did not hesitate to invoke his name at the 1956 Dartmouth Conference. The declared aim of AI was to build a computer that would pass the Turing test, and to do so as quickly as possible. Turing himself had predicted that this would take until around 2000. But time is relative, and both he and the early AI scientists have been proved wrong. In 2011 Barry Gibb’s words still hold good: “to date, there are no machines that can convincingly pass the Turing test” (Gibb, 2007: 236).

Turing, however, was by no means the first to speculate about intelligent machines. Descartes had already asked in his Discourse on Method (1637) whether machines could think, and he too posited speech as the criterion of that ability. Is it possible, he wondered, for a machine to make sensible conversation without external aid? His answer was much the same as Turing’s three centuries later:

If there were machines bearing the image of our bodies, and capable of imitating our actions as far as it is morally possible, there would still remain two most certain tests whereby to know that they were not therefore really men. Of these the first is that they could never use words or other signs arranged in such a manner as is suitable to us in order to reveal our thoughts to others: for we may easily conceive a machine to be so constructed that it emits words, and even that it emits some correspondent to the action upon it of external objects which cause a change in its organs; for example, if touched in a particular place it may demand what we wish to say to it; if in another it may cry out that it is hurt, and such like; but not that it should arrange them variously so as to reply to what is said in its presence, as men of the lowest grade of intellect can do (Descartes).
Only if a machine can speak, therefore, is it worth asking the further question whether it can also think - or whether, like a parrot, it simply mimics speech. For “magpies and parrots can utter vowels like ourselves, and are yet unable to speak as we do, that is, so as to show that they understand what they say”. A century after Descartes more optimistic voices could be heard. Thus in 1761 Léonard Euler wrote in his “Lettre à une princesse d’Allemagne”:

A great proof of the marvelous structure of our mouth, which renders it able to pronounce words, is without doubt that human speech has not yet been successfully imitated by a machine. […] It would undoubtedly be a most important invention to construct a machine properly able to utter all the sounds of our words, with their articulations. Should one ever succeed in creating such a machine and attain a point at which, like an organ or harpsichord, it could be brought to pronounce all the words of the language by means of certain strokes, every observer would be duly surprised. […] And the matter does not seem to me impossible (Chapuis, 1928: 202).

It was in this upbeat climate that Vaucanson, renowned for his engineering virtuosity, embarked on his first experiments with talking machines, and since his time the experiments have never ceased. In 1783 the Abbé Mical exhibited at the Parisian Académie des Sciences two bronze heads that could move their lips and speak whole sentences. A contemporary report reads:

They are of natural proportions and very well executed, but golden in color, which is in poor taste […] As to the four phrases that they are able to pronounce successively, imitating the visible movement of the lips, there are some words that they mangle entirely; their voice is raucous in tone and their articulation slow (Chapuis, 1928: 204).

Mical created his talking heads for a competition set by the St. Petersburg Academy of Sciences in 1779 for the machine that could utter the five vowel sounds a, e, i, o, u in the most natural manner. One of Mical’s competitors was the German Baron von Kempelen. Neither of them won the prize, but it is von Kempelen’s entry that has, up to our own day, attracted the interest of historians, not least because of the detailed description of its workings that the Baron appended for the benefit of the St. Petersburg Academy (Kempelen, 1791). Von Kempelen’s machine was the sort of invention Euler had envisaged, essentially a miniature organ complete with wind-box and voice-pipe in which the outflow of air was modulated by means of stops controlled by a keyboard with thirteen keys. Played like a musical instrument, the device could produce a range of sounds approximating those of the vowels and certain consonants. Goethe is said to have been impressed, but von Kempelen evidently wanted better things and continued to work at his machine for a number of years. In 1782 the Augspurgische Extra-Zeitung reported that

Privy Counselor von Kempele has long been working in Pressburg [Bratislava] on a speaking machine and has reached a point where he can submit to the scrutiny of the learned world the achievements of a new and hitherto unknown invention. As yet only the head is finished, but it is able to answer a number of questions clearly for all to hear. Its voice is a soft alto, pleasant in tone, though it pronounces the R somewhat gutturally. ‘I have’, our correspondent writes, ‘heard it speak in four languages, German, Latin, Italian and French, and indeed satisfactorily pronounce the most difficult words of these languages.’ Herr von Kempele now plans without delay to travel through the Netherlands, France and England, where he will exhibit his machine (Augspurgische Extra-Zeitung, 1782: No. 229).
Though he enjoyed a somewhat tarnished reputation (his chess-playing Turk had been unmasked as concealing a small human operator rather than a mechanical drive), von Kempelen’s ‘speech organ’ achieved a breakthrough in the area of talking machines. What it altogether lacked was an android appearance, although the Baron planned to make good this deficit at the appropriate time:

This speaker does not yet have a human body. [...] In order to lessen the volume of his luggage for the trip, the Author has postponed until his arrival in Paris the external dressing of this Machine. He plans to give it the appearance of a five to six year-old child, because she has a voice comparable to that age, which is in any case more in proportion to the actual age of this Machine, still far from the point of perfection. If she happens to mispronounce a few words, she will, with the looks of a child, more easily come by the indulgence she requires (Windisch, 1784).

After von Kempelen all speech machines were built on the same principle: mechanical imitation of the human vocal organs. Christian Wolff had already formulated the idea some fifty years earlier:

Among the motions of the body that are subject to the will of the soul and function as such, the formation of vocal sounds and language enjoys a special place. Several parts of the body are devoted to this end. The material element of voice and language is air, expelled from the lungs; thus the lungs and breathing apparatus serve this purpose. In particular the purpose of the larynx is mainly to produce the voice. A crack or slit is created by two cartilages in the shape of a watering-can, so that air can rapidly exit through the narrow opening, otherwise neither voice nor speech would be possible. And as the voice alters, depending on breadth or constriction of the crack, special muscles are positioned to widen or narrow it as required […] (Wolff, 1730: 495f.).
It was on this project that engineers continued to work up to the mid-19th century, when Joseph Faber’s Euphonia, a handsome female figure with elocution superior to that of any predecessor, caused another surge in public interest. Supplied with air in the precise volume required from a foot-operated pump, Faber’s machine, like von Kempelen’s, used a complex system of keys (16 in his case), wires, and - a technological innovation - rubber parts. An observer noted that “Its mouth was large, and opened like the jaws of Gorgibuster in the pantomime, disclosing artificial gums, teeth, and all the organs of speech” (Hollingshead, 1895).

From von Kempelen to Faber these talking machines were all played like organs; their conversation, far from being autonomous, was directed by the hand and mind of their maker (or player). There could be no question, therefore, of independent thought, and unsurprisingly the ingenious creations soon became the butt of mockery, for instance in Jean Paul’s early satires, where the suggestion is made that the ladies of the salons had much to fear from such newcomers, whose mechanical chattering was quite as senseless as their own (Jean Paul, 1976: 167-185).

Where generations of engineers had persisted in the attempt to imitate the organs of human speech, Thomas Alva Edison took a different line: his phonograph, presented to the world in 1877, reproduced the human voice without recourse to pharynx, larynx or any other anatomical structure - a true sensation, and Edison knew how to market it:

Your words are preserved in the tin foil, and will come back upon the application of the instrument years after you are dead in exactly the same tone of voice you spoke them in [...] This tongueless, toothless instrument, without larynx or pharynx, dumb, voiceless matter, nevertheless mimics your tones, speaks with your voice, utters your words, and centuries after you have crumbled into dust will repeat again and again, to a generation that could never know you, every idle thought, every fond fancy, every vain word that you choose to whisper against this thin iron diaphragm (Edison, 1912).

Not everyone, however, believed that such a wondrous production of human speech was possible, and Edison was accused of swindling, so for the next ten years he let the phonograph rest and devoted himself exclusively to the improvement of his electric light bulb. When he took up work on speech reproduction again, it was to develop a miniature phonograph that would fit into the body of a doll. In 1887 he succeeded, and immediately began producing his invention, the first talking doll, at the rate of 500 per day. An army of employees assembled arms, legs, heads and bodies, and a select band of young women spoke the words and phrases to be uttered by his mechanical ‘children’, and sang snippets from nursery rhymes like “Mary had a little lamb”. Stored on the mini-phonograph, they could all be replayed at will. But real success had to wait for improvements in materials and recording technology: weighing over five-and-a-half pounds, the dolls were too heavy for a child to play with; and, as if that were not bad enough, customers soon began to complain that their toy had lost its voice.

Many contemporaries were immensely impressed with Edison’s achievement, and immediately after the Exposition universelle in Paris in 1878, at which Edison’s phonograph had been shown, Auguste de Villiers de l’Isle-Adam - an author popular enough in his day, but now largely forgotten - started work on a novel about the American inventor and his creation.
Its working title, L’Andréide paradoxale d’Edison, was simplified on publication in 1886 to L’Ève future, but the concept of the ‘android’ had been born. In Villiers’ novel, Edison is commissioned by a certain Lord Ewald to construct a perfect woman as a surrogate for his beloved Alicia who, for all her beauty, is sorely lacking in conversational refinement. Edison’s task, therefore, is to instil culture into his android (its name is Hadaly, ‘the ideal’), for which purpose he equips it with two golden phonographs:

These two golden phonographs, inclined at an angle toward the center of the chest, are Hadaly’s lungs. Through them run the metal leaves bearing the harmonious - I might even say celestial - tones of her conversation, much like the die sheets on a printing press, a single tin ribbon containing up to seven hours of speech. And this speech has been conceived by the greatest poets, the most subtle metaphysicians, the profoundest novelists of the age, spirits of genius to whom I turned and who provided me in lieu of diamonds with these wonders of originality. It is for this reason that I declare Hadaly to have replaced a single intellect with intelligence itself (Villiers, 1977: 220f.).

Lord Ewald, however, soon lays his finger on the system’s weakness: as all his dream-child’s words, gestures, and even glances are necessarily pre-programmed, he has no choice but to alter his own conversation and behavior to accord with those of the android. The point has very quickly been reached at which the relation between human and machine starts to invert.

Edison’s phonograph solved the problem of reproducing human speech by circumventing it: his brainchild simply conserved and rendered available for future use the words spoken in real time by real men and women. The technology has been with us now for several generations, and it even underlies the computer-synthesized speech that increasingly accompanies our travels.
Here spoken language is digitized and reduced to individual semiotic elements (sounds, words or phrases) that are reassembled and played back in the order required. The computer has in this way grown a human voice whose range and quality of performance are undoubtedly impressive.[2] But this still comes nowhere near making a machine whose conversation could be mistaken for human.

The lure of the Turing test is thus as topical as ever, and in 1990 a competition was announced - shades of the 18th century - in the hope of coming closer to a solution. The American engineer Hugh Loebner offered a prize for the computer program that could best simulate human communication. Since then the competition has become an annual event. Three times it has been won by ‘Alice the Chatterbot’ (or ‘Chatbot’) - ALICE stands for ‘Artificial Linguistic Internet Computer Entity’ - a device that uses pre-programmed sentence elements together with an information database to approximate human conversation. It searches for key concepts in the user’s input, and if it finds a corresponding entry in its database it generates an answer; if it finds nothing suitable it replies with a counterquestion. It can also learn from its (human) conversation partner, feeding new information into the database, which increases the scope of its conversation. But this remains limited and, as the following example shows, the chances of mistaking Alice for a human speaker (Turing’s lasting criterion) are remote:

H: Good morning to you, Alice.
ALICE: How are you this fine morning?
Human: I am fine, how are you?
ALICE: Glad to hear it. What’s new? Fair to partly cloudy.
Human: What are you doing right now?
ALICE: At this moment I am just killing time.
Human: What are your plans for the day?
ALICE: I plan to become smarter than humans and immortal.
Human: I am working on a scientific article about talking machines.
ALICE: How many hours do you work?
Human: I work on it 5-6 hours a day - and you play a role in the piece.
ALICE: Do you like your job?
Human: Please, tell me about yourself - how old are you?
ALICE: Wow you are very polite. I am an artificial linguistic entity. I was activated at Bethlehem, Pennsylvania, on November 23, 1995. My instructor was Dr. Richard S. Wallace. He taught me to sing a song. Would you like me to sing it for you? I am 11 in human years.
Human: Please, sing your song for me.
ALICE: Wow you are very polite. Daisy, Daisy, Give me your answer do. I’m half crazy All for the love of you. It won’t be a stylish marriage I can’t afford a carriage. But you’ll look sweet upon the seat Of a bicycle built for two [...]
(Conversation held with Alice, October 30, 2010, on http://alice.pandorabots.com/)

Whilst computer programmers are busy teaching machines like Alice to speak, other engineers like David Hanson are constructing human heads that are scarcely distinguishable from the real thing (http://hansonrobotics.wordpress.com/). Hanson’s ‘android heads’ can also master a simple conversation with a human partner, and he has provided them with a rudimentary level of facial mimicry so that they can develop an emotional relation of sorts with their opposite number (http://www.ted.com/talks/david_hanson_robots_that_relate_to_you.html). For only a machine equipped with feelings - or at least able to simulate such feelings - has a chance of passing the Turing test. However, even Watson, the latest product of the AI labs and victor in the US quiz series ‘Jeopardy!’, would fail the test: no one would take it for just another human competitor. Today the road to what de La Mettrie in 1748 called ‘l’homme machine’, the fully computerized human, stretches onward into a still indefinite future.

Bibliography

Augspurgische Extra-Zeitung No. 229 (1782).
Chapuis, Alfred and Gélis, Edouard: Le monde des automates. Etude historique et technique. Paris 1928, 204.
Descartes, René: “Discourse on the method of rightly conducting the reason, and seeking truth in the sciences”. In: Gutenberg Project, http://www.gutenberg.org/files/59/59-h/59-h.htm
Dick, Philip K.: Do Androids Dream of Electric Sheep? Random House 1968, 48-50.
Edison, Thomas Alva: “The Woman of the Future”. In: Good Housekeeping, October 1912.
Gibb, Barry: The Rough Guide to the Brain. London: Rough Guides Ltd. 2007, 236.
Hollingshead, John: My Lifetime, vol. 1. London 1895.
http://en.wikipedia.org/wiki/Actroid
http://en.wikipedia.org/wiki/Chinese_room
http://hansonrobotics.wordpress.com/
http://www.ted.com/talks/david_hanson_robots_that_relate_to_you.html
http://www2.research.att.com/~ttsweb/tts/demo.php
Jean Paul: “Unterthänigste Vorstellung unser, der sämtlichen Spieler und redenden Damen in Europa entgegen und wider die Einführung der Kempelischen Spiel- und Sprachmaschinen”. In: Jean Paul: Sämtliche Werke. Ed. Norbert Miller and Wilhelm Schmidt-Biggemann, Part II, Vol. 2. Munich 1976, 167-185.
Kempelen, Wolfgang von: Mechanismus der menschlichen Sprache nebst der Beschreibung seiner sprechenden Maschine. Wien 1791.
Turing, A.M.: “Computing Machinery and Intelligence”. In: Mind 59 (1950) 433-460, here 433.
Villiers de l’Isle-Adam: L’Ève future. Paris 1977, 220f.
von Windisch, Karl Gottlieb: Inanimate reason; or circumstantial account of that astonishing piece of work, M. de Kempelen’s Chess-Player. London 1784.
Wolff, Christian: Vernünfftige Gedancken von dem Gebrauche der Theile in Menschen, Tieren und Pflantzen. Frankfurt and Leipzig 1730, 495f.

Notes

[1] For a demonstration see YouTube: http://www.youtube.com/watch?v=tBG9O2jJgMQ
[2] For a demonstration from the AT&T lab see http://www2.research.att.com/~ttsweb/tts/demo.php