Automated musical instruments date at least as far back as the beginning of humankind's written record. From a distance, many of these instruments seem like extremely extravagant, if not futile, ways of accomplishing simple tasks. Consider al-Jazari's clepsydrae (to be discussed in detail anon) that employ remarkably baroque means of marking the passage of time, including at times ensembles of musical automata to sound the hour - would a simple graduated cistern with a tap in the bottom not serve the same purpose? Why music, and why humanoid? What drives people to imbue machines with aesthetic beauty, or to build machines that produce aesthetic output, and why are such machines often built in the image of humans? Perhaps one reason amongst many is humankind's desire for musical companionship. Perhaps the timekeeping functionality of a clepsydra is only an excuse to have regular small interactions with musicians at home, in a context where it is not practical to use real human musicians. Because Kiki is also motivated by the idea of robotic musical companionship, it is useful to look more closely at this work.
Moreover, the drive to create creative machines is not futile. Closer inspection reveals that the study of musical robots and automata has always held a unique role in the advancement of technology, reflecting the close relationship between technology and society. For example, the water-pressure-regulating mechanism that was originally designed by Apollonius of Perga to solve the engineering challenge of continuously driving air through a flute was eventually expanded by al-Jazari into a variety of similar mechanisms, in part for his clepsydrae. One of these mechanisms, once society decided to combat sanitation issues with indoor plumbing, became the flushing mechanism still used in modern toilets. Similarly, Vaucanson, who built one of the first automatic looms, which famously inspired the ire of the Luddites, also famously built two of history's most sophisticated musical automata. These automata were widely admired and marveled at, despite using technology similar to that of the contested loom; this perhaps helped in some small way to improve society's overall feelings about that technology in this critical time at the start of the industrial revolution. Today, the current state of technology has made interactive musical robots possible for the first time in history. This has opened up a wide variety of potential applications, such as music education, physical therapy, improvisatory performance, new models of human-machine interaction, control theory, biomimicry, hard AI problems pertaining to creativity, and the psychology of affect. Can interactive musical robots help music students acquire certain skills more effectively than practicing alone? Can these robots help elderly people maintain their cognitive skills as they age? Would benefits need to be limited to those with prior musical training? Can composers or choreographers create new forms of artistic expression using musical robots?
Can everyday noisemaking objects that are typically considered a nuisance, like beeping microwaves and cellphones, be improved with techniques developed for musical robots such as timbrally rich sounds and context-awareness? Can artificial appendages built to meet the timing and precision requirements of playing an instrument be fruitfully employed in prostheses or manufacturing? Can models of human-robot interaction developed in musical contexts improve how we interact with computers more generally?
Overviews of some important modern musical robots have been compiled in [7] and [18]. The works presented therein may be considered previous work to this dissertation. What these modern overviews do not establish as well as a deep historical one is humankind's long-standing desire for musical companionship, and the long-term relationship between musical machines and technology more generally. So rather than repeating the modern studies, this chapter presents a very long view into the history of the field.
The first attempt to mechanize a musical instrument is no doubt lost to history. By the 4th century BCE there already existed sophisticated mechanically wind-fed organs in advanced stages of development, and at that point the practice was probably centuries old [19]; the earliest ones probably operated by means of mechanical bellows. The ancient history of mechanical instruments is probably even older still, if we consider predecessors to the organ, such as the bagpipes or aeolian harp, to be `mechanically' wind-fed. In the first few centuries BCE, the Greeks laid the foundations for the fields of Hydrostatics, Pneumatics, and Hydraulics, which together provided more sophisticated means of supplying energy to a wide variety of mechanical devices, including musical automata. One very famous example from the 1st century AD is the altar organ in Section 77 of Hero's Pneumatica [20], which is fed air via a windmill-driven piston, although a human would then presumably finger the organ. In the middle of the 1st millennium AD, scholars in the Middle East (such as the Banu Musa in Baghdad) began importing and translating these and other scientific manuscripts from all over the world. In the following centuries, while the Dark Ages consumed Europe, the techniques of building all manner of mechanical devices, including mechanical musical instruments, flourished in the Middle East. A complete history of the organ, or of mechanical instruments generally, is beyond the scope of this chapter. However, even in these very early times, there developed a tradition of building humanoid musical automata, which, although functionally identical to non-humanoid mechanical instruments, nonetheless betray a different way of thinking about such devices. These humanoids are worthy of separate consideration, and shall form the bulk of the following discussion.
In the 3rd century BCE, Archimedes, the father of hydrostatics, invented the first known humanoid musical automaton. Although his original treatise does not survive, his ideas are preserved in later Arabic translations [21][22]. This treatise describes a very large and elaborate clepsydra (water clock). The basic premise of a water clock is that water will drain from a tap in the bottom of a cistern at an even rate until the cistern is empty, and this can be used to mark the passage of time. In principle, time could be measured with a simple graduated cistern. In practice, however, the kinetic energy in the water falling from the cistern was used to power complex mechanical devices which show the passage of time. In Archimedes' clock the falling water drives a water-wheel which slowly moves a figurine of an executioner with a sword forward along a track, past figurines of several fettered prisoners with hinged heads. Every canonical hour, the executioner knocks the head off one of the prisoners, so the time may be read by counting the beheaded figurines. This clock contains many other elaborate devices, including that shown in Figure 2 (a). This is a humanoid figurine holding a Byzantine whistle, which is connected via a pipe to a large vessel with two chambers. The upper chamber is connected to the lower chamber via a cylindrical siphon. After water falls through the other mechanisms in the clock, it is collected in the top chamber of the cistern. Once the water in that chamber has reached the top of the siphon, which will take 12 canonical hours, the water will be siphoned into the lower chamber. As that happens, the air in the lower chamber is expelled out through the tube and, subsequently, the whistle. This results in a loud whistling sound that, according to the author, can be heard from a `considerable distance'.
The whistle signals that the clock's main water reservoir is empty, and must be manually refilled with the water which is now at the bottom of the flute cistern.
It is worth noting that this treatise also contains a description of a fake tree with mechanical birds that have Byzantine whistles hidden inside them, which operate on the same principle. Mechanical birds have their own history, which perhaps culminates in the Parisian music-boxes of the 19th century, although they will not be discussed further in the present history.
In the 1st century AD, Hero of Alexandria wrote Pneumatica, in which he describes many wind and water powered devices [23]. This treatise contains several automatic singing birds and musical devices, operating on principles similar to those described by Archimedes. One notable example is described in Section 49, depicted here in Figure 2 (b). A figurine holding a trumpet stands on top of a hemispherical chamber within a sealed pedestal. A pipe connects the trumpet to the interior of the hemisphere. The hemispherical chamber has many small holes in the bottom. The pedestal (and consequently the hemisphere) is partially filled with water. A person can expel the water from the hemisphere (into the pedestal) by blowing into the bell of the trumpet. When the person removes their breath, the water will flood back into the hemisphere through the holes in the bottom, thereby expelling air through the trumpet.
One problem with both of these models is that once all of the water has flowed from one chamber into the other, the water must be manually transferred back in order for the android to play again.
This problem was solved in the late 3rd century BCE by Apollonius of Perga, in a treatise that survives as a copy in the same manuscript as Archimedes' [21] and is explained in [19]. He again describes a mechanism for feeding air into a flute. The mechanism is driven by water falling from a cistern, which is filled by a stream so it is continually full. Below the cistern is a waterwheel driving a system of gears and levers, constituting a hydraulic pump. The levers open and close valves, causing water to flow alternately into two chambers below. While one chamber fills with water, its air is expelled out into the flute through a no-return valve; at the same time, the absence of water falling into the other chamber trips a lever which causes it to drain. Alternating the chambers in this way allows air to be fed continuously into the flute with no further human intervention. Part of his design is shown in Figure 3.
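The alternation of the two chambers can be illustrated with a small numerical sketch. It is not a reconstruction of Apollonius' design: the chamber capacity, the inflow and drain rates, and the names `Chamber` and `simulate` are all assumptions chosen for illustration. The sketch shows only that air flows without interruption as long as the idle chamber drains before it is needed again.

```python
from dataclasses import dataclass


@dataclass
class Chamber:
    capacity: float     # chamber volume, arbitrary units (assumed value)
    water: float = 0.0  # current volume of water in the chamber


def simulate(steps, capacity=10.0, inflow=1.0, drain=2.0):
    """Return the volume of air pushed toward the flute at each time step.

    All rates are hypothetical; the source gives no dimensions.
    """
    chambers = [Chamber(capacity), Chamber(capacity)]
    filling = 0  # index of the chamber currently receiving water
    airflow = []
    for _ in range(steps):
        active = chambers[filling]
        idle = chambers[1 - filling]
        # Water entering the active chamber displaces an equal volume of
        # air, which leaves through the no-return valve toward the flute.
        added = min(inflow, active.capacity - active.water)
        active.water += added
        airflow.append(added)
        # Meanwhile the other chamber drains.
        idle.water = max(0.0, idle.water - drain)
        # When the active chamber is full, the roles swap.
        if active.water >= active.capacity:
            filling = 1 - filling
    return airflow


continuous = simulate(50)            # idle chamber drains in time
stalled = simulate(30, drain=0.0)    # no draining: air stops once both fill
```

With draining disabled, the sketch behaves like the earlier single-use devices: air stops flowing once both chambers are full, which corresponds to the point at which the water had to be transferred back by hand.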
The instruments discussed in the previous sections feed air into the flute but do not finger it. An obscure work entitled `The Instrument which Plays by Itself' by the famous 9th century scientists Muhammad, Ahmad, and al-Hasan, sons of Musa, describes a machine that does this, and a translation is given in [19].1 Air is continuously fed into the flute using a mechanism similar to that of Apollonius, but which also contains a means of regulating air pressure by expelling water from whichever chamber is currently filling when the pressure becomes too great (i.e. when all of the flute's finger-holes are closed). The finger-holes of the flute are covered with little hinged flaps which plug the holes, but which can be raised by levers. The levers are connected to a notched barrel, similar in construction to those found in barrel organs or modern music boxes. The barrel spins on account of sharing an axle with a water-wheel, and when a raised notch on the barrel comes into contact with a lever, it causes the flap on the corresponding finger-hole to be opened. The authors suggest making the barrel of large enough diameter to contain several repetitions of one melody on one half of its rotation, and several repetitions of another melody on the other half. They also suggest that the barrel can be twice as long as needed to provide a second set of melodies. An auxiliary mechanism controls the flow of water onto the water-wheel such that the tempo of the music continually speeds up and slows down to provide musical interest. The entire machine, which is roughly 150 cm (5 feet) tall, is hidden within the body of a humanoid figurine. The authors also suggest that a lute or psaltery player can be made in the same way. It is unclear whether they built this, but they do provide some details about how it would be tuned and how it could be made to play in unison with the flute.
As an interesting side note, the authors also describe a method of recording the movement of the android's fingers by engraving into a large wax-coated cylinder, which can then be used to make new melody barrels (through an unclear process). This is an interesting predecessor to early 20th century audio recording techniques.
The Banu Musa automaton is often cited (e.g. on the internet) as being the first programmable machine. This assertion appears to rest on the misunderstanding that the barrel which drives the fingers is fitted with movable pegs. In the text, the authors describe the barrel as being smooth and having notched rings (one for each finger) fitted over it. My reading of the text is that the rings and barrel are permanently affixed to the device, and the authors make no mention of their being manipulated after construction. On the other hand, this does appear to be the first recorded use of a pegged cylinder, which subsequently becomes the standard controlling mechanism in musical automata. This also appears to be the first automaton that is capable of playing complete pieces of music, as opposed to single tones. It should also be noted that another book by the Banu Musa called `The Book of Ingenious Devices' [24] contains no mention of musical automata (aside from a whistle that sounds when its base is dunked in water), although it seems to be commonly confused with the similarly-named book by al-Jazari, to be discussed presently.
In The Book of Knowledge of Ingenious Mechanical Devices [27], al-Jazari describes several musical automata, including a number of percussion androids that seem to be the earliest described mechanical percussionists. Chapter 2 of Category I (the Water Clock of the Drummers) describes a large and elaborate clepsydra (water clock) that contains, amongst other things, two cymbalists, two drummers with generic drums slung over their shoulders and played with curved sticks, and one seated drummer with two kettle-drums and curved sticks. The main illustration is shown in Figure 4 (a).
On every hour, "the musicians perform with a clamorous sound which is heard from afar". The two generic drummers and two cymbalists are wooden, and each has one hinged arm, a hollow body, and one hollow leg. A copper cable is attached to the moveable part of the arm, and is threaded through the interior of the body and leg into a hidden chamber below. Pulling on the cable causes the arm to rise, and releasing it causes it to fall and strike the drum, or the cymbals to clash. The kettle-drummer is similar except that both arms are moveable. Within the hidden chamber below, the copper cables are tied to levers. A water-wheel spins an axle that has pegs protruding from it, which press down upon and subsequently release the levers, causing the drummers to play. The construction of the drummers is shown in Figure 4 (b). The pegs are arranged in a pattern that is used in all of this author's percussion instruments: "Since two of the three ends of the pegs are close together, the fall of the drumsticks on the drum is varied - [first] two raps then one rap - and likewise with the cymbal". Every hour, a sufficient quantity of water has drained out of the main cistern to cause a bucket to tip over (via a Rube Goldberg-style contraption involving a falcon and a marble), which spills its water onto the water-wheel that drives the musicians.
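The rhythm produced by such a peg arrangement can be worked out with a short sketch. The peg angles and rotation speed below are hypothetical, since the text gives only the qualitative layout (two pegs close together and a third apart from them), and `strike_times` is an illustrative name rather than anything drawn from the source.

```python
def strike_times(peg_angles_deg, revolutions, seconds_per_rev):
    """Times (in seconds) at which each peg releases the lever.

    Assumes the axle turns at a constant rate and that every peg
    produces exactly one stroke per revolution.
    """
    times = []
    for rev in range(revolutions):
        for angle in sorted(peg_angles_deg):
            times.append((rev + angle / 360.0) * seconds_per_rev)
    return times


# Hypothetical layout: two pegs 30 degrees apart, a third opposite them.
PEGS = [0.0, 30.0, 180.0]

times = strike_times(PEGS, revolutions=2, seconds_per_rev=4.0)
# Gaps between successive strokes: one short gap, then two long ones.
gaps = [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
print(gaps)
```

Under these assumptions each revolution yields a short gap followed by two longer ones, that is, two raps in quick succession and then a single rap. Because the pattern is fixed by where the pegs sit on the axle, it repeats identically on every revolution.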
This clock also has two figurines holding trumpets. However, the trumpets are props which do not produce sound, and the figurines share a single sound-producing mechanism which is hidden underfoot. The mechanism is essentially an ad-hoc, copper, jar-shaped whistle (with no ball) whose construction is described in detail in Chapter 1 of Category I. Air is supplied to it using the same principle used by Archimedes, as is depicted in Figure 4 (b). This artifice is used in all of al-Jazari's `flute' and `trumpet' automata.
Chapter 1 of Category I describes an even more elaborate water clock that has, amongst other things, a cymbalist, generic-drummer, kettle-drummer, and two trumpeters whose construction and operation are similar to the above. Chapter 3 of Category II describes an amusement device for drinking parties that serves wine and plays music every 20 minutes. It has a flutist, tambourine player, lutenist, and drummer (playing an odd two-sided drum held in the lap). These are made of jointed copper. Their operation is similar to the above, with the following exceptions: the tambourine player's arm has two joints, and is connected somewhat differently; the interior of the flute player's arm contains a whistle with a ball, of slightly different construction than elsewhere; and the entire contraption is run on wine instead of water (those were good times)! The lute player has a moveable arm, but it is unclear whether it ever actually makes contact with the lute, or if it just moves the arm for show. The following chapter (Chapter 2 of Category IV) describes a boat which contains two tambourine players, a flautist, and a harpist. It is said of the harpist, "Both her hands are constructed so as to move, with their fingers over the strings but not touching them", indicating that this device was decorative but did not make sound.
This last device (Chapter 2 of Category IV, a mechanical boat with four musical automata for a drinking party) is particularly famous, and many wild and false claims have been made about it, so it is worth clarifying the record on a few points. First, the original text does not support the claim that the drummer was programmable, as it reads, "To the axle a short peg is fitted... A single peg on the axle is not sufficient, so two pegs are fitted close to each other opposite this peg, so that the movement of the hand gives two beats and one beat". Nowhere does the author state that the pegs are configurable. Moreover, al-Jazari's text is clear, technically precise, and comprehensive, and it does not mention facial movements, or any body actions in the musicians beyond the simple movement of the arms.
The Book of Knowledge of Ingenious Mechanical Devices also contains several chapters on perpetual flutes, in the manner of Apollonius, whom al-Jazari cites by name. These chapters provide several alternative mechanisms for causing water to alternately fill and drain from two chambers, all driving air through a `jar' whistle of a single pitch.
This period saw an inflorescence of automated musical instruments and mechanical organs; a detailed overview is given in [29]. This history perhaps begins with Athanasius Kircher's eccentric 1650 treatise MVSVRGIAE VNIVERSALIS [28], which contains a chapter (in Liber IX, Pars V, pp. 308 ff.) on the construction of Omnis Generis Instrumentis Musicis Automatis. This includes a novel hydraulic barrel-organ, which is somewhat backwards-looking for its use of falling water to supply energy and its similarities to the Banu Musa flautist, and forward-looking in its use of a pegged barrel to control a keyboard. The same chapter also includes an automated carillon made by similar means but driven by weights suspended on ropes and pulleys, and a variety of other things. These devices are shown in Figure 5. Around this time, horologists in the Black Forest began to incorporate similar mechanisms into increasingly elaborate clocks, which began to attract the attention of composers such as Haydn, who adapted 32 of his previous works explicitly for flute-clock (H.XIX 1 - 32). By the very end of the 18th century (extending into the 19th), such devices started becoming very elaborate and attempted to recreate entire wind orchestras. Notable are the so-called Panharmonicons, made first by Maelzel (premiered in Paris in 1807) [30] and immediately copied by others such as Gurk (premiered in Germany in 1810) [31]. Maelzel's contained 7 ranks of pipes imitating woodwind and brass instruments, and several percussion instruments. In 1812, Beethoven wrote a `Battle' symphony for the Panharmonicon (the subsequently orchestrated version is his Opus 91) in exchange for some hearing devices made by Maelzel [32]. This appears to be the first piece composed specifically for a machine, although it subsequently became the subject of a legal battle between Beethoven and Maelzel.
Although Maelzel built three Panharmonicons, they evidently failed to attract much attention and were eventually destroyed. During this period, a number of remarkable humanoid musical androids were also built, to be discussed anon.
In the 1730s, Jacques de Vaucanson, who invented an important predecessor to the Jacquard Loom, and who is perhaps most famous for his defecating duck automaton, developed two highly sophisticated musical automata. These are shown in Figure 6. One plays a standard orchestral flute, and the other plays tabor and pipe (a small drum played with one hand and three-hole fife played with the other). He presented a technical description of the flute player to the Royal Academy of Sciences in 1738, and described the tabor player in substantially less detail in a missive; both appear in [33], and each is reprinted in [34] (s.v. `Androide' and `Automate', respectively) and in many other places subsequently. The flute player appears to be the first automaton in history to play an actual musical instrument that was built to be played by a human. Thus, Vaucanson's description presents a thorough analysis of the mechanics of flute playing from an engineering perspective (although contemporary flautists disagreed with the finer points of his analysis [35], Chapter IV Section 14). The automaton itself was biologically-inspired, and informed by this analysis. The embouchure had four degrees of freedom plus a tongue for musical articulation; the fingers were covered in leather to imitate the softness of human skin.2 The automaton had three sets of bellows, each driven by different weights to produce different strengths of breath, as necessary to produce notes in different registers and different dynamics. The mechanisms controlling the lips, fingers, and bellows' valves were connected via cables to levers that rested on a pegged cylinder, similar to that in the Banu Musa flautist or modern music boxes or barrel organs; the entire device was presumably wound up via a crank in the back of the plinth, and once set in motion it would play the music engraved on the cylinder. A good diagrammatic reconstruction of this mechanism can be found on page 81 in [37].
Less is known about the tabor and pipe player, but Vaucanson reports that a much greater range of air pressure is needed to produce the different pitches in the pipe. He also notes that in order to sound good, every note needs to be articulated individually by the tongue, which human players do not achieve well in fast passages. He reports that his automaton outperforms humans in this task, and this appears to have been the first instance in history of a machine with the ability to produce music that would be too difficult for a human to produce. It was able to strike the drum with a variety of velocities, and play a variety of strokes and rolls, but no further information is provided on how this worked.
In the 1760s and '70s, the Swiss clockmaker Pierre Jacquet-Droz built several remarkable and very famous automata, amongst which is a figurine that plays a small reed organ; the organ is functionally separate from the figurine, and the force of the figurine's hands depresses the keys. In 1784, Peter Kintzing and David Roentgen built a similar automaton that plays hammer dulcimer, which they subsequently presented to Queen Marie Antoinette. Both automata still exist and function, and videos of them are widely available.3 4 In roughly the 1820s, another Swiss clockmaker, Henri Maillardet, created several automata which were strongly inspired by those of Jacquet-Droz. Amongst them was an organist automaton, which is described in [38] s.v. `Androides' (vol II p. 61) and in an exhibition advertisement reprinted in [31].
In 1837, an account of a remarkable violinist automaton appeared in [39]. The life-size automaton, built by one Mr. Marreppe, was evidently exhibited at the Royal Conservatory in Paris. It played virtuosic music that was compared to that of Paganini and Ole Bull, owing to a range of extended techniques, very fast playing, and large dynamic variation. It played both solo and with an orchestra, and as such it seems to have been the first automaton to, in a limited sense, interact with human musicians. It was also reported that the automaton started playing on the conductor's cue, and could `obey the direction of the conductor'. Unfortunately, all further information about this automaton seems to be lost.
In 1810, the German Friedrich Kaufmann, who eventually became an orchestrion manufacturer, built a musical automaton that holds a trumpet. Within the automaton is a 12-note reed organ, the air from which exits through the trumpet, creating a trumpet-like sound. An article in the August 1950 edition of Mechanix Illustrated erroneously stated that this was built in 1910 and that it was the first robot in history. This automaton still exists. 5
It is unfortunate that Innocenzo Manzetti is not better known in the English-speaking world. He is the unrecognized inventor of the telephone and steam-powered automobile, amongst other things [40]. In the 1840s, he began building the flute-playing automaton pictured in Figure 7 (b) [41][42]. By 1849 a prototype was complete; it worked like a barrel-organ, and it had a metal cylinder with raised bumps hidden in its abdomen. The mechanism was wound up like a clock, and as the barrel rotated, the bumps impinged on levers attached to cables which made the fingers, lips, tongue, and eyes move. It presumably had a mechanical bellows hidden in the chair on which it was seated, and could provide four gradations of air pressure. It could play about 20 pieces by this method. As such, it was structurally similar to Vaucanson's automata. However, in the following decades, Manzetti made a number of important and novel improvements. First, he wanted it to be capable of playing arbitrary melodies. Therefore, during the 1850s, he replaced the metal linkages with rubber-like pneumatic tubes which he built [43]. He also built a special harmonium (pump reed-organ), which he connected to the automaton via pneumatic tubes. Thus, whatever notes a keyboardist depressed on the keyboard would be sounded by the flute automaton. This was probably the first pneumatically actuated automaton, as well as the first distinct use of a musical controller. These improvements also remove the flautist from a strict definition of `automaton'; although it still carries out complex actions automatically (e.g. fingering and tonguing), it does so only in response to a human's actions. This represents a marked departure from previous musical androids, as it puts the focus on the interaction between the human and machine.
An account from 1865 [44] reports that assistants operated the bellows, although in 1866 Manzetti built a battery-powered air pump, perhaps the first of its kind, making this the first known electrically-powered automaton. The flautist was also designed such that when the bellows were started, the tubes in its knees and arms would inflate, causing it to rise from its chair and bring the instrument to its lips, and its porcelain eyes would rove around. Manzetti also expanded the idea of the musical controller. The same 1865 account reports that he ran a tube from his harmonium in his studio (on Giocondo St. in Aosta), out the window and into the Aosta Cathedral across the street, and devised a mechanism whereby he could use his harmonium to control the organ in the cathedral. This may be the first example of telematic musical performance.
These works, taken as a whole, are a celebration of humankind's long-standing engagement with aesthetic machines, and are explorations of deeply cybernetic questions pertaining to artificial creativity. But cybernetic questions alone do not explain the existence of these works. Given a musical machine, why make it humanoid? Why go to the trouble to make it rise from its seat and roll its eyes around as it plays? Even if it is just a parlour trick, why would anyone be impressed by it? Perhaps the answer is that people are fascinated by the idea of robotic musical companionship; that they could have a musical machine that not only plays music, but also substitutes for a human musician when the latter is not practical.