The Voice in the Soot: Humanity's Earliest Known Recording

Image extracted from the video Let There Be Sound, Part 4 by First Sounds / CC BY.

This is the first known dated recording of the human voice, recorded in 1860 but only played back for the first time in 2008:

Au Clair de la Lune, 1860

Au Clair de la Lune, March 2008 release by First Sounds / CC BY.

Creepy, isn’t it? It reminds me of photographs from around the same era, where people sometimes had a zombie-like quality about their faces. The recording is a line from the French folk song Au Clair de la Lune – sung, it seems, by a woman or child. The history of its recording, and the process that modern scientists went through to play it back, is an intriguing story that is the focus of this post.

Today we take it for granted that we can enjoy our favourite music or videos anytime we wish. But people living 200 years ago had no such luck – if you wanted to listen to music, you had to play it yourself or go to a concert. French printer and tinkerer Édouard-Léon Scott de Martinville (1817-1879) felt particularly strongly about this, lamenting the fact that the great performers of the stage would die without leaving behind a single trace of their gift to future generations. His fascination with the idea of being able to capture fleeting sounds in a permanent way led him to invent the phonautograph.¹

Édouard-Léon Scott de Martinville from First Sounds / CC BY.

How the Phonautograph Worked

In the previous post we saw how sound waves are formed from the back-and-forth vibration of air molecules, and are detected by our ear when they cause the eardrum to vibrate in the same pattern. If we could somehow record the pattern of this vibration, we would have a recording of the sound.

Drawing of the phonautograph. Image of Phonautograph by First Sounds / CC BY.

Scott modelled his machine after the human ear, using a horn to collect the sound (imitating the ear canal) and a parchment membrane to play the role of the eardrum. To the membrane he attached a stylus made of pig bristle, positioned so that it touched a piece of soot-coated paper wrapped around a cylinder. The cylinder was turned by hand so that the bristle would trace a line in the soot. In the presence of sound waves, the vibrations of the membrane were transferred to the stylus, which vibrated perpendicular to the direction of motion of the paper. The stylus thus traced out the pattern of the vibrations over time, producing a pictorial record of the sound.

Scott was interested in the idea of being able to read the sounds drawn on paper, in order to see the nuances and expressiveness of great oral performances. It seems that the possibility of playing back his recordings never occurred to him. It would be left to inventors Charles Cros and Thomas Edison to develop separate methods of doing so seventeen years later, though Edison beat Cros to world fame. Edison’s phonograph recorded sound in much the same way as Scott’s machine, except that the trace was indented in tin foil by a needle, instead of drawn in soot. The more robust tin foil could be played back by reversing the process – a needle would follow the recorded trace, and the sound would be reproduced by a membrane connected to the needle.

Edison’s phonograph took the world by storm in 1877, but Scott was left in obscurity, bitterly bemoaning the failure of Edison and of the media to acknowledge his contribution. His soot traces, or phonautograms, were left to languish in the files of the Paris patent office for a century and a half – until they were re-discovered in 2007 by audio historians.

Resurrecting the Voice

How do you take a fragile line traced out in soot and play it back? In principle, we could scan the image into a computer, convert the up-and-down undulations of the trace into a series of signal values, and have those values drive the membrane of a loudspeaker so that it vibrates according to the trace pattern. (Even this turned out to be less simple than it sounds, due to the imperfect nature of the trace.²)
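To make the idea concrete, here is a minimal sketch in Python of that last step. It assumes the hard part is already done, namely that the scanned trace has been reduced to a list of stylus displacements measured at equally spaced points along the paper. The function names and the sample rate in the usage note are my own illustrative choices, not anything taken from the researchers' actual workflow.

```python
import io
import struct
import wave

def trace_to_pcm16(displacements):
    """Centre the trace on zero and scale it to the 16-bit PCM range."""
    if not displacements:
        raise ValueError("empty trace")
    mean = sum(displacements) / len(displacements)
    centred = [d - mean for d in displacements]
    # Guard against an all-flat trace, which would otherwise divide by zero.
    peak = max(abs(c) for c in centred) or 1.0
    return [int(round(32767 * c / peak)) for c in centred]

def trace_to_wav_bytes(displacements, sample_rate):
    """Wrap the scaled trace in a mono, 16-bit WAV container held in memory."""
    samples = trace_to_pcm16(displacements)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)           # one trace, one channel
        w.setsampwidth(2)           # 2 bytes = 16 bits per sample
        w.setframerate(sample_rate) # assumed paper positions per second
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()
```

Calling `trace_to_wav_bytes(trace, 8000)` and writing the result to a file would give something a media player can open. Note that the sample rate is exactly the quantity we do not yet know: it encodes how many points along the paper correspond to one second of sound.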

But how fast should one do the playback – that is, how much time does a segment of a particular length on the phonautogram correspond to? This problem is further compounded by the fact that Scott’s cylinder was cranked by hand, so that its rotational speed was not constant. It turns out that this variation in speed causes large fluctuations in frequency, enough to make a recorded melody unintelligible. This is what the Au Clair de la Lune phonautogram sounded like when played straight back at constant speed, before any processing:

Au Clair de la Lune, Wobbly Playback

Audio extracted from Humanity’s First Recordings of its Own Voice, Act 7 by First Sounds / CC BY.

I rather think the fluctuating pitch gives it a bit of an alien-music-type feel…

Fortunately, Scott had adopted the practice of simultaneously recording the trace of a tuning fork next to the trace of the sound he was interested in. If you take another look at the close-up image of the phonautogram, you’ll see that the traces come in pairs – the lower trace in each pair is the constant vibration of the tuning fork, while the upper trace is the singing voice (evidently Scott had also invented stereo recording!). He was also kind enough to have noted on the phonautogram the frequency of the fork’s vibration – 500 simple vibrations per second.

The research team could thus correct for the wobbliness by varying the speed of playback so that the frequency of the reference trace stayed at a constant 500 Hz. This is the result:

Au Clair de la Lune, Even Playback

Audio extracted from Humanity’s First Recordings of its Own Voice, Act 7 by First Sounds / CC BY.
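The correction can be sketched in a few lines of Python. This is only my own illustration of the principle, not the researchers' actual software: it treats each upward zero crossing of the fork trace as one tick of a clock, assumes the paper moved at a steady speed within each tick, and resamples the voice trace onto an even time grid. All of the names here are hypothetical, and `make_wobbly_pair` simply fabricates synthetic test traces (a steady tone plus a fork) as if drawn on unevenly cranked paper.

```python
import math

def upward_crossings(trace):
    """Fractional sample positions where the trace crosses zero going upward."""
    positions = []
    for i in range(len(trace) - 1):
        a, b = trace[i], trace[i + 1]
        if a < 0.0 <= b:
            positions.append(i + (-a) / (b - a))  # linear interpolation
    return positions

def dewobble(voice, fork, fork_hz, out_rate):
    """Resample the voice trace onto an even time grid, using the fork as a clock."""
    if len(voice) != len(fork):
        raise ValueError("voice and fork traces must have the same length")
    ticks = upward_crossings(fork)  # paper position of each completed fork cycle
    if len(ticks) < 2:
        raise ValueError("need at least two fork cycles")
    n_out = int((len(ticks) - 1) * out_rate / fork_hz)
    out = []
    for n in range(n_out):
        cycles = n * fork_hz / out_rate  # fork cycles elapsed at output sample n
        k = min(int(cycles), len(ticks) - 2)
        # Position along the paper, assuming steady speed within one fork cycle.
        pos = ticks[k] + (cycles - k) * (ticks[k + 1] - ticks[k])
        i = min(int(pos), len(voice) - 2)
        w = pos - i
        out.append((1 - w) * voice[i] + w * voice[i + 1])
    return out

def make_wobbly_pair(n=16000, voice_hz=440.0, fork_hz=250.0):
    """Synthetic data: a steady tone and a fork trace on unevenly cranked paper."""
    voice, fork = [], []
    for x in range(n):
        # Time at which paper position x passed under the stylus.
        t = x / 8000.0 + 0.02 * math.sin(2 * math.pi * x / 4000.0)
        voice.append(math.sin(2 * math.pi * voice_hz * t))
        fork.append(math.sin(2 * math.pi * fork_hz * t))
    return voice, fork
```

With `voice, fork = make_wobbly_pair()`, the raw `voice` list has cycles of noticeably different lengths, while `dewobble(voice, fork, 250.0, 8000)` should come out with cycles of nearly equal length. Everything hinges on the value passed as `fork_hz`: give it the wrong number and the whole recording is sped up or slowed down by the same factor.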

You can now recognise the melody of the folk tune! But the soot recording is hardly high fidelity: it is full of pops and clicks. Modern audio processing techniques allowed the researchers to remove the clicks to some extent, and to apply equalisation filters to get rid of unwanted noise, giving the clip we heard at the beginning of this post:

Au Clair de la Lune, Cleaned Up

Au Clair de la Lune, March 2008 release by First Sounds / CC BY.
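I don't know exactly which tools the researchers used for the clean-up, but the general idea behind click removal can be illustrated with a simple median filter. The sketch below is my own toy version under that assumption: a sample is replaced by the median of its neighbourhood only when it sticks out from that median by more than a threshold. The function name, window size and threshold are arbitrary illustrative choices.

```python
from statistics import median

def declick(samples, window=5, threshold=0.5):
    """Replace isolated outliers (clicks) with the median of their neighbourhood."""
    half = window // 2
    out = list(samples)
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        local = median(samples[lo:hi])
        # Only touch samples that stand far away from their neighbours.
        if abs(samples[i] - local) > threshold:
            out[i] = local
    return out
```

For example, `declick([0.0, 0.1, 0.2, 5.0, 0.4, 0.5, 0.6])` leaves the gentle ramp alone and pulls the lone spike of 5.0 back down to 0.4. Real restoration software is far more sophisticated, but the principle of spotting and patching short outliers is the same.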

David Giovannoni, one of the researchers, described his reaction to hearing the voice sing to us from 148 years ago: “When I first heard the recording as you hear it … it was magical, so ethereal… The fact is it’s recorded in smoke. The voice is coming out from behind this screen of aural smoke.”³ The haunting recording, and the image it conjured up of an unknown woman singing to us through the veil of time, quickly captured the imagination of the public.

But the story doesn’t end here. A few months after they released this recording, the team tried to play back another phonautogram from the same collection, a recording of spoken Italian. Played back on the assumption that the tuning fork had vibrated at 500 Hz, it sounded like the Chipmunks talking! When they tried it at half that speed, though, it sounded more like normal speech⁴:

Aminta Opening Lines, 500 Hz Playback

Aminta Opening Lines, 250 Hz Playback

Opening Lines from Tasso’s Aminta, played back at 500 Hz (audio extracted from Humanity’s First Recordings of its Own Voice, Act 7) and at 250 Hz by First Sounds / CC BY.

It turns out that when people wrote “500 simple vibrations per second” in the nineteenth century, they meant 250 Hz in modern terminology. What happens when the Au Clair de la Lune recording is played back at the correct speed?

Au Clair de la Lune, Correct Speed

Au Clair de la Lune, May 2010 release by First Sounds / CC BY.

Evidently this is the voice of a man, singing the song very slowly. In fact, Scott implied in his notes accompanying the Italian recording that he was the one speaking – it is therefore likely that he was the singer of the song as well. It doesn’t conjure up an image quite as romantic as that of an unknown girl singing to us from long ago – but perhaps it is rather more fitting that the first ever recording of a human voice should be the voice of the inventor himself!

The researchers had in fact already known of the ambiguity in the phrase “500 simple vibrations per second” when they first published the recording. They had settled on the faster playback speed at first because it had sounded the most right to them, perhaps due to the unusually slow tempo of Scott’s singing. In my opinion, though, the song recording played at the slower speed sounds much less creepy than the original release. The otherworldly quality of the latter might be due to the unnatural formant shift caused by the incorrect playing speed – the same reason that speech sounds like the Chipmunks when played back twice as fast as normal. I’ll talk about formants in a future post on how the human voice works…
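The factor of two at the heart of this story is easy to work through. Assuming, as described above, that a nineteenth-century "simple vibration" counts each half-swing of the fork, so that two of them make one modern cycle, a short Python sketch shows what switching the calibration from 500 Hz to 250 Hz does to the playback. The function names are my own.

```python
import math

def simple_vibrations_to_hz(simple_per_second):
    """Two 'simple vibrations' (half-swings) make one full cycle."""
    return simple_per_second / 2.0

def playback_change(assumed_hz, true_hz):
    """Effect of recalibrating playback from the assumed fork frequency to the true one."""
    ratio = true_hz / assumed_hz
    return {
        "pitch_ratio": ratio,           # every frequency is multiplied by this
        "duration_ratio": 1.0 / ratio,  # the clip lasts this many times longer
        "octaves": math.log2(ratio),    # pitch shift in octaves
    }
```

Here `simple_vibrations_to_hz(500)` gives 250.0, and `playback_change(500, 250.0)` reports a pitch ratio of 0.5, a duration ratio of 2.0 and a shift of -1.0 octaves. In other words, the corrected recording is twice as long and a full octave lower, which is quite enough to turn a woman or child into a man singing slowly.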

If you’re interested in hearing more of Scott’s recordings, or learning more about the researchers, check out the website of First Sounds, the group who did the playback.

¹Édouard-Léon Scott de Martinville (n.d.). [Source]

²Humanity’s First Recording of Its Own Voice (2014). [Source]

³M. Smit and agencies, Telegraph.Co.Uk (2008). [Source]

⁴NPR.Org (n.d.). [Source]

2 Comments on "The Voice in the Soot: Humanity’s Earliest Known Recording"

landwaster

Using Audacity, on the very final audio clip at the end of this article (Au Clair de la Lune, May 2010 release) apply Effect -> Sliding Stretch -> Initial Tempo 273, Final Tempo -59. This makes the tempo match how the song is actually sung and you can almost make out the words.

Hanner

Did you happen to post a recording of this somewhere?
