The Center for Grassroots Oversight

This page can be viewed at http://www.historycommons.org/entity.jsp?entity=matsumi_suzuki_1


Profile: Matsumi Suzuki

Matsumi Suzuki was a participant or observer in the following events:

The release of an audio message by a man thought to be Osama bin Laden (see November 12, 2002) sparks several publications to run stories about the authentication of the voice on the tape. These articles make several points about voice analysis of apparent bin Laden recordings:
• Machine analysis: Some aspects of voice identification are done by machine. Voice authentication software measures the acoustic qualities of a person’s voice, such as pitch, loudness, basic resonances, frequency, and amplitude. (Knight 11/13/2002; Kenneally 11/15/2002) This produces spectrographic information and can also be used to look for specific features of a voice, such as a nasal quality. In addition, every person creates the same sounds using a slightly different set of basic pitches, so the set of frequencies in bin Laden’s vowels, such as those of the “ea” in “fear,” will be marginally different from anyone else’s. By examining this frequency detail for every vowel and comparing it to previous examples, a machine analysis can tell whether the vowels match and were all spoken by him. (Kenneally 11/15/2002) However, “People hardly ever pronounce the same word the same way twice, even in the same utterance,” says Robert Berkovitz, a speech analyst with Sensimetrics Corp. (CBS News 11/13/2002)
• Human analysis: Some aspects of voice identification are done by humans, who are, according to Slate, “very good at doing the kind of thing most people do subconsciously—telling if someone comes from a particular region by recognizing basic vowel and consonant qualities.” For example, a human analyst can tell whether the “Ye” sound in “Yemen” is of the right length and stress for bin Laden’s dialect. (Kenneally 11/15/2002) Experts listen to previous recordings of bin Laden and compare them with the new tape syllable by syllable. (Knight 11/13/2002; Kenneally 11/15/2002) Experts can also verify whether the words on a tape generally match those that would be used by someone of bin Laden’s age and educational background. (Kenneally 11/15/2002)
• Quality of tape: According to Slate, the November tape is “allegedly very noisy and possibly went down a phone line at some point.” (Kenneally 11/15/2002) However, the New Scientist reports, “Voice analysis experts say the quality of the recording appears good enough to determine if the recording is genuine.” It also quotes Steve Cain of Forensic Tape Analysis, a company that received snippets of the tape from US media, who says, “It seems like it is at least clear enough and there’s enough amplitude of that unknown speaker’s voice that if you had a known sample of bin Laden it would be possible.” (Knight 11/13/2002)
• Splicing: Analysis can determine whether a tape has been spliced together. Potential red flags include hitches in timing and rhythm, removal of background noise, and changes in pitch that accommodate differences in background noise. (Kenneally 11/15/2002)
• The language of a recording makes no difference to voice analysis. (CBS News 11/13/2002)
• Uncertainty: The New Scientist quotes Tomi Kinnunnen, an expert in computer analysis of speech at the University of Joensuu, Finland, as saying: “There is always the possibility of error.… But if you have a clean sample with little noise, you can quite reliably say [who it is].” (Knight 11/13/2002) According to Slate, however, human and machine analyses can be “formidable,” but “neither type of analysis can say with 100 percent certainty that the speaker on the tape is bin Laden or anyone else.” (Kenneally 11/15/2002) CBS finds that intelligence analysts are convinced the tape is from bin Laden, but “they will never be sure,” because “Computer voice analysis lacks the accuracy of fingerprint or DNA identification and can be hamstrung by a skilled impersonator or low-quality recording.” “You can say with some probability, but you can never be sure,” says Kenneth Stevens, a Massachusetts Institute of Technology expert on speech analysis and synthesis. “Where there’s a combination of strong motivation and relatively weak science, there’s an opportunity for deception,” adds Berkovitz. “You can’t put the voice in a slot and have it come out saying, ‘This is Joe Smith.’” (CBS News 11/13/2002)
• One analyst, Matsumi Suzuki of Japan Acoustic Lab, Tokyo, says that, although the recording seems genuine, the speaker sounds ill. (Knight 11/13/2002)
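
As a rough illustration of the machine analysis described above, the following Python sketch estimates the pitch of a short voiced frame and checks whether two pitch measurements are close enough to be consistent with the same speaker. It is only a sketch under stated assumptions: autocorrelation is a standard textbook method for pitch estimation, not necessarily the method used by any analyst or product mentioned in the cited articles, and the function names, the 50 to 500 Hz search range, and the 5 percent tolerance are hypothetical choices made for this example. In keeping with the experts quoted above, the comparison only reports whether two measurements are consistent; it does not identify a speaker.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Generate a pure sine tone as a stand-in for a voiced frame (illustration only)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def estimate_pitch_hz(samples, sample_rate, min_hz=50.0, max_hz=500.0):
    """Estimate the fundamental frequency of a frame by autocorrelation.

    The search range (min_hz to max_hz) is an assumed range for speech,
    not a value taken from the cited articles.
    """
    min_lag = int(sample_rate / max_hz)
    max_lag = min(int(sample_rate / min_hz), len(samples) - 1)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def pitches_consistent(pitch_a_hz, pitch_b_hz, tolerance=0.05):
    """Return True if two pitch estimates differ by no more than `tolerance`.

    The 5 percent default is a hypothetical threshold. A match is only
    consistent with the same speaker; it is not proof of identity.
    """
    return abs(pitch_a_hz - pitch_b_hz) / max(pitch_a_hz, pitch_b_hz) <= tolerance


if __name__ == "__main__":
    rate = 8000  # samples per second, roughly telephone quality
    known = estimate_pitch_hz(synth_tone(200.0, rate, 800), rate)
    questioned = estimate_pitch_hz(synth_tone(205.0, rate, 800), rate)
    print("known sample pitch (Hz):", round(known, 1))
    print("questioned sample pitch (Hz):", round(questioned, 1))
    print("consistent:", pitches_consistent(known, questioned))
```

A real comparison would work vowel by vowel on several resonances rather than on a single pitch value, and, as Berkovitz notes, would have to allow for the fact that a speaker rarely pronounces the same word the same way twice.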


Creative Commons License Except where otherwise noted, the textual content of each timeline is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike