Purpose
The Nationwide Speech Project (NSP) corpus is a corpus of spoken language containing recordings of young male and female talkers from six regions of the United States. Speech samples include isolated words, sentences, passages, and interview speech. The purpose of the Nationwide Speech Project was to develop a corpus of spoken language that can be used in acoustic and perceptual studies of regional dialect variation in the United States (Clopper & Pisoni, 2006).
If you are interested in obtaining speech samples from the NSP corpus for use in acoustic, perceptual, or pedagogical projects, please contact Cynthia Clopper. Some of the materials are also available through the Linguistic Data Consortium. Please note that not all of the materials are currently available for distribution. Decisions regarding distribution will be made on a case-by-case basis.
Features of the NSP Corpus
Talkers
The talkers included in the NSP corpus were five male and five female lifetime residents of six dialect regions in the United States: New England, Mid-Atlantic, North, Midland, South, and West (see Figure 1). These regions are based on the dialect regions described in Labov, Ash, and Boberg's (2006) Atlas of North American English.
Figure 1. Map of the hometowns of the NSP talkers. Dark dots indicate male talkers. Light dots indicate female talkers.
Apart from gender and regional dialect, the talkers in the NSP corpus were fairly homogeneous. Table 1 provides demographic information about the 60 talkers.
Characteristic | Value |
Age | 18-25 years old |
Native Language | English |
Mother's Native Language | English |
Father's Native Language | English |
History of Hearing or Speech Disorder | None |
Race/Ethnicity | White |
Materials
A range of speech materials was obtained from each talker, including isolated words, sentences, passages, and interview speech. Examples of the materials collected for the NSP corpus are shown in Table 2.
Materials Set | Examples |
hVd Words (N=10) | heed, hid, head |
CVC Words (N=76) | mice, dome, bait |
Multisyllabic Words (N=112) (from Carter & Clopper, 2002) | alfalfa, nectarine |
High Predictability Sentences (N=102) (from Kalikow, Stevens, & Elliott, 1977) | Ruth had a necklace of glass beads. The swimmer dove into the pool. |
Low Predictability Sentences (N=52) (from Kalikow, Stevens, & Elliott, 1977) | Tom has been discussing the beads. She might consider the pool. |
Anomalous Sentences (N=52) (see Clopper et al., 2002) | Bill knew a can of maple beads. The jar swept up the pool. |
Passages (N=2) | Rainbow Passage (Fairbanks, 1940); Goldilocks Passage (Stockwell, 2002) |
Interview Speech (5 minutes) | hometown, hobbies, travel experiences |
Targeted Interview Speech (N=10 target words) | sleep, shoes, math |
Recording Conditions
All of the recordings were made in a sound-attenuated booth. Using homegrown software, the utterances were recorded in individual .aiff sound files on a Macintosh laptop at a sampling rate of 44.1 kHz with 16-bit encoding.
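To give a rough sense of what these recording settings imply for storage, the following is a minimal sketch that computes the size of uncompressed audio data at 44.1 kHz with 16-bit encoding. The channel count is not stated above, so mono is assumed here, and the function name pcm_bytes is purely illustrative. File header overhead in the .aiff container is ignored.

```python
# Recording parameters stated in the text.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
# Assumption: one channel (mono); the channel count is not given in the text.
CHANNELS = 1


def pcm_bytes(duration_s, sample_rate_hz=SAMPLE_RATE_HZ,
              bits_per_sample=BITS_PER_SAMPLE, channels=CHANNELS):
    """Return the size in bytes of uncompressed PCM audio data (headers excluded)."""
    n_samples = int(round(duration_s * sample_rate_hz))
    bytes_per_sample = bits_per_sample // 8
    return n_samples * bytes_per_sample * channels


if __name__ == "__main__":
    per_second = pcm_bytes(1)
    interview = pcm_bytes(5 * 60)  # one 5-minute interview sample
    print(f"Data rate: {per_second} bytes per second")
    print(f"5-minute interview: {interview / 1_000_000:.2f} MB")
```

Under the mono assumption, one second of audio occupies 88,200 bytes and a 5-minute interview sample roughly 26.5 MB; a stereo recording would double both figures.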
Acoustic Vowel Data
The acoustic vowel data summarized by Clopper, Pisoni, and de Jong (2005) is available here.
References
Carter, A. K., & Clopper, C. G. (2002). Prosodic effects on word reduction. Language and Speech, 45, 321-353.
Clopper, C. G., Carter, A. K., Dillon, C. M., Hernandez, L. R., Pisoni, D. B., Clarke, C. M., Harnsberger, J. D., & Herman, R. (2002). The Indiana Speech Project: An overview of the development of a multi-talker multi-dialect speech corpus. Research on Spoken Language Processing Progress Report No. 25 (pp. 367-380). Bloomington, IN: Speech Research Laboratory, Indiana University.
Clopper, C. G., & Pisoni, D. B. (2006). The Nationwide Speech Project: A new corpus of American English dialects. Speech Communication, 48, 633-644.
Clopper, C. G., Pisoni, D. B., & de Jong, K. (2005). Acoustic characteristics of the vowel systems of six regional varieties of American English. Journal of the Acoustical Society of America, 118, 1661-1676.
Fairbanks, G. (1940). Voice and Articulation Drillbook. New York: Harper.
Kalikow, D. N., Stevens, K. N., & Elliott, L. L. (1977). Development of a test of speech intelligibility in noise using sentence materials with controlled word predictability. Journal of the Acoustical Society of America, 61, 1337-1351.
Labov, W., Ash, S., & Boberg, C. (2006). Atlas of North American English. Berlin: Mouton de Gruyter.
Stockwell, P. (2002). Sociolinguistics: A Resource Book for Students. London: Routledge.
Links to Related Pages
Lay Language Paper on the NSP for the Acoustical Society of America
The Do You Speak American? Project at PBS
Linguistic Atlas Projects in the United States
Speech Accent Archive at George Mason University
International Dialects of English Archive at the University of Kansas
Contact Information
For more information about the NSP or to obtain materials from the NSP corpus, please contact Cynthia Clopper (clopper.1 AT osu.edu).