The features of phonemes result from the ways in which they are articulated. What properties of the acoustic stimulus encode these articulatory features? This issue has been particularly well researched in the case of voicing. In the pronunciation of such consonants as /b/ and /p/, two things happen: The closed lips open, releasing air, and the vocal cords begin to vibrate (voicing). In the case of the voiced consonant /b/, the release of air and the vibration of the vocal cords are nearly simultaneous. In the case of the unvoiced consonant /p/, the release occurs 60 ms before the vibration begins. What we are detecting when we perceive a voiced versus an unvoiced consonant is the presence or absence of a 60-ms interval between release and voicing. This period of time is referred to as the voice-onset time. Similar differences exist in other voiced-unvoiced pairs, such as /d/ and /t/. Again, the factor controlling the perception of a phoneme is the delay between the release of air and the vibration of the vocal cords.
Lisker and Abramson (1970) performed experiments with artificial (computer-generated) stimuli in which the delay between the release of air and the onset of voicing was varied from –150 ms (voicing occurred 150 ms before release) to +150 ms (voicing occurred 150 ms after release). The participant’s task was to identify which sounds were /b/’s and which were /p/’s. Throughout most of the continuum, participants agreed 100% on what they heard, but there was a sharp switch from /b/ to /p/ at about 25 ms. At a 10-ms voice-onset time, participants were in nearly unanimous agreement that the sound was a /b/; at 40 ms, they were in nearly unanimous agreement that the sound was a /p/. Because of this sharp boundary between the voiced and unvoiced phonemes, perception of this feature is referred to as categorical. Categorical perception is the perception of stimuli as belonging in distinct categories and the failure to perceive the gradations among stimuli within a category.
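The shape of such an identification function can be illustrated with a small sketch. It models the proportion of /p/ responses as a logistic function of voice-onset time. The logistic form and the slope value are assumptions chosen for illustration, not values fitted to Lisker and Abramson's data; only the boundary near 25 ms comes from the text, and the names `prob_p` and `label` are hypothetical.

```python
import math

BOUNDARY_MS = 25.0   # category boundary reported in the text
SLOPE_PER_MS = 0.5   # hypothetical steepness, chosen only for illustration


def prob_p(vot_ms, boundary=BOUNDARY_MS, slope=SLOPE_PER_MS):
    """Modeled proportion of /p/ responses at a given voice-onset time (ms)."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary)))


def label(vot_ms, boundary=BOUNDARY_MS):
    """Category a listener would most likely report under this model."""
    return "p" if vot_ms > boundary else "b"


if __name__ == "__main__":
    # Sample the continuum at the endpoints and around the boundary.
    for vot in (-150, 10, 25, 40, 150):
        print(f"VOT {vot:+4d} ms -> P(/p/) = {prob_p(vot):.3f}, label /{label(vot)}/")
```

With a steep slope, the modeled responses are close to unanimous at 10 ms and at 40 ms and change rapidly only in the narrow region around the boundary, which is the pattern the identification data show.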
Other evidence for categorical perception of speech comes from discrimination studies (see Studdert-Kennedy, 1976, for a review). People are very poor at discriminating between a pair of /b/’s or a pair of /p/’s that differ in voice-onset time but are on the same side of the phonemic boundary. However, they are good at discriminating between pairs that have the same difference in voice-onset time but one item of the pair is on the /b/ side of the boundary and the other item is on the /p/ side. It seems that people can identify the phonemic category of a sound but cannot discriminate sounds within that phonemic category. Thus, people are able to discriminate two sounds only if they fall on different sides of a phonemic boundary.
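The claim in the last sentence can be written as a simple rule. The sketch below is an idealization, assuming a fixed /b/-/p/ boundary at 25 ms (the approximate value from the identification data) and all-or-none discrimination based only on category labels; the function names are hypothetical.

```python
BOUNDARY_MS = 25.0  # approximate /b/-/p/ boundary from the identification data


def category(vot_ms):
    """Phonemic category assigned to a stimulus with this voice-onset time."""
    return "p" if vot_ms > BOUNDARY_MS else "b"


def discriminable(vot_a_ms, vot_b_ms):
    """Idealized rule: two sounds are told apart only if their labels differ."""
    return category(vot_a_ms) != category(vot_b_ms)


if __name__ == "__main__":
    # Every pair differs by the same 20 ms of voice-onset time.
    for pair in ((0, 20), (15, 35), (40, 60)):
        print(pair, "discriminable:", discriminable(*pair))
```

All three pairs have the same physical difference, but only the pair that straddles the boundary counts as discriminable under this rule.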
There are at least two views of exactly what is meant by categorical perception, which differ in the strength of their claims about the nature of perception. The weaker view is that we experience stimuli as coming from distinct categories. There seems to be little dispute that the perception of phonemes is categorical in this sense. A stronger viewpoint is that we cannot discriminate among stimuli within a category. Massaro (1992) has taken issue with this viewpoint, and he has argued that there is some residual ability to discriminate within categories. While there is discriminability within categories, it is typical to find that people can better make discriminations that cross category boundaries (Goldstone & Hendrickson, 2010). Thus, there is increased discriminability between categories (acquired distinctiveness) and decreased discriminability within categories (acquired equivalence).
Another line of research that provides evidence for use of the voicing feature in speech recognition involves an adaptation paradigm. Eimas and Corbit (1973) had their participants listen to repeated presentations of the sound da, which involves the voiced consonant /d/. The experimenters reasoned that, if there were a voicing detector, the constant repetition of the voiced consonant might fatigue it so that it would require a stronger indication of voicing. They presented participants with a series of artificial sounds that spanned the acoustic range across distinct categories of phonemes that differed only in voicing, such as the range between ba and pa (as in the Lisker & Abramson, 1970, study mentioned earlier). Participants then indicated whether each of these artificial stimuli sounded more like ba or more like pa. Eimas and Corbit found that participants now called some of the stimuli the voiceless pa that they would normally have called the voiced ba. Thus, the repeated presentation of da had fatigued the voiced feature detector and raised the threshold for detecting voicing in ba, making many former ba stimuli sound like pa.
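One way to picture this result is as a shift of the category boundary along the voice-onset-time continuum. The sketch below assumes a baseline boundary of 25 ms (the approximate value from the identification data) and a 5-ms shift after adaptation; the size of the shift and the function name are hypothetical, chosen only to illustrate the direction of the effect, and are not figures from Eimas and Corbit.

```python
BASELINE_BOUNDARY_MS = 25.0  # approximate ba/pa boundary from the identification data
ADAPTATION_SHIFT_MS = 5.0    # hypothetical size of the shift after hearing da repeatedly


def classify(vot_ms, adapted_to_voiced=False):
    """Label a stimulus as ba or pa, optionally after adaptation to a voiced sound."""
    boundary = BASELINE_BOUNDARY_MS
    if adapted_to_voiced:
        # A fatigued voicing detector needs stronger evidence of voicing,
        # so stimuli near the boundary no longer count as voiced.
        boundary -= ADAPTATION_SHIFT_MS
    return "pa" if vot_ms > boundary else "ba"


if __name__ == "__main__":
    for vot in (10, 22, 40):
        print(f"VOT {vot} ms: before = {classify(vot)}, "
              f"after adaptation = {classify(vot, adapted_to_voiced=True)}")
```

Only stimuli near the boundary change their label; stimuli well inside either category are classified the same way before and after adaptation.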
Although there is general consensus that speech perception is categorical in some sense, there is considerable debate about what the mechanism is behind this phenomenon. Some researchers (e.g., Liberman & Mattingly, 1985) have argued that this reflects special speech perception mechanisms that enable people to perceive how the sounds were generated. Consider, for instance, the categorical distinction between how voiced and unvoiced consonants are produced: either the vocal cords vibrate during the consonant or they do not. This has been used to argue that we perceive voicing by perceiving how the consonants are spoken. However, there is evidence that categorical perception is not tied to humans processing language but rather reflects a general property of how certain sounds are perceived. For instance, Pisoni (1977) created nonlinguistic tones with a distinguishing acoustic feature similar to the one present in voicing: a low-frequency tone that is either simultaneous with a high-frequency tone or lags it by 60 ms. His participants showed abrupt boundaries like those found for speech signals. In another study, Kuhl (1987) trained chinchillas to discriminate between a voiced da and an unvoiced ta. Even though these animals do not have a human vocal tract, they showed the sharp boundary between these stimuli that humans do. Thus, it seems that categorical perception depends on neither the signal being speech (Pisoni, 1977) nor the perceiver having a human vocal system (Kuhl, 1987). Diehl, Lotto, and Holt (2004) have argued that the phonemes we use are chosen because they match up with boundaries already present in our auditory system. So it is more a case of our perceptual system determining our speech behavior than vice versa.
■ Speech sounds differing on continuous dimensions are perceived as coming from distinct categories.
Friday, 15 March 2019
Perception- Categorical Perception
Cognitive and effective processes