
Distinctive Feature Theory

Mark Tatham

[copyright © 1999 Mark Tatham]


Classical Phonetics used the place-manner classification system for consonants and the high-low / front-back system for vowels. The main purpose here was clearly to enable the phonetician to specify how particular sounds were made with respect to their articulation. There was, however, an important spin-off from these systems: it became possible to use the features or parameters of the classification system to label whole sets of sounds or articulations. Thus we might refer to: 'the set of all plosives' (seven in English - which is the seventh?), or 'the set of all voiced plosives' (three in English), or 'the set of all voiced alveolar plosives' (one only in English) - and so on, cutting horizontally and vertically around the consonant matrix. Similarly, for vowels, 'the set of all front vowels', or 'the set of all rounded vowels', and so on.
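
The idea of labelling whole sets of sounds by their classification parameters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table lists just six of the English plosives, the label names (manner, place, voiced) are chosen for this sketch, and the function name select is hypothetical.

```python
# Illustrative table (an assumption for this sketch): six English plosives,
# labelled by manner, place and voicing in the place-manner style.
CONSONANTS = {
    "p": {"manner": "plosive", "place": "bilabial", "voiced": False},
    "b": {"manner": "plosive", "place": "bilabial", "voiced": True},
    "t": {"manner": "plosive", "place": "alveolar", "voiced": False},
    "d": {"manner": "plosive", "place": "alveolar", "voiced": True},
    "k": {"manner": "plosive", "place": "velar", "voiced": False},
    "g": {"manner": "plosive", "place": "velar", "voiced": True},
}


def select(table, **labels):
    """Return the set of sounds whose entries match every given label."""
    return {
        sound
        for sound, props in table.items()
        if all(props.get(name) == value for name, value in labels.items())
    }


# 'the set of all voiced plosives'
voiced_plosives = select(CONSONANTS, manner="plosive", voiced=True)

# 'the set of all voiced alveolar plosives'
voiced_alveolar_plosives = select(
    CONSONANTS, manner="plosive", place="alveolar", voiced=True
)
```

Adding a label narrows the set, which is the 'cutting horizontally and vertically around the consonant matrix' described above.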

As a consequence of being able to label sets of sounds in this way it became possible to describe the behaviour of various sets. So, for example, it was possible to say that the set of voiced plosives devoices in word-final position, or that all vowels lengthen before voiced plosives in the same syllable, and so on. Rules no longer had to be stated in terms of the contextual behaviour of individual sounds - they could be stated in terms of how sets or classes of sounds behave. We now had the ability to capture and express generalisation - an important theoretical principle in linguistics: generalisations must be expressed whenever possible.
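
A rule stated over a class rather than over individual sounds might be sketched as follows. The sketch takes the word-final devoicing example from the text purely as an illustration; the mapping DEVOICE, the function name devoice_final and the representation of a word as a list of segment symbols are all assumptions made for this example.

```python
# Hypothetical mapping from each voiced plosive to its voiceless counterpart.
# The keys of this mapping stand in for the class 'voiced plosives'.
DEVOICE = {"b": "p", "d": "t", "g": "k"}


def devoice_final(segments):
    """Apply one rule to the whole class: a voiced plosive devoices word-finally.

    `segments` is a list of segment symbols; a new list is returned.
    """
    if segments and segments[-1] in DEVOICE:
        return segments[:-1] + [DEVOICE[segments[-1]]]
    return list(segments)


result = devoice_final(["b", "a", "d"])
```

The point of the sketch is that one statement covers every member of the class: no separate rule is written for [b], [d] and [g].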

It was not until Transformational Generative Grammar came along, though, that these generalisations became formalised in phonological theory. Morris Halle's 'Sound Pattern of Russian' [The Hague: Mouton, 1959] was really the first influential textbook in modern phonological theory (just two years after Noam Chomsky's 'Syntactic Structures' [The Hague: Mouton, 1957], the first influential textbook in modern syntactic theory). The Generative Phonologists adopted the theory of distinctive features from the earlier Prague School of Linguistics (see N.S. Trubetskoy, Grundzüge der Phonologie, Göttingen: Vandenhoeck and Ruprecht [1958] - appearing also in translation: Principes de Phonologie and Principles of Phonology) - a much more formal representation than that of the classical phoneticians.

There is a good description of modern distinctive feature (DF) theory in Understanding Phonology by C. Gussenhoven and H. Jacobs [London: Arnold, 1998], Chapter 5, and the landmark description is in The Sound Pattern of English by N. Chomsky and M. Halle [New York: Harper and Row, 1968 and Cambridge, Mass.: MIT Press, 1991], Chapter 7. The classic text on DF theory is R. Jakobson, G. Fant and M. Halle's Preliminaries to Speech Analysis [Cambridge, Mass.: MIT Press, 1963].

Distinctive Feature Theory 

The use of distinctive features in phonology enables us to capture 'natural classes', and, by extension, to generalise regularly occurring phenomena and to formulate predictions about the behaviour of class members. If we wanted to hypothesise about human processing of phonology we would use this idea to suggest that human beings process the patterns of phonology as part of speech planning in terms of these classes rather than in terms of individual segments. The regularity of patterning in phonology is part of the evidence for this claim - but the claim is more solid when based on the evidence that when the users of a language make up new words they do so by producing utterances which obey the rules of the natural classes their sounds fall into.

There have been various sets of distinctive features proposed as the parameters of segment description and classification. The original set appeared in Jakobson, Fant and Halle (see above), and consisted of around 14 features. Chomsky and Halle (see above) had around 45 features, explaining that they found the original set of 14 somewhat inappropriate for characterising some subtleties in phonology.

Most modern phonologists argue (following Jakobson, Fant and Halle) for a binary system of indexing features: a segment either possesses or does not possess any one particular feature. Clearly, with a binary system of indexing, no more than six features would be needed to uniquely classify the sounds of a language like English (with around 45 phonemes), since six binary features can distinguish 2^6, or 64, segments. More would be needed to uniquely classify the sounds of all the languages of the world or indeed all possible human languages. Larger sets of features were chosen because it was felt that it was appropriate to sacrifice mathematical simplicity in favour of a feature labelling system which appeared to relate these phonological features to the phonetic set of Classical Phonetics. Thus the meaning of the features became more transparent.
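
The arithmetic behind the figure of six can be checked with a short sketch. The function name features_needed is hypothetical, and the figure of 45 phonemes is simply the approximate count given in the text.

```python
def features_needed(n_segments):
    """Smallest number of binary features that can distinguish n_segments.

    k binary features give 2**k distinct combinations, so we need the
    smallest k with 2**k >= n_segments. For n >= 1 that is the bit length
    of n - 1, computed here with exact integer arithmetic.
    """
    if n_segments < 1:
        raise ValueError("need at least one segment")
    return (n_segments - 1).bit_length()


n = features_needed(45)   # number of binary features for about 45 phonemes
capacity = 2 ** n         # how many segments that many features can distinguish
```

With five features only 32 combinations are available, which is too few for 45 phonemes; six features give 64, leaving some combinations unused.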

These ideas are embodied in three principles surrounding the distinctive feature set:

  1. It should be able to characterise all contrasting segments in human languages;
  2. It should be able to capture natural classes in a clear fashion;
  3. It should be transparent with regard to phonetic correlates.

The distinctive feature set most usually found is approximately that of M. Halle and G.N. Clements 'Problem Book in Phonology' [Cambridge, Mass.: MIT Press, 1983], which is based on the Chomsky and Halle set. You can read in detail about these features in the Gussenhoven and Jacobs book. Chomsky and Halle have a lengthy description of their own set.


Redundancy is an important aspect of phonology which is captured by the use of distinctive features. Consider for example the fact that all segments in English which are [+nasal] are also [+voice]. We could say that to specify [+voice] for segments like [m] and [n] is to fail to capture this redundancy. The main distinctive feature here is the nasality - the voicing is secondary and entirely predictable: all nasal consonants are voiced. [Remember we are talking abstract phonology, not phonetics.]

Remember that one of our principles was to set up a system to capture all the segmental contrasts in the world's languages. Well, we can do that and show where there is no contrast: there is no contrast, nor possibility of contrast, where there is redundancy. If nasals are always voiced, then there cannot be a contrast with voiceless nasals. Two things follow from this in the way we use features in the theory:

Omitting feature markings where there is redundancy means literally leaving the redundant cells blank in the distinctive feature matrix. The fact of the redundancy is captured by separate rules which take the general form:

if X then Y

or, in our specific example:

if [+nasal] then [+voice].
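
A minimal sketch of how such a rule might fill the blank cells of a feature matrix is given below. It is an illustration only: the tiny matrix, the use of None for a blank cell, the encoding of a rule as a pair of (feature, value) pairs and the function name fill_blanks are all assumptions made for this example, not part of any standard notation.

```python
# Underspecified matrix: True is '+', False is '-', None is a blank cell.
MATRIX = {
    "m": {"nasal": True, "voice": None},
    "n": {"nasal": True, "voice": None},
    "b": {"nasal": False, "voice": True},
    "p": {"nasal": False, "voice": False},
}

# Redundancy rules of the form 'if X then Y', each written as
# ((feature, value), (feature, value)). Here: if [+nasal] then [+voice].
RULES = [(("nasal", True), ("voice", True))]


def fill_blanks(matrix, rules):
    """Return a copy of the matrix with blanks filled where a rule applies."""
    filled = {seg: dict(feats) for seg, feats in matrix.items()}
    for feats in filled.values():
        for (if_feat, if_val), (then_feat, then_val) in rules:
            if feats.get(if_feat) == if_val and feats.get(then_feat) is None:
                feats[then_feat] = then_val
    return filled


FULL = fill_blanks(MATRIX, RULES)
```

The rule only fills cells that were left blank, so contrastive markings such as the voicing of [b] and [p] are untouched, and the original matrix is left as it was.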

But why would we want to capture this redundancy, except to show that it is a regularity in the way segmental features pattern? We want to do this because it is easy to show that speakers of a language know about the redundancy. Let's look at an example: there are three nasals in English, the alveolar nasal stop [n], the bilabial nasal stop [m] and the velar nasal stop [ng]. If we ask an English speaker to 'invent' a new nasal - say, a palatal nasal like the one found in French 'agneau' - they will automatically make it [+voice]. It's as though they know that nasals must be voiced - which is another way of stating the rule above.

The original important text on this point was R. Stanley's 'Redundancy rules in phonology', published in the journal Language in 1967 (vol. 43). These redundancy rules were called 'segment structure rules' to contrast them with another type: 'sequence structure rules'. The latter capture a speaker's knowledge of redundancy in the specification of segments when they occur in patterned sequences. Thus if we have the sequence CCCV... at the beginning of a syllable in English, then the first C must be [s] - a completely redundant situation, since all we need to know is that there is a consonant there, followed by two others. In fact there are heavy constraints also, of course, on the remaining two consonants: the second one must be a plosive and the third must be a liquid of some sort (or a semi-vowel). Some phonologists have pointed out that the onset of a syllable strings consonants together with increasing sonority until you get to the vowel nucleus (the most sonorous segment), followed by a coda of consonants of decreasing sonority - though there are exceptions to this principle.
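
The slot-by-slot constraints on a CCCV... onset can be sketched as a simple check. The sets below are assumptions made for this example: in particular the second slot is restricted to the voiceless plosives, which is narrower than 'a plosive' as stated above, and the function name is hypothetical. The check over-generates, since it encodes only the constraint on each slot and not further restrictions on which combinations actually occur.

```python
# Assumed inventories for this sketch.
PLOSIVES = {"p", "t", "k"}                  # assumption: voiceless plosives only
LIQUIDS_AND_GLIDES = {"l", "r", "w", "j"}   # liquids and semi-vowels


def is_possible_ccc_onset(onset):
    """Check a three-consonant onset against the slot-by-slot constraints.

    `onset` is a sequence of three segment symbols, e.g. ["s", "t", "r"].
    """
    if len(onset) != 3:
        return False
    first, second, third = onset
    return (
        first == "s"
        and second in PLOSIVES
        and third in LIQUIDS_AND_GLIDES
    )


ok = is_possible_ccc_onset(["s", "t", "r"])    # as in 'string'
bad = is_possible_ccc_onset(["t", "s", "r"])   # first slot is not [s]
```

Because the first slot admits only [s], knowing that a consonant stands there already tells us which consonant it is, which is the redundancy the sequence structure rules are meant to capture.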

Make a distinction, then, between using features to characterise the contrastive properties of phonological segments and using them to indicate redundancy. An 'incomplete' distinctive feature matrix uses blanks to indicate redundancy (and lets you know which cells are the subject of redundancy rules), whereas a 'fully specified' distinctive feature matrix has all cells filled with either a + or a - .

There is much more to Distinctive Feature Theory than is dealt with here. You should consult some of the recommended textbooks on phonology to find a fuller treatment - or attend one of the Department's courses in phonology! The main thing to remember is that DF theory is a significant step forward in classification from the rather crude phonetically-based ideas of Classical Phonetics. Remember, however, that it is essentially a concept in abstract phonology (rather than phonetics): its principal importance lies in how it lends itself to capturing the generality of phonological processes and the structure of segments in phonology.