Publications of Dr. Martin Rothenberg:

The Control of Air Flow During Loud Soprano Singing

Martin Rothenberg, *Donald Miller, Richard Molitor, and *Dolores Leffingwell
Journal of Voice, Vol. 1, No. 3, pp. 262-268, 1987.

Department of Electrical Engineering, *Department of Music, Syracuse University, Syracuse, New York

Previous research on the special characteristics of the professional singing voice has at least partially explained how singers can commonly use much higher lung pressures than nonsingers without vocal damage or excessive air flow during the voiced sounds. In this study, the control of air flow during the unvoiced consonants is examined for an operatic-style soprano. It was found that this singer could maintain a low average air flow during the consonants even though the lung pressure reached values over five times those used during normal conversational speech. The air flow was kept low primarily by the use of a number of mechanisms involving rapid, accurate, coordinated valving of the air flow at the point of articulation and at the glottis.
Key Words: Singing voice, Soprano, Pressure, Air flow, Glottis.

During the past year, our laboratory has been especially interested in what happens in loud soprano singing, particularly toward the top of the range, where lung pressures are quite high. Figure 1 shows how air pressure varies in different types of vocalization. In conversational speech, lung pressure is generally below ~10 cm H2O. In loud speech, one may use pressures of ~10-15 cm H2O (1). In loud singing, however, pressures can exceed 40 or 50 cm H2O. A typical nonsinger with a "weak" but otherwise medically normal voice who tried to vocalize with pressures as high as 40 cm H2O would soon begin to experience throat irritation, and some degree of hoarseness might be expected to follow. In addition, one might expect an excessive expenditure of air at such high pressures, which would shorten the length of line that could be sung in one breath. How, then, does a singer control the air flow at high lung pressures so that the voice is not hurt? We are considering only the control of air flow and not the numerous other esthetic factors that may be important in singing. Air flow can be reduced in several different ways during voiced segments, as described in Fig. 1 (bottom). One way to reduce air flow is to increase the adductive tension, that is, to produce a pressed voice, but if continued over a prolonged period, a very pressed voice may fatigue and perhaps injure the adductive laryngeal musculature. A professional singer may be able to do something different to reduce air flow.

In male voices, and perhaps in female voices in the lower ranges, we have shown in previous discussions and articles that air flow can be controlled by using the principle of inertive acoustic loading, in which the inertia of air flow in the respiratory system and vocal tract, especially nearer to the larynx, is used to reduce air flow and strengthen the voice quality. Thus, a principle does exist that is probably active in the lower pitch ranges and can reduce the air flow without sacrificing carrying power of the voice or causing vocal problems. Sopranos in the higher pitch ranges can use yet another principle to reduce air flow safely. This is the principle of tuning the first formant (the lowest vocal tract resonance) to closely match the voice fundamental frequency (2). This mechanism can reduce both the average air flow and the peak air flow during the glottal cycle.



It is known, and Johan Sundberg (3) has documented with some precision, that sopranos in their higher pitch ranges tend to tune the lowest vocal tract resonance to the pitch being sung. The effect of such tuning on air flow was not generally recognized until recently, however. This effect is illustrated in Fig. 2. If the glottis (the space between the vocal folds) opens and closes regularly, with a complete, or nearly complete, closure between the open periods, and if the vocal tract tuning is such that the first formant is tuned to the voice fundamental frequency, the returning air pressure pulse from the previous glottal open periods is in such a phase as to oppose the air flow through the glottis and reduce air flow. This results in a louder voice being produced by means of a reduced air flow. The solid line in Fig. 2 represents what air flow would be if this principle were not active, that is, if this tuning did not occur. With the tuning, when the vocal folds are open for the second air pulse, the returning air pulse from the first pulse (and any previous pulses) causes a back pressure that drives down the air flow.

The mechanical soprano in Fig. 3 can be used to illustrate this principle, since readers probably have an intuition for this resonance effect from daily life but may not yet have made the proper associations. The spring and weight represent the soprano's vocal tract, with the glottis at the hand. The singer's mouth is the end of the weight, with the motion of the weight analogous to air flow at the mouth. The hand going down simulates an air pulse out of the glottis. If the hand is moved up and down very slowly, the air going out of the mouth (the motion of the weight) follows the air coming out of the larynx, and there is no returning pressure pulse (compression of the spring). But as the hand is moved faster and approaches the resonant frequency of the spring, the motion of the hand and that of the weight become out of phase; when the hand goes down, the weight is coming back up. As the hand goes down to simulate the air coming out of the glottis, the weight is compressing the spring and pressing back up, thus reducing hand motion (reducing the glottal air flow).
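The phase relation in this spring-and-weight analogy can be sketched numerically. The following is a minimal sketch only, assuming a linear, lightly damped spring-mass system driven by the hand's displacement; the damping ratio of 0.1 and the function name `weight_response` are illustrative assumptions and are not taken from Fig. 3 or from any measurement.

```python
import math

def weight_response(freq_ratio, damping_ratio=0.1):
    """Steady-state response of the weight relative to the hand.

    freq_ratio is (hand frequency) / (resonant frequency of spring and weight).
    damping_ratio is an assumed value chosen only for illustration.
    Returns (amplitude_ratio, phase_lag_in_degrees).
    """
    r = freq_ratio
    real = 1.0 - r * r               # stiffness term minus inertia term
    imag = 2.0 * damping_ratio * r   # damping term
    amplitude = 1.0 / math.hypot(real, imag)
    phase_lag = math.degrees(math.atan2(imag, real))
    return amplitude, phase_lag

# Slow hand motion: the weight follows the hand almost exactly.
# Near resonance: the weight lags by about a quarter cycle and moves much more.
for r in (0.1, 0.5, 0.9, 1.0):
    amp, lag = weight_response(r)
    print(f"drive/resonance = {r:.1f}: amplitude x{amp:.2f}, lag {lag:.1f} deg")
```

In this idealized model the lag grows from nearly zero at slow hand motion to a quarter cycle at resonance, which corresponds to the weight "coming back up" against the descending hand in the description above.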

Another demonstration is possible using a child's toy: a rubber ball connected to a paddle by a long rubber string. To keep the ball rebounding from the paddle, the operator must move the paddle at the resonant frequency of the string and ball. The ball will then always come back in time to oppose the motion of the paddle.

Lately, we have also been investigating how air flow might be reduced in consonants. If a soprano uses four or five times the lung pressure in singing that she uses in speech, one may also expect a lot more air flow during those consonants for which the vocal folds abduct. Consider first an unvoiced consonant in which the glottis and vocal tract are both open, as in the /h/ of /aha/. If this is said slowly in a very loud voice, one can lose most of the air stored in the lungs during the /h/. In saying /apa/, produced with an aspirated /p/, one could lose a great deal of air during the release of the /p/ if the lung pressure were high.

What techniques do trained singers use to modify their articulations to control this air flow in order not to lose an excessive amount of air during unvoiced consonants produced with an abduction of the vocal folds? Figure 4 shows how air flow is basically controlled during such unvoiced consonants. In the simplest case, that of an /h/, no strong vocal tract constriction reduces the air flow. Figure 4 shows what happens to glottal area and the resulting air flow during /aha/. At the onset of the /h/, the glottal area increases, with the glottal air flow pulses getting more and more breathy and progressively weaker (after a possible initial small increase in amplitude), until finally the vocal fold oscillations may stop altogether, even though there is air flow through the glottis. Then, as the vocal folds adduct once again, the air flow decreases and vocal fold oscillations resume. During a fully articulated /h/, maximum air flow occurs near the middle of the /h/, along with no voice. The air flow is high because the remainder of the vocal tract is relatively open, so there is no constriction above the glottis to reduce the flow (although this is not quite true in all cases). Thus, in an /h/, the rise and fall of air flow generally follows the increase and decrease in glottal area; as the glottis opens and then closes, the average air flow increases and then decreases.



During most other unvoiced consonants, however, a vocal tract constriction is made while the glottis is open. For example, if one makes a constriction between the tongue tip and the forward edge of the palate, or alveolar ridge, /asa/ can be produced instead of /aha/. Air flow will start to increase as the larynx opens but, as the constriction for the /s/ is made, air flow will decrease again, only to increase once again as the tongue separates from the palate. The final decrease in air flow is caused by readduction of the vocal folds. Thus, two simultaneous motions, the tongue motion to make the /s/ and the opening-closing motion of the larynx, must be closely coordinated. Keep in mind the resulting pattern of air flow for the /s/, as shown in Fig. 4, when the pattern appears again in our measurements. Air flow increases, with a pronounced dip in the middle that coincides with the maximum constriction of the /s/.


An example of an unvoiced stop consonant is /t/. The primary difference between the production of an /s/ and a typical production of /t/ is that a complete vocal tract closure is attained in the /t/, so that the articulatory constriction reaches zero area. Thus, the sequence /ata/ results when a closure of the tongue tip to the alveolar ridge is added to an approximately simultaneous abduction-adduction gesture of the vocal folds. The resulting air flow pattern will be an increase in air flow toward the end of the vowel, as the vocal folds begin to abduct, and then a rather sudden drop to zero flow as the tongue blocks the air flow. The explosion of the /t/ then results in the even more sudden onset of a period of high air flow (the aspiration). The air flow finally decreases again as the vocal folds readduct to normal voicing. In this way, adding a tongue closure at the alveolar ridge changes /aha/ to /ata/.
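The three flow patterns described for /aha/, /asa/, and /ata/ can be reproduced qualitatively with a very simple calculation. The sketch below is an assumption-laden illustration, not the model behind Fig. 4: it treats the glottis and the articulatory constriction as two orifices in series with quasi-steady flow and no pressure recovery, holds lung pressure constant, and uses made-up areas, timing, and an approximate air density. All names and numbers in it are hypothetical.

```python
import math

CM_H2O_TO_PA = 98.0665   # pascals per cm H2O
AIR_DENSITY = 1.2        # kg/m^3, approximate value (assumed)

def flow_l_per_s(pressure_cm_h2o, glottal_cm2, constriction_cm2):
    """Flow through two orifices in series (assumed quasi-steady model)."""
    if glottal_cm2 <= 0.0 or constriction_cm2 <= 0.0:
        return 0.0  # a complete closure anywhere blocks the flow
    a_g = glottal_cm2 * 1e-4       # cm^2 -> m^2
    a_c = constriction_cm2 * 1e-4
    pressure_pa = pressure_cm_h2o * CM_H2O_TO_PA
    effective_area = 1.0 / math.sqrt(1.0 / a_g**2 + 1.0 / a_c**2)
    return 1000.0 * effective_area * math.sqrt(2.0 * pressure_pa / AIR_DENSITY)

def glottal_area(t, peak_cm2=0.5):
    """Abduction-adduction gesture over normalized time t in [0, 1]."""
    return peak_cm2 * math.sin(math.pi * t) ** 2

def tract_area(t, min_cm2, start=0.3, end=0.7, open_cm2=3.0):
    """Open vocal tract, except for a constriction held from start to end."""
    return min_cm2 if start <= t <= end else open_cm2

def flow_pattern(min_cm2, pressure_cm_h2o=8.0, steps=10):
    """Average flow (L/s) sampled across the consonant."""
    return [
        flow_l_per_s(pressure_cm_h2o,
                     glottal_area(i / steps),
                     tract_area(i / steps, min_cm2))
        for i in range(steps + 1)
    ]

patterns = {
    "/aha/": flow_pattern(3.0),   # no constriction above the glottis
    "/asa/": flow_pattern(0.1),   # narrow fricative constriction
    "/ata/": flow_pattern(0.0),   # complete closure
}
for name, flows in patterns.items():
    print(name, " ".join(f"{u:.2f}" for u in flows))
```

With these assumed values, the /aha/ row rises and falls with glottal area, the /asa/ row shows a dip in the middle, and the /ata/ row drops to zero during the closure. The abrupt constriction used here is a simplification; a real /t/ release is faster than its closure, which this sketch does not attempt to capture.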

Figure 4 (right) lists the various ways in which air flow can be kept low during intervocalic unvoiced consonants similar to those shown in Fig. 4 if lung pressure must be kept high for proper production of neighboring vowels. These different methods are later discussed further in conjunction with an actual example.

Actual measurements during speaking and singing are shown in Figs. 5 and 6. The traces presented are from one of the two singers tested and are typical of those from both singers in the aspects discussed. Figures 5 and 6 are examples of our soprano's rendition of the very popular phrase "I saw Hap pat his hidden map again." (The first thing one must learn in voice research is how to make up sentences that have all the sounds that one wishes to measure.) The melody chosen for the sung phrases had a general rising trend followed by a falling trend, with a maximum pitch on the word "his."



Figure 5 illustrates a spoken version of the sentence. The lower trace is the air flow, as measured at the mouth with a wire-screen pneumotachograph mask designed to produce only a small distortion of the voice. The top trace is a measure of lung pressure obtained by placing a miniature pressure sensor in the esophagus by means of a catheter through the nose and down the pharynx, as reported by Leanderson et al. (pp. 258-261). This method yields only a rough approximation of tracheal pressure, but one that is sufficient to show the general variations in pressure from sound to sound and the general pressure level.

The air flow trace shows the different patterns we have discussed. As predicted by our theoretical model, in the /s/ in "saw" the air flow first rises, then dips as the tongue constriction is made, and rises again as the constriction is released. In the /h/, air flow merely rises and falls. Air flow is quite large in the /h/, reaching almost 2 L/s.

Two types of /p/ occur in the sentence. Because of the stress rules of English, the location of the /p/ in "map again," at the end of a word and before an unstressed vowel, causes it to be produced without much aspiration. This means that not much air flow occurs in the explosion of the /p/. On the other hand, in English, a highly aspirated /p/ occurs when two /p/s adjoin across a word boundary and are followed by a stressed vowel, as is the case with the /p/ in "Hap" and the /p/ of "pat." The expected result is a very high air flow in the release of the /p/ in "pat." Throughout the spoken sentence, air flow during the vowels is very small.

Consonantal air flow can be reduced in the five ways listed above in Figure 4. First, lung pressure can be reduced during consonants. During the aspirated double /p/, there appeared to be no reduction in pressure in the spoken version of the sentence; indeed, pressure rose slightly in preparation for the aspiration of the /p/. (If respiratory driving forces were unchanged, lung pressure would drop during the aspiration, as high air flow deflates the "balloon" formed by the lungs. This pressure decrease can be compensated for by having the respiratory muscles produce a slight increase in pressure before the aspiration.) Aside from such small changes, however, the lung pressure during the spoken voice stayed approximately constant.

Consonantal air flow can also be reduced by using a tighter articulatory constriction, holding the constriction slightly longer to allow more time for the vocal folds to readduct, opening the glottis less, or closing the glottis faster. These other mechanisms can be illustrated by comparing pressure and air flow patterns of the spoken phrase with those of the phrase when sung at the highest pitch range we used, going from D5 up to G5 and then back to D5 (587-784 Hz). The two repetitions of the sung phrase in Fig. 6 show that our findings were consistent and repeatable.
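The frequency range quoted for D5 to G5 can be checked with a short calculation. This is a minimal sketch assuming twelve-tone equal temperament with A4 = 440 Hz, a tuning reference the text does not state; the helper name `note_frequency` is hypothetical.

```python
# Semitone offsets of the natural notes above C within one octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_frequency(name, octave, a4_hz=440.0):
    """Equal-tempered frequency in Hz (assumes A4 = a4_hz)."""
    midi_number = 12 * (octave + 1) + NOTE_OFFSETS[name]
    return a4_hz * 2.0 ** ((midi_number - 69) / 12.0)

for name, octave in (("D", 5), ("G", 5)):
    print(f"{name}{octave}: {note_frequency(name, octave):.1f} Hz")
```

Under that assumption, the two values round to the 587 Hz and 784 Hz limits given above.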



Figure 6 shows that lung pressure was quite high and tended to increase with pitch in singing between D5 and G5. Air flow in the sung consonants is generally not any more than in those during speech, however, despite the higher levels of lung pressure. Indeed, air flow at some points is actually less than during speech. Unlike the spoken example, during the double /p/ between Hap and pat, pressure actually decreased slightly, to reduce air flow. We can hypothesize that the singer unconsciously tried to bring down the pressure slightly during the stop closure in anticipation of the aspiration. If this hypothesis is true, however, she could not bring the pressure down too much, because there is a limit to the speed with which the respiratory musculature can vary lung pressure (4). The rate of variation shown in Fig. 6 is about the fastest at which pressure can be changed.

Although this double /p/ sung in the highest pitch range tested did show a small decrease in pressure which would reduce air flow in the aspiration, similar decreases in pressure did not occur at the lower pitches. Apparently, this technique for reducing air flow occurred only at extremes in lung pressure for this singer. The peak lung pressure in the sung phrase was 40 cm H2O, which is at least four times the pressure the subject uses during conversational speech and significantly more than the peak pressures attained in the lower pitch ranges.
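For readers more used to SI units, the pressure comparison above can be worked through as follows. This is only an arithmetic sketch: the 10 cm H2O figure is the approximate upper bound for conversational speech quoted earlier in the text, used here as an assumed reference value rather than a measurement from this subject.

```python
CM_H2O_TO_PA = 98.0665  # pascals per cm H2O (conventional conversion factor)

def cm_h2o_to_kpa(pressure_cm_h2o):
    """Convert a pressure from cm H2O to kilopascals."""
    return pressure_cm_h2o * CM_H2O_TO_PA / 1000.0

peak_sung_cm_h2o = 40.0      # peak lung pressure in the sung phrase
speech_bound_cm_h2o = 10.0   # assumed upper bound for conversational speech

ratio = peak_sung_cm_h2o / speech_bound_cm_h2o
print(f"peak sung pressure: {cm_h2o_to_kpa(peak_sung_cm_h2o):.2f} kPa")
print(f"ratio to speech bound: {ratio:.1f}")
```

Because conversational pressures are generally below the assumed bound, the resulting ratio is a lower limit, consistent with the phrase "at least four times."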

Other techniques were apparently used to limit air flow during the sung examples. The abduction-adduction gesture at the glottis was much faster in the singing; instead of the 200 ms taken in speech, the time taken in singing was only ~125 ms. This is about as fast as the larynx can open and close (5). In an example of another technique to reduce air flow, with the sung /p/ in "pat," the timing was such that the larynx closed just as the lips were opening, to make a very short aspirated period as compared with the highly aspirated /p/ in the spoken version. This means that the timing during singing was very precise: There appeared to be only just enough aspiration to identify it as a /p/ rather than an unvoiced /b/.

The air flow pattern during the sung /s/ was similar to the pattern during speaking. The subject tended to lose a little more air during the final part of the sung /s/ but air flow was generally not much more than in conversational speech, despite the higher lung pressure. This indicates that a somewhat more constricted articulation was used during singing.



Last, air flow patterns between the sung repetitions were consistent. For example, the laryngeal-articulatory coordination in the two examples of the /s/ in "saw" was comparable. The coordination of the opening gesture of the larynx and the closing gesture of the tongue was almost the same in both repetitions even though the movements were being made at close to a physiologically maximum speed. The coordinations were also similar between repetitions during the /sh/ sequence.

The comparison of patterns made by the oscillations in air flow caused by the vibrato is also very interesting. For example, the vibrato patterns during the final syllables were almost the same between sung repetitions, including the timing of the vibrato with respect to the articulation of the /g/ in "again."

Although vibrato-synchronous oscillations in air flow could also be caused by other factors, such as vertical larynx movement or an ab-adductory component in the vibrato, we believe that a significant component of the strong variations in air flow during the vibrato was caused by the tuning of the lowest vocal tract resonance to match the fundamental frequency of the voice. At the points in the vibrato cycle in which the voice fundamental frequency approaches the vocal tract resonance, a lower air flow occurs; when it moves away from the resonance, a higher air flow occurs. That air flow variations caused by the vibrato may be very large shows how important the tuning factor could be in controlling air flow.

In summary, our research indicates that a soprano singing loudly toward the upper part of her range must use a number of techniques to keep air flow at a level similar to that in speech, even though the lung pressure attained in singing may be more than four times that used in speech. The high pressures are required for the proper production of the vowels, but will "spill over" to the unvoiced consonants because of the limitation on the speed at which the respiratory musculature can vary lung pressure. Thus, both vowels and consonants must be adapted to the higher pressures. It is likely that improper control of air flow can have significant esthetic and/or medical implications.

Acknowledgments: The research discussed here was supported by a research grant from the National Institutes of Health, Bethesda, MD.


1. Schutte H. The efficiency of voice production. Groningen, Netherlands: Kemper, 1980.

2. Rothenberg M. Cosi fan tutte and what it means or nonlinear source-tract acoustic interaction in the soprano voice and some implications for the definition of vocal efficiency. In: Sasaki C, Baer T, Harris K, eds. Vocal fold physiology: laryngeal function in phonation and respiration. San Diego, CA: College-Hill Press, 1986:254-63.

3. Sundberg J. Formant technique in a professional soprano singer. Acustica 1975;32:89-96.

4. Rothenberg M. The breath-stream dynamics of simple-released-plosive production. Bibliotheca Phonetica, 6. Basel, Switzerland: S. Karger, 1968.

5. Rothenberg M. The glottal volume velocity waveform during loose and tight glottal adjustments. In: Rigault A, Charbonneau R, eds. Proceedings of the Seventh International Congress of Phonetic Sciences. The Hague: Mouton, 1972:380-8.
