Chapter 2 Web Topics

2.1 Measuring Sound Pressure


Biologists measure sound pressures of animal signals for one or more of the following reasons:

  • To record and characterize the temporal and spectral patterns in those signals
  • To measure absolute signal intensities
  • To use multiple microphone arrays to monitor locations of sound-producing animals
  • To undertake playbacks to animals to determine the amount or value of the information provided by acoustic signals

In all cases, a sensor called a microphone is used to capture the sound signals and convert pressure waveforms into electrical waveforms. The electrical waveforms are then fed into measuring devices or recorded for later analysis. We examine each of these steps and approaches briefly below.

Microphone types

Broadly defined, a microphone is any sensor that responds to ambient variations in sound pressure and converts these variations into an electrical analogue that can be measured or stored. An ideal microphone would respond equally to all frequencies over a broad range, to all amplitudes, and to even the most rapidly varying waveforms. No microphone meets all of these requirements perfectly, and none works equally well for air, water, or substrate recording. One thus needs to match the microphone to the context and the task. Some basic types of microphones currently used for monitoring animal sounds include:

  • Condenser microphones: A condenser (or capacitor) microphone consists of two metal plates arranged parallel to each other. The plate that faces the sound source must be very thin, and is often a sheet of plastic with an ultra-thin layer of metal coated onto one side. This is called the “diaphragm” of the microphone. The other plate can be thicker. An insulating ring filled with air separates the two plates and allows slow equalization with ambient pressure through a tiny hole. A voltage is then applied across the plates, causing electrons to aggregate on one plate and to become scarce on the other (relative to when no voltage is present). The excess of electrons on one plate and the shortage on the other depend on the voltage applied and the distance between the two plates. When a propagating sound arrives at the microphone, the pressure of the air inside the ring separating the plates stays relatively constant at ambient levels, whereas the pressure outside the diaphragm rises and falls with the sound waves. When the exterior pressure is greater than that inside, the diaphragm is bent into the ring cavity and closer to the second plate; when the exterior pressure drops below that inside the cavity, the diaphragm is bent outwards and away from the other plate. As the distance between the diaphragm and the second plate varies, the electric field between them varies, causing the number of electrons on each plate to change. Electrons thus move into or out of the plates, and this current can be detected as a varying voltage across a resistor in series with the voltage source. The result is an electrical replica of the sound pressure variation that can then be stored, measured, and analyzed. Condenser microphones tend to have very low electronic self-noise levels and respond similarly to all frequencies over a broad range extending into the ultrasonic region.
The use of very thin, small-diameter diaphragms improves the response of the microphone to sounds with sudden transients and minimizes overshooting and ringing effects. These microphones do, however, require a carefully regulated source of polarizing voltage for the two plates, and may not function properly in high humidity. Recent designs instead use the output of the condenser to modulate a radio-frequency (RF) carrier signal rather than feeding the output directly into amplifiers. These RF designs yield extremely low-noise, flat-response, and relatively humidity-insensitive microphones that are now very popular among terrestrial field recordists. They do not work under water because the incompressibility of water limits the range of movement that would be induced in the diaphragm.
  • Electret microphones: Electret microphones work on the same principle as a condenser microphone but differ in that the thin membrane responding to sounds consists of a material that is permanently polarized electrically (in a manner similar to a magnet that is permanently polarized magnetically). No additional voltage source is thus required to detect movements of the membrane in a sound field. Unfortunately, the greater thickness required for the diaphragm slows down its movements and limits its responsiveness to higher frequencies. Newer models place the polarized material on the back plate and use the same metalized diaphragm found in condenser microphones. This makes their frequency range as good as that for a condenser microphone. Electrets are cheap to produce, and can be made very small. Some electrets are sensitive to high humidity. Electret and condenser microphones are the two most widely used types of sensors for monitoring of terrestrial animal sounds.
  • Piezoelectric microphones: This type of microphone consists of a crystal or similar material that generates a voltage when accelerated. Ceramic phonograph cartridges are a historical example. Modern uses include accelerometers (which measure the rate at which a solid substrate changes speed in a given direction when vibrating) and hydrophones (microphones used to monitor sounds under water). Piezoelectric microphones do not require a polarizing voltage, but do require amplification given the low electrical signals generated. A related device called a piezoresistive accelerometer changes its resistance as a result of pressure variation and, given an exterior source of voltage, can monitor very low frequencies propagating in solid substrates.
  • Dynamic microphones: In a typical dynamic microphone, a wire coil is attached to the inside surface of a thin plastic diaphragm. When sound waves force the diaphragm to oscillate in and out of the microphone cavity, the coil passes back and forth over a magnet generating a varying electrical current in the coil that emulates the pressure variations. Alternatively, the magnet is attached to the diaphragm and sound waves move it into and out of a fixed coil. Many studio microphones have this dynamic design. A limitation is that attaching the diaphragm to a coil or magnet limits the diaphragm to slower movements and thus lower frequencies than is possible with condenser and electret microphones. In addition, the signal generated is usually very small, requiring significant amplification before usage. On the other hand, dynamic microphones do not require a polarizing voltage, and are completely immune to humidity problems.
  • Ribbon microphones: Ribbon microphones, like dynamic microphones, produce an electric signal by letting sound waves vary the position of a metallic object in a magnetic field. However, they do this without a diaphragm by suspending a thin corrugated strip of metal between magnets. The microphone casing is open on both ends and the ribbon is set into motion by the difference in sound pressures at the two ends of the casing. As the ribbon moves in the magnetic field, it generates electrical currents that are analogues of the variations in pressure gradient at the two ends of the casing. Ribbon microphones tend to be very fragile because the supporting attachments of the ribbon must be small to maintain ribbon mobility. This limits their utility in field situations.
  • Laser vibrometers: These devices bounce a laser beam off of a vibrating surface and compare the outgoing and reflected frequencies of light. When the surface is moving towards the laser source, the reflected light is Doppler-shifted (see text for definition) to a higher frequency than that emitted; when the surface is moving away from the source, the reflected light is Doppler-shifted to a lower frequency. The device monitors these light frequency deviations over time and thus can track the changing velocity of the surface as it moves repeatedly towards and then away from the laser vibrometer. This is largely a laboratory tool as the measured surface must be relatively immobile except for the vibrations. However, it has been used successfully to measure vibrations on the surfaces of frog vocal sacs and ears, and the substrate signals of leafhoppers on plant leaves and stems.
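The charge-and-gap relationship that drives a condenser microphone can be sketched numerically. All values below (plate area, gap, and bias voltage) are illustrative assumptions, not the specifications of any real microphone:

```python
# Sketch of the condenser-microphone principle: a parallel-plate capacitor
# holds charge Q = C*V with C = eps0*A/d, so when sound moves the diaphragm
# (changing the gap d), charge must flow on or off the plates.
EPS0 = 8.854e-12          # permittivity of free space (F/m)
PLATE_AREA = 1e-4         # assumed 1 cm^2 diaphragm (m^2)
BIAS_VOLTAGE = 48.0       # assumed polarizing voltage (V)

def plate_charge(gap_m):
    """Charge stored on the plates for a given plate separation."""
    capacitance = EPS0 * PLATE_AREA / gap_m
    return capacitance * BIAS_VOLTAGE

q_rest = plate_charge(20e-6)        # assumed 20-micron resting gap
q_compressed = plate_charge(19e-6)  # diaphragm pushed 1 micron inward
# A smaller gap stores more charge, so electrons flow in; the resulting
# current through a series resistor is the electrical replica of the sound.
```

The difference `q_compressed - q_rest`, repeated at the sound's frequency, is the tiny current the microphone's electronics amplify.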

More information: Details on these and other kinds of microphones can be found at:

Microphone directionality

Microphones differ dramatically in their directionality. An omnidirectional microphone picks up sounds equally from all directions. For measurements of ambient noise levels, say in a tropical forest, this is a desirable feature. However, when recording individual animals, ambient noise can significantly reduce the utility of the resulting recordings. Field recordists thus favor more directional devices that maximize the signal generated directly in front of the microphone and reduce the amplitudes of signals arriving from other directions. There are several options available to increase directionality:

  • Cardioid microphones: These microphones have a sensitivity (polar) pattern that is heart-shaped (cardioid): sounds coming from in front of the microphone are strongly favored and sensitivity drops off steadily as the angle of the sound source moves to the sides of the microphone. Sensitivity is least for sounds behind the microphone. Most commercial microphones have a cardioid sensitivity pattern.
  • Ribbon microphone: Because a ribbon microphone measures the pressure difference at the two ends of its casing, it is inherently bi-directional. Sounds that are located to the sides of the microphone axis tend to arrive at the two casing ends with the same amplitude and phase and thus generate no movement in the ribbon. Sounds originating at either end of the microphone produce the strongest response. While they can be highly directional, ribbon microphones are often too fragile to be of broad use in field recording.
  • Hyper-cardioid (shotgun) microphones: These consist of long hollow tubes with slits cut at intervals along the length and a condenser or electret sensor at the end of the microphone furthest from the sound source. The microphone is most sensitive to sound sources located along the main axis of the tube and very insensitive to sound sources to either side. It has a second, but lower, peak of sensitivity for sound sources behind the microphone. These microphones work by canceling out sounds at the sensor that arrive both directly from the opposite end of the tube and from the slits in the side. There is no amplification of arriving signals, but only elimination of lateral noise. Shotgun mikes are widely used by field biologists to record animal sounds with moderate to high signal-to-noise ratios.
  • Parabola and microphone combinations: Another way to obtain directionality is to aim a large metal or plastic parabolic reflector at the sound source and record the reflected and focused sound waves with a microphone placed at the parabola’s focal point. This combination has the advantage over shotgun microphones that it amplifies sounds originating along the main axis of the parabola (without adding concomitant electronic noise) while excluding sounds from other directions. It has the disadvantage that it becomes highly frequency dependent when the wavelengths of the sounds of interest are as large as or larger than the parabola. Given the weight and width of very large parabolas, this method has largely been used to record small to moderately sized animals (which tend to produce higher-frequency sounds than large animals).
  • Substrate recording: A host of recent studies have examined how sounds produced by arthropods propagate inside plant leaves and stems. One difficulty with such recordings is that a single microphone placed on a plant part may not be at a location in which bending and other internal waves cause detectable surface vibrations. Recent work suggests that placing at least two microphones at right angles is more likely to capture any propagating waves. The article “A method for two-dimensional characterization of animal vibrational signals transmitted along plant stems” in the Journal of Comparative Physiology A has details (McNett, G. D., R. N. Miles, D. Homentcovschi, and R. B. Cocroft. 2006. 192: 1245–1251).
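The frequency dependence of a parabola noted above follows from a simple rule of thumb: the dish only provides useful gain for wavelengths smaller than its diameter. A minimal sketch, with assumed dish sizes:

```python
# Approximate lower cutoff of a parabolic reflector: the frequency whose
# wavelength equals the dish diameter. Below this, the dish provides little
# directional gain. Dish diameters here are illustrative assumptions.
SPEED_OF_SOUND_AIR = 343.0  # m/s at roughly 20 degrees C

def lowest_effective_frequency_hz(dish_diameter_m):
    """Frequency at which wavelength equals dish diameter."""
    return SPEED_OF_SOUND_AIR / dish_diameter_m

portable = lowest_effective_frequency_hz(0.5)  # 0.5 m dish: ~686 Hz cutoff
large = lowest_effective_frequency_hz(2.0)     # 2.0 m dish: ~172 Hz cutoff
# This is why parabolas suit high-frequency callers (songbirds, insects)
# far better than low-frequency callers such as elephants.
```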

Sound recorders

Biologists often want to record the waveforms of animal sounds for later pattern analysis. For decades, this meant moving some magnetic medium over a metal recording head that converts the electrical signals from microphones or microphone amplifiers into a reasonably permanent magnetic record on the medium. Running the medium back over a playback head at a later time then reconverts the magnetic record into electrical signals for visualization and analysis. The earliest magnetic recorders used spools of metal wire, but these were soon displaced by magnetic tape. The early 1950s saw the introduction of the first field-portable tape recorders, and this began a golden age of animal sound recording. Reel-to-reel tapes were replaced by cassette tapes in the 1970s, which enhanced portability even further. Up to this point, waveforms were recorded in analog fashion on the magnetic materials. However, in the mid-1980s, digital magnetic tapes (R-DATs) became available and quickly replaced the analog cassette as the preferred recording format. While digitizing discards tiny segments of the original waveforms, the losses are usually undetectable by most human ears at commercial sampling rates (44.1 kHz or 48 kHz), and digitizing at even higher rates can be used to eliminate detectable losses for species such as katydids and bats as well. The digital environment provides the opportunity to store a much broader range of amplitudes accurately than affordable analog systems. Digital recordings can also be copied and cloned without loss of information, whereas analog magnetic copies degraded with each generation and the media themselves often decomposed over time. Digitization has also now solved one additional problem with field recording: as long as some medium has to be moved physically to make a recording, there is always a chance for dirt, humidity, or insects to gum up the working parts.
Today, modern digital recorders can store sounds on a flash or other solid memory device without moving a single part. This is an enormous advance and modern field recorders are now remarkably robust. See below for links to a more detailed review of available recording devices.

Sound level meters

Biologists interested in the pattern structure of a sound are mainly interested in the relative variation in sound pressure. Put another way, they are interested in the shape of a recorded waveform, and not in the absolute amplitudes. However, there are times when measuring the absolute values of the sound pressures is essential to the study. Examples include comparisons of maximal sound amplitude among displaying males, attenuation of animal sounds in different habitats, and measures of ambient noise that signaling animals must exceed to be heard. Measuring absolute sound pressures requires a calibrated device called a sound level meter. Consider that arriving sound pressures are usually converted into physical movements by a microphone diaphragm. These movements are then converted into electrical signals that in turn are likely to be amplified before they can be read from the position of a needle on a meter. Each step of this process involves a conversion of one replica of the waveform into another. At best, each new replica is a proportional (linear) version of the prior one and all frequencies in the sound are treated equally. However, that is not always possible, and most microphones have some frequency dependence. Even if successive versions are proportional copies, the proportionality constant is likely to be different for different steps. In short, the only way to determine what a given needle reading means is to calibrate the entire system against sounds of different frequencies and known pressure amplitudes.

Sound level meters record the amplitude of a signal in decibels relative to some standard. The scale is thus logarithmic. See Web Topic 2.3 for definitions and alternative standards. Available models differ in how accurately they are calibrated and how often they must be recalibrated. The most accurate are Type I meters which tend to be quite expensive. Most environmental noise standards allow for measurements at the Type II level, a lower accuracy, and this is often sufficient for biological measurements in the field. While most sound level meters assume that one is making a measurement in the far field, where sound intensity can be measured completely by measuring pressure, some meters allow for measurements in the near field where both pressure and medium velocity contribute to overall sound intensity. These are naturally more expensive than the standard far field devices.

Most sound level meters have the microphone sensor located at the end of a pointed or tubular end of the meter. This is designed to minimize reflections from the body of the meter that might interfere with measurements at the sensor. Sound level meters that work with hydrophone sensors or that measure acceleration due to substrate vibrations may have the sensor located on an even longer probe. The electronics and meter needle in most meters can be adjusted to respond slowly (thus reducing the potentially over-riding effects of transients in the sound) or instead to record the peak value of an impulse sound by having the needle “stick” at the highest value detected. Most models also provide both “flat” (unaltered) and alternative (A, B, and C) “weightings” that convert absolute pressure levels into values that more accurately match the sensitivity of the human auditory system. Meters may use either analog or digital processing of the incoming sounds. Analog meters tend to be easier to use, but digital meters allow for more accurate measurements over a wider range of amplitudes. Note that reported sound level measurements must specify the distance from the sound source at which the measurement was taken: because sound attenuates as it propagates, this distance will critically affect the values recorded.

Recently, applications have been developed to use smartphones as sound level meters. See also: Mennill, D. J. and K. M. Fristrup. 2012. Obtaining calibrated sound pressure levels from consumer digital audio recorders. Applied Acoustics 73: 1138–1145.

Microphone array recording

A final type of sound recording task is to identify where a sound source is located. In open environments, this can sometimes be accomplished visually by looking for concurrent physical movements of the sound-producing animal. However, if animals do not show visible movements when vocalizing, it can be challenging to determine who made which sound. While a human observer may be able to determine the direction and distance to a sound source acoustically, this is really only practical for high-frequency sounds. Trying to decide by ear which elephant in a study site emitted a low-frequency rumble is usually impossible. Under water or in dense forest, there may be no visible cues at all about who made a sound. Assigning specific sounds to specific animals has become of great importance as researchers tackle networking: the interactions between local assemblages of animals. When songbirds countersing, which bird makes which song, and do other neighbors enter into the exchange? Do singing humpback whales match each other’s song themes? Which birds contribute to a dawn chorus, and is there any synchrony in the process? These and other problems require accurate assignment of sounds to specific individuals.

The current availability of laptop computers has solved this problem by allowing one to deploy an array of microphones at known locations and record from all microphones simultaneously. Microphones can be connected to the computer by cables or radio links. Alternatively, one can distribute multiple automatic recording units (ARUs) that record sounds simultaneously and use satellite signals to embed timing information for later synchronization. However recorded, the multichannel recordings are then analyzed with software that uses the arrival times of any given sound at the separate microphones to compute the location of its source. If the animals are sufficiently territorial, caller identities can then be determined from the locations. If they are not, concurrent videos of animal positions or other visual information can be used to identify which animal was in a given location at that time.
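The arrival-time logic described above can be sketched as a toy time-difference-of-arrival (TDOA) search. The microphone positions, source position, and brute-force grid solver below are illustrative assumptions; real array software uses finer grids or closed-form estimators:

```python
import itertools
import math

SPEED = 343.0  # speed of sound in air (m/s)

def arrival_times(src, mics):
    """Time for sound from src to reach each microphone."""
    return [math.dist(src, m) / SPEED for m in mics]

def locate(observed, mics, extent=10.0, step=0.05):
    """Grid-search the source position whose predicted arrival-time
    differences (mic 0 as reference) best match the observed ones."""
    obs_diffs = [t - observed[0] for t in observed]
    best, best_err = None, float("inf")
    n = int(extent / step)
    for i, j in itertools.product(range(n + 1), repeat=2):
        cand = (i * step, j * step)
        pred = arrival_times(cand, mics)
        pred_diffs = [t - pred[0] for t in pred]
        err = sum((a - b) ** 2 for a, b in zip(pred_diffs, obs_diffs))
        if err < best_err:
            best, best_err = cand, err
    return best

# Four assumed microphones at the corners of a 10 x 10 m plot:
mics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_source = (3.0, 7.0)
estimate = locate(arrival_times(true_source, mics), mics)
```

Only the *differences* in arrival times are used, mirroring field practice: absolute emission time is unknown, but synchronized channels give relative delays.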

Some recent descriptions of this method and its use:

  • Basic logic: Spiesberger, J. L. and K. M. Fristrup. 1990. Passive localization of calling animals and sensing of their acoustic environment using acoustic tomography. American Naturalist 135: 107–153.
  • Recent publications on method:
    • Mennill, D. J., J. M. Burt, K. M. Fristrup, and S. L. Vehrencamp. 2006. Accuracy of an acoustic location system for monitoring the position of duetting tropical songbirds. Journal of the Acoustical Society of America 119: 2832–2839.
    • Mellinger, D. K., K. M. Stafford, S. E. Moore, R. P. Dziak, and H. Matsumoto. 2007. An overview of fixed passive acoustic observation methods for cetaceans. Oceanography 20: 36–45.
    • Mennill, D. J. 2011. Individual distinctiveness in avian vocalizations and the spatial monitoring of behaviour. Ibis 153: 235–238.
    • Blumstein, D. T., D. J. Mennill, P. Clemins, L. Girod, K. Yao, G. Patricelli, J. L. Deppe, A. H. Krakauer, C. Clark, K. A. Cortopassi, S. F. Hanser, B. McCowan, A. M. Ali, and A. N. G. Kirschel. 2011. Acoustic monitoring in terrestrial environments using microphone arrays: applications, technological considerations, and prospectus. Journal of Applied Ecology 48: 758–767.

Equipment reviews

The technology for recording animal sounds keeps changing rapidly and there are many different suppliers and options available. A site that provides a free and neutral overview of relevant equipment is provided by the Macaulay Library at the Cornell University Laboratory of Ornithology (

2.2 Visualizing Sound Waves


Thanks to recent efforts to promote science, technology, and mathematics education and to the rapid growth of the Internet, there are currently many informative websites devoted to visualizations of the principles of physics, and of particular interest here, the principles of acoustics. For our examples, we redirect you to “Acoustics and Vibration Animations” by Dr. Dan Russell of Pennsylvania State University:


To get a second view, many of these phenomena are also animated at “Sound Waves” from the Institute of Sound and Vibration Research (ISVR) at the University of Southampton at

Molecular movements in different kinds of sound waves

At “Longitudinal and Transverse Wave Motion” by Dr. Dan Russell of Kettering University, the movements of individual molecules can be seen at the same time that a particular type of sound wave is propagating. In each example, fix your eye on one molecule and watch how it moves as it helps propagate the wave past its own immediate region. The examples cover:

  • Longitudinal sound waves
  • Transverse sound waves
  • Water surface waves
  • Ground surface (Rayleigh) waves

Patterns of sound wave interference

When two waves in air or water of different frequency or phase arrive at the same location, their effects are additive. If they are in phase, the sum can be greater than either alone; if they are out-of-phase, they might cancel each other. Two waves that are similar in amplitude and only slightly different in frequency create a sum with a regular variation in amplitude called beats. A useful website is “Superposition of Waves” by Dr. Dan Russell of Kettering University, at
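The beat phenomenon can be sketched by summing two assumed tones a few hertz apart and sampling the sum near an envelope peak and near an envelope null:

```python
import math

# Two equal-amplitude tones 4 Hz apart (both frequencies are arbitrary
# assumptions). Their sum has a carrier near 442 Hz whose envelope waxes
# and wanes |F1 - F2| = 4 times per second: the "beats".
F1, F2 = 440.0, 444.0

def summed(t):
    return math.sin(2 * math.pi * F1 * t) + math.sin(2 * math.pi * F2 * t)

def peak_amp(t0, t1, n=2000):
    """Largest absolute sample of the summed wave on [t0, t1]."""
    return max(abs(summed(t0 + (t1 - t0) * k / n)) for k in range(n + 1))

loud = peak_amp(0.0, 0.01)    # near an envelope maximum: tones in phase
quiet = peak_amp(0.12, 0.13)  # near the null at 1 / (2 * |F1 - F2|) s
# loud approaches 2 (amplitudes add); quiet is near 0 (amplitudes cancel).
```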

Sound at boundaries

Depending upon the relative acoustic impedances of two adjacent media, sound traveling in one may be reflected or refracted when it encounters the boundary between them. Reflected energy stays in the original medium but travels at a new angle; refracted energy passes into the second medium, and also changes its direction of travel. For reflection look at For refraction, go to

Scattering and diffraction

When sound waves propagating in a medium encounter either a boundary with a hole in it or an object in the medium, their trajectories will be altered. The way in which the altered waves move depends largely on the relative sizes of the hole or object and the sound wavelengths. Go to “Diffraction of sound waves around objects” at the Salford Acoustics pages at

Doppler shifts

Doppler shifts are changes in apparent frequency when either the sender or the receiver of a sound signal is moving rapidly with respect to the other. See the website “The Doppler Effect and Sonic Booms” by Dan Russell (Kettering University), at
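For a stationary listener and a source moving directly along the line between them, the Doppler relation can be sketched as follows; the example frequency and speed are assumptions for illustration:

```python
# Doppler shift for a moving source and stationary observer:
# f_observed = f_emitted * c / (c - v) when approaching,
# f_observed = f_emitted * c / (c + v) when receding.
C_AIR = 343.0  # speed of sound in air (m/s)

def doppler_shift(f_emitted, source_speed, approaching):
    denom = C_AIR - source_speed if approaching else C_AIR + source_speed
    return f_emitted * C_AIR / denom

# An assumed 1000 Hz call from a source moving at 10 m/s:
toward = doppler_shift(1000.0, 10.0, True)   # ~1030 Hz: shifted upward
away = doppler_shift(1000.0, 10.0, False)    # ~972 Hz: shifted downward
```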

Sound fields

The distribution of sound pressures around a sound source differs depending upon the type of sound source and whether there are reflecting surfaces or objects nearby. A good website is “Sound Fields Radiated by Simple Sources” (Dan Russell, Kettering University); it covers monopoles, dipoles, and quadrupoles.

Resonance and filtering

Most structures and enclosed cavities tend to oscillate at particular frequencies determined by their shapes, dimensions, and acoustic impedances. Resonant frequencies are those that “fit” into the structure and can even build up in amplitude over successive cycles; filtered frequencies are those that do not fit well into the structure or whose reflected versions cancel out the initial versions.

The resonant frequencies for a structure are also called its natural modes. Look at these examples of natural modes for:
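As a minimal illustration of natural modes, consider an air column open at both ends, whose modes fall at fₙ = n·c/(2L). The tube length below is an arbitrary assumption; real animal resonators (vocal tracts, burrows, shells) have far more complicated geometries:

```python
# Natural modes of an idealized air column open at both ends:
# the nth mode fits n half-wavelengths into the tube, so f_n = n * c / (2 * L).
C_AIR = 343.0  # speed of sound in air (m/s)

def natural_modes_hz(tube_length_m, n_modes=3):
    """First n_modes resonant frequencies of an open-open tube."""
    return [n * C_AIR / (2 * tube_length_m) for n in range(1, n_modes + 1)]

modes = natural_modes_hz(0.17)  # an assumed 17 cm tube
# The modes form a harmonic series: each is an integer multiple of the first.
```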

2.3 Quantifying and Comparing Sound Amplitudes


Sound pressures can vary from levels that are little more than molecular noise to those generated by massive explosions. This poses a challenge to animals, which may need to detect both the tiny sounds of an approaching predator and the roars of threatening competitors. A related problem applies to how animals compare sounds: it is often more useful to compare the ratio of two sound amplitudes than to compute their absolute difference. Most animals solve these problems by using a logarithmic scale to measure sound amplitudes and by using ratios to compare sounds (see Web Topic 8.6 on Weber’s Law for additional discussion of this scaling and its consequences). It was thus natural for scientists to measure and compare sound amplitudes using the logarithm of the ratio of amplitudes. Here, we define this scale in more detail, show how it is typically invoked, and provide some reference values to keep in mind when using it.

Definition of the decibel

Microphones usually measure sound pressure variation or some time derivative of the pressure variation. However, pressure represents only part of the overall power carried by a sound wave; the rest is carried by the velocity of medium particles. The power (energy flow) of a sound wave is the product of pressure and particle velocity. Because the ratio of pressure to velocity changes, even if the power stays constant, as a sound signal moves from the solid vibratory organs of a sender into the propagating medium and then back into the hearing organs of the receiver, scientists selected power as the appropriate basis for measuring and comparing sound amplitudes. In many cases, the power per unit of sampling area, called intensity, is the actual measure used.

Relative amplitudes of two sounds are thus measured by taking the logarithm (to the base 10) of the ratio between their powers or intensities. In the past, the resulting unit was called the bel. However, because this did not provide a fine enough scale, it has since been replaced by a second unit, the decibel (abbreviated dB), which equals one-tenth of a bel. For two sounds with intensities I1 and I2 respectively, their relative amplitudes in dB would be

Relative amplitude (dB) = 10 log10 (I1/I2)

Whichever sound is used in the denominator of the ratio is considered the reference value. Thus, if the denominator sound had twice the intensity of the numerator sound, the ratio would be 0.5 and the numerator sound would be said to be 3 dB lower (–3 dB) in intensity than the denominator sound. If the numerator sound had twice the intensity of the denominator, then one would say that the numerator was 3 dB higher in intensity than the denominator. Note that over much of the range of frequencies that humans can hear, 1 dB is very close to the minimal difference in amplitude that is necessary before a human will judge one sound to be louder than another. This amounts to about a 26% difference in intensities.
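The decibel definition above is easy to verify numerically; the ratios below are the examples from the text:

```python
import math

def db_from_intensity_ratio(i1, i2):
    """Relative amplitude in dB = 10 * log10(I1 / I2)."""
    return 10 * math.log10(i1 / i2)

half = db_from_intensity_ratio(1.0, 2.0)    # numerator half as intense: ~ -3 dB
double = db_from_intensity_ratio(2.0, 1.0)  # numerator twice as intense: ~ +3 dB
jnd = db_from_intensity_ratio(1.26, 1.0)    # a ~26% intensity difference: ~ 1 dB
```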

Since most microphones only measure pressure, how can we use decibels to compare sounds? If one is making the measurement sufficiently far away from the sound source to be in the far field (see text for details), the local velocity of the medium particles is proportional to the local pressure and the proportionality constant is the reciprocal of the acoustic impedance. Intensity (I) then depends on pressure (p) and acoustic impedance (z) as

I = p × (p/z) = p²/z.

If two sounds are measured in their respective far fields in the same medium, their relative amplitudes can then be written as

Relative amplitude (dB) = 10 log10 (p1²/p2²) = 20 log10 (p1/p2)

Thus, at least in the far field and in the same medium, we can use pressure measurements to compute relative amplitudes. The difference is that the coefficient of the log is 20 for pressure, but only 10 for intensity. Note that if the ratio of intensities of the two sounds is 0.5, the ratio of their pressures will be √0.5 = 0.707, but the relative amplitude will still be –3 dB. Similarly, if the ratio of the two intensities is 2, the ratio of their pressures will be √2 = 1.41, and their relative amplitude will again be 3 dB.
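A quick numerical check, under the far-field assumption above, that the 20-log pressure form agrees with the 10-log intensity form:

```python
import math

def db_from_pressure_ratio(p1, p2):
    """Far-field relative amplitude in dB = 20 * log10(p1 / p2)."""
    return 20 * math.log10(p1 / p2)

# Halving intensity means the pressure ratio is sqrt(0.5) ~ 0.707,
# and both formulations agree on -3 dB:
from_pressure = db_from_pressure_ratio(math.sqrt(0.5), 1.0)
from_intensity = 10 * math.log10(0.5)
```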

Impulses and averages

Most animal signal waveforms have quite variable amplitudes over time. What measure of pressure should we use to compare amplitudes? One obvious approach is to find the largest maximum or minimum in a sequence of waves and use the absolute value of this peak as the amplitude measure. In some cases, we may be interested in measuring such single spikes of pressure (called impulses): examples include the very short sonar calls of porpoises or the explosive calls of bellbirds. However, more often, a single spike in a waveform is not representative of the amplitude of the rest of the signal and one wants some sort of average pressure measurement. We cannot simply average the recorded sound pressures in a signal because this will only give us the ambient pressure around which a sound wave is oscillating! One solution is to measure the differences in pressure between successive maxima and minima (which should always yield positive numbers and a non-zero average). These are called peak-to-peak measures. However, even this approach could be misleading if peaks are few and far between. A common solution is to sample the waveform at many successive points, compute the difference between the sound pressure at each point and ambient levels, square this value (to eliminate negative differences), and then compute the average of these squared values. The square root of this average is called the root-mean-square (abbreviated rms) and is widely used in statistics and physics to characterize the average amplitude of a time-varying quantity. Because these and other criteria can be used to characterize pressure amplitudes, one should always specify exactly which was used in reporting sound amplitude measurements.
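The rms computation described above can be sketched on a synthetic waveform (an assumed pure tone plus a single impulse, with ambient pressure taken as zero), showing why rms is far less sensitive to an isolated spike than a peak measure:

```python
import math

# Synthetic signal: 1000 samples of a 5-cycle sine wave of amplitude 1.0,
# plus one artificial spike well above the tone's peak.
N = 1000
wave = [math.sin(2 * math.pi * 5 * k / N) for k in range(N)]
wave[100] = 4.0  # a single impulse

# Peak measure: dominated entirely by the one spike.
peak = max(abs(s) for s in wave)

# rms: square each deviation from ambient (here 0), average, take the root.
rms = math.sqrt(sum(s * s for s in wave) / len(wave))
# The pure tone alone has rms ~ 0.707; the spike barely moves it.
```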

Standard measures

Up to this point, we have focused on how to quantify and report the difference in amplitude between two sound signals. What if we want to give some indication of the amplitude of a single sound relative to some standard? If everyone agrees on the standard, this would provide the same benefits as having an absolute measure of sound amplitude.

In air, the standard reference is the lowest power that a young (pre-rock music) human ear can detect. This turns out to be sounds at a frequency between 1 and 3 kHz and a pressure amplitude of 20 × 10–6 pascals. The intensity standard in air is then about 10–12 watts/m2. The scale for underwater sounds is offset from that of sound in air for two reasons. The first is that underwater researchers set a reference pressure of 1 × 10–6 pascals, a value 20 times smaller than the reference pressure for air. Given this lower level of reference, the same pressure recorded in both media would be accorded a 26 dB higher value in water than it would in air. Second, it takes much less power to create a high pressure in water than in air because water’s characteristic acoustic impedance is 3500 times higher than air’s (see text for definitions). Thus the decibel value comparing a sound pressure in water to one in air will be inflated by an additional 35.5 dB. Overall, measurements of sound amplitudes in water will be inflated by 35.5 + 26 = 61.5 dB relative to an equivalent measurement in air. Thus an underwater measurement of 170 dB for a sperm whale’s sounds would be equivalent in power to an airborne sound of 109 dB. This is still very loud, but much less than a 170 dB sound in air (a value commensurate with a nearby volcanic eruption). Studies of sounds propagated in solid substrates tend to use the same reference as for water, but the even higher acoustic impedance of solids requires a different adjustment than the 35.5 dB value used in water studies.
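The bookkeeping for the water-to-air comparison can be captured in a few lines. This sketch simply applies the rounded 26 dB and 35.5 dB offsets stated above; it is an illustration of the arithmetic, not a general media-conversion tool:

```python
# Offsets from the text: ~26 dB because the underwater reference pressure
# (1 uPa) is 20 times smaller than the air reference (20 uPa), and
# ~35.5 dB because water's characteristic acoustic impedance is far higher.
REFERENCE_OFFSET_DB = 26.0
IMPEDANCE_OFFSET_DB = 35.5

def water_db_to_air_equivalent(db_re_1upa):
    """Air dB (re 20 uPa) carrying the same power as a given
    underwater level in dB re 1 uPa."""
    return db_re_1upa - REFERENCE_OFFSET_DB - IMPEDANCE_OFFSET_DB

# The sperm whale example from the text:
print(water_db_to_air_equivalent(170))  # -> 108.5, i.e. ~109 dB in air
```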

Each research community is firmly committed to its own standards, and there is little interest in adopting a common scale across media. As a result, it is very important to specify the reference level and medium when publishing standardized measurements of sound amplitude. Many workers append the expression “SPL” (for sound pressure level) as a shorthand for using the standard air reference. However, there has been a recent call for researchers to give the actual reference in all cases (e.g., “34.5 dB re 20 μPa in air”).

Finally, it is important to remember that sound waves attenuate as they propagate away from the sound source. Spreading losses alone (see text for details) will decrease the recorded sound pressure by 6 dB for each doubling of the distance between sample points. Heat losses and scattering will add additional attenuation. As a result, standardized measurements of sound amplitude are usually accompanied by both the reference level and the distance at which the sound was measured (e.g., “34.5 dB re 20 μPa at 2 m in air”).
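Spreading loss alone is simple to compute (a sketch only; heat loss and scattering would add further, frequency-dependent attenuation on top of this):

```python
import math

def spreading_loss_db(d1, d2):
    """Spherical-spreading attenuation, in dB, going from distance d1 to
    distance d2 from the source: 20*log10(d2/d1), i.e. ~6 dB per
    doubling of distance."""
    return 20 * math.log10(d2 / d1)

print(round(spreading_loss_db(1, 2), 2))  # -> 6.02 (one doubling)
print(round(spreading_loss_db(2, 8), 2))  # -> 12.04 (two doublings)
```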

As a general guide to the scale for airborne sounds, here are some typical values of sound amplitudes (all re 20 μPa) covering the range we are likely to encounter:

Jet engine at 30 m: 150 dB
Jackhammer at 2 m: 100 dB
Singing birds at 2 m: 80–90 dB
Street traffic: 70 dB
Human conversation: 65 dB
Quiet restaurant: 50 dB
Whisper: 20 dB
Rustling leaves: 10 dB
Threshold of human hearing: 0 dB

Moving between standard and relative measures

One nice feature of using ratios for amplitude measurements is that a comparative measure can be obtained for two sounds by subtracting the standardized measure of one from the standardized measure of the other. This is equivalent to replacing the reference value in the ratio of one of the standardized measures with the second sound amplitude. For example, suppose two animal signals are recorded at the same distance from each source and one obtains a standardized level of 35 dB re 20 μPa for one and 45 dB (using the same reference and medium) for the other. The appropriate comparison between the amplitudes of the two sounds is simply 45 dB – 35 dB = 10 dB. Given the definition of a decibel, this means that one signal has a pressure 3.2 times greater than the other. One cannot, of course, go the other way: knowing that two sounds differ by 10 dB does not provide sufficient information to reconstitute the standardized values of the two original sounds.
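Converting a dB difference back to a pressure ratio is a one-liner (the function name is our own):

```python
def pressure_ratio_from_db(delta_db):
    """Pressure ratio implied by a difference of delta_db decibels."""
    return 10 ** (delta_db / 20)

# The 45 dB signal vs. the 35 dB signal from the example above:
print(round(pressure_ratio_from_db(45 - 35), 1))  # -> 3.2
```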

One should also be careful about adding decibel values. If one sound has a standardized value of 50 dB and the second has a standardized value of 10 dB, their presence in the same place at the same time is not the sum of these two values. The first sound has a pressure 100 times greater than the second; having them present at the same time, even if they interfered constructively, would still produce a sound with an amplitude only slightly greater than 50 dB.
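A sketch of how combined levels can actually be computed, assuming the two sounds are incoherent so that intensities, not dB values, add (perfectly coherent constructive interference would add pressures instead, giving a slightly larger result):

```python
import math

def combine_levels_db(db_a, db_b):
    """Level of two incoherent sounds present together: convert each dB
    value to a relative intensity, add the intensities, convert back."""
    total_intensity = 10 ** (db_a / 10) + 10 ** (db_b / 10)
    return 10 * math.log10(total_intensity)

# The example from the text: 50 dB + 10 dB is barely above 50 dB, not 60 dB
print(round(combine_levels_db(50, 10), 3))  # -> 50.0
```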

Other scales

Because humans (and most animals) are not equally sensitive to all frequencies, a number of psychophysical and perceptual scales have been derived for humans that correct for these sensitivity differences. The phon scale identifies the amplitude of a 1 kHz standard sound in dB that is perceived as equivalent in loudness to a test sound at another frequency. The less sensitive the human ear is to that frequency, the greater the amplitude of the 1 kHz sound must be to be perceived as equivalent. If a test sound was perceived as equivalent to a 70 dB 1 kHz standard, it would be assigned a perceptual amplitude of 70 phons. The sone scale is based on the number of phons required to double the perception of loudness. A sound estimated to be 40 phons equals one sone; each additional 10 phons doubles the number of sones, and each decrease of 10 phons halves the sone score. Perceptual scalings can be, and have been, derived for certain animals in the laboratory. However, they invariably differ from those of humans and from each other.
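The phon-to-sone rule stated above can be written as a one-line function. This is a sketch of the stated rule only; real loudness standards (e.g., ISO 532) are considerably more involved:

```python
def sones_from_phons(phons):
    """Sone value from the simple rule: 40 phons = 1 sone, and every
    change of 10 phons doubles (or halves) the loudness in sones."""
    return 2 ** ((phons - 40) / 10)

print(sones_from_phons(40))  # -> 1.0
print(sones_from_phons(50))  # -> 2.0
print(sones_from_phons(30))  # -> 0.5
```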

More information

The Web is full of useful sites discussing sound amplitude measurements, both for the dB and psychophysical scales. We leave it to the reader to track down current ones: Wikipedia is always a great place to start.

2.4 Fourier Analysis of Animal Sounds

The challenge

Animals can produce quite complicated sounds. If we record such a sound and examine its waveform, there are a number of measurements that we can make and compare with other sounds. However, it is extremely difficult to describe the shape of that waveform in ways that allow us to compare it with sounds of other species, or even other sounds of the same species. We thus need a systematic and quantitative method for describing and comparing different animal sounds. The comparison of waveforms alone is not sufficient. What else can we recruit to the task?

The solution

Fourier’s Theorem states that we can decompose nearly any periodic (regularly repeating) signal into a set of pure sine waves. By recording the frequency, amplitude, and relative phases of these waves, we can provide a full and quantitative description of the sound which can be compared to the Fourier decomposition of any other sound. Although animal sounds are rarely entirely periodic, we can usually get around this constraint by breaking a sound into segments that are sufficiently periodic to perform a Fourier decomposition on each one and then string the results together into a spectrogram or other composite graph.
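The theorem can be illustrated numerically. The sketch below uses a naive discrete Fourier transform (pure Python, no external libraries; function name and scaling convention are our own) to recover the frequencies and amplitudes of a periodic signal built from two sine components:

```python
import math, cmath

def dft_amplitudes(samples):
    """Naive discrete Fourier transform: amplitude of each sine
    component at integer frequencies 0..N/2 (cycles per window)."""
    n = len(samples)
    amps = []
    for k in range(n // 2 + 1):
        s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        scale = 1 if k in (0, n // 2) else 2  # one-sided spectrum scaling
        amps.append(scale * abs(s) / n)
    return amps

# A periodic, nonsinusoidal signal: a fundamental (3 cycles per window,
# amplitude 1.0) plus one harmonic (6 cycles, amplitude 0.5)
n = 64
sig = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 6 * t / n)
       for t in range(n)]
amps = dft_amplitudes(sig)
print(round(amps[3], 2), round(amps[6], 2))  # -> 1.0 0.5
```

The decomposition recovers exactly the two components we put in, which is the quantitative description Fourier analysis provides.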

Background on analyzing animal sounds

Every tool comes with some constraints and costs. This is as true of Fourier analysis as it is of any method. Learning how to properly perform Fourier decomposition of animal sounds requires some understanding of its limits and constraints and a minimal amount of practice using the tools. Luckily, many of the measurements that we can make on the waveform of the sound have some corresponding representation in the Fourier decomposition of that sound. Making sure that the waveform and spectrogram measurements correspond is one way to check that we have done each type of measurement properly. Here, we provide materials explaining useful measurements that can be made on waveforms, how to make a spectrogram and extract measurements from it, and how to compare the results from two views of the same sound. There are two parts to the presentation:

You should open both files so that you can hear and see the examples in the PowerPoint as you read about them in the text pages. You are welcome to download your own copies of both files and refer to them when using one of the analysis software packages listed below, or when using the free RavenViewer tools to view waveforms and spectrograms of any of the sounds in the Macaulay Library.

A practical in sound analysis using Raven

  1. If not already installed on your computer, download the Fourier Test Sounds from this site and unzip the contained folder. It contains the following sample sounds:
    • Puresine: A single pure sine wave with no modulation
    • SinusAM: A single sine wave with sinusoidal amplitude modulation
    • LowFM: A single sine wave carrier weakly frequency-modulated in a sinusoidal manner
    • HiFM: A single sine wave strongly frequency-modulated in a sinusoidal manner
    • Periodsymm: A periodic but nonsinusoidal signal. The shape of the waveform is triangular such that it is half-wave symmetric.
    • Pulsedsine: A single sine wave turned on and off. This can be interpreted as periodic nonsinusoidal amplitude modulation of the sine wave carrier.
    • Onepulse: A single, very brief pulse of sound with sudden onset and offset
    • Tungarafrog: Male advertisement call of a túngara frog (Engystomops pustulosus) from Central America
    • Parrotduet: Antiphonal duet by a mated pair of White-fronted Amazon parrots (Amazona albifrons) in Costa Rica. Try to figure out where both birds overlap.
  2. This practical assumes you have access to the sound analysis program Raven Pro. This program requires a licensing fee, although you may be able to use it in demo mode for a short time. Note that the free program Raven Lite, available at the same site, does not have all the features assumed in this practical. The authors of Raven Pro will often negotiate a short-term class license if asked.
  3. The general goal of this practical is to learn the expected frequency domain patterns for each of three basic time domain waveforms, see how compound signal waveforms decompose into additive combinations of the basic three in the frequency domain, and build up confidence that whatever you see in one domain must have a corresponding presence in the other domain. You should be able to go back and forth between domains with ease after completing these exercises.
  4. For each of the examples in the Test Sounds folder (we suggest doing them in the order above), do the following:
    • Load the file: Use Open Sound Files in the File menu to load the sound into the program. For now, use the default window settings when asked.
    • Get a general look at the waveform: Examine the waveform of the sound at both compressed and expanded time scales. Does it repeat? Is it sinusoidal or periodic non-sinusoidal?
    • Measure periodicities in waveform: On an expanded time scale, measure the time interval between repeats if the waveform is periodic. The reciprocal of this number will predict the frequency of the signal if it is a pure sine wave, or the fundamental frequency of a harmonic series if it is periodic but nonsinusoidal. If the signal is AM or FM, see the steps below. If the repetition rate of the waveform varies during the course of the sound, measure the repetition period and compute its reciprocal at each of several locations. If possible, leave these locations marked in the waveform so that you can find the same points in the spectrogram.
    • Make a spectrogram: You may need to fiddle with the frame size/bandwidth of the analysis. Try to get an image that is relatively smooth and shows both slow frequency modulations and amplitude modulations without breaking them into sidebands. However, do use a small enough bandwidth to see harmonics as separate bands.
    • Compare the waveform and spectrogram: Do the measurements you made in the waveform correspond as predicted in the spectrogram? Vice versa?
    • Make a power spectrum slice: Create a spectrum through one segment of the spectrogram. Compare corresponding measurements in the spectrogram and the power spectrum so that you are convinced that the latter is just a slice through the former. If you have not already done so, try varying the spectrogram/power spectrum analysis bandwidth: do the peaks get fatter or thinner as expected?
    • AM signals: If your signal is an AM signal, try adjusting the bandwidth for the power spectrum until it is small enough to show the sidebands and the carrier. Measure the frequencies of the sideband and the carrier. Then examine the following:
      • The difference in frequency between the carrier and each sideband should be equal to the repeat rate in amplitude modulation that you can see in the waveform. Is this true?
      • Is the amplitude of any sideband greater than that of the carrier? It should not be.
    • FM signals: As with AM signals, adjust the bandwidth of the power spectrum until you can decompose it into a carrier and sidebands:
      • Look at the frequency modulation pattern in the waveform. Measure how many times the frequency reaches a peak per second. This is the modulating frequency. Now compare this number to the difference in frequency between the carrier and the sidebands in the spectrogram, and between the sidebands and each other. It should be the same.
      • Measure the maximum frequency and the minimum frequency in the expanded waveform. If the frequency modulation is sinusoidal, compute the average frequency by adding the maximum and minimum frequencies and dividing by two. Is this equal to the carrier in the spectrogram? It should be.
      • Compute the modulation index for the FM signal by dividing the difference between the maximum and minimum frequencies by the modulation frequency measured in the waveform. If it is less than 10, then you should see a power spectrum for this signal in which the side band amplitudes are less than that of the carrier; if it is more than 20, then the sidebands will be taller than the carrier. Is this the case? Between 10 and 20, some sidebands may be similar to the carrier in amplitude.
      • Now spread the time scale out with both waveform and spectrogram visible so that you can see 5–10 complete modulations. Then increase the spectrogram and power spectrum analysis bandwidth (e.g., reduce the number of samples/frame in the top right slider control) until the signal no longer breaks down into sidebands. You should now see a spectrogram that shows the frequency rising and falling in a sinusoidal way. You are actually viewing the modulation in a time domain view on a spectrogram! Are the maximum, minimum, and modulation frequencies measured on this spectrogram the same as those you measured in the waveform view?
    • Periodic nonsinusoidal signals: Is the fundamental in the spectrogram view equal to the repeat rate in the waveform? Convince yourself that the bands above the fundamental are truly harmonics. To do this, make a power spectrum through a given point and measure the difference in frequency between successive peaks in the power spectrum. This difference should be the same for each pair of adjacent peaks, and it should equal the frequency of the fundamental (at least within the accuracy of the analysis tools). Is the waveform half-wave periodic or not? If so, are the even harmonics missing?
    • Two real animal sounds: Next, do the same kinds of time domain/frequency domain comparisons for the two animal sounds provided: a tungara frog call and a parrot duet. These sounds change in structure during their course, so you will want to compare measurements at the same instant when comparing waveform, spectrogram, and power spectrum measurements. Remember that anything you measure in one domain should be consistent with general Fourier principles in the other domain. You should be able to describe the changes that occur in the signal waveform during the sound and how and why these are reflected in the spectrogram.
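As a cross-check on the AM exercise above, the sideband prediction can be verified numerically. The sketch below synthesizes a sinusoidally amplitude-modulated tone and locates its spectral peaks with a naive discrete Fourier transform (the bin numbers, modulation depth, and 0.05 threshold are arbitrary choices for this example):

```python
import math, cmath

def spectrum_peak_bins(samples, thresh=0.05):
    """Integer-frequency bins (cycles per window) whose one-sided DFT
    amplitude exceeds thresh: a crude stand-in for reading peaks off
    a power spectrum."""
    n = len(samples)
    peaks = []
    for k in range(1, n // 2):
        s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        if 2 * abs(s) / n > thresh:
            peaks.append(k)
    return peaks

# Sinusoidal AM: carrier at 16 cycles per window, modulator at 2 cycles.
# Fourier theory predicts the carrier plus one sideband on each side
# (at 14 and 18), each sideband weaker than the carrier.
n, fc, fm, depth = 128, 16, 2, 0.6
am = [(1 + depth * math.sin(2 * math.pi * fm * t / n))
      * math.sin(2 * math.pi * fc * t / n) for t in range(n)]
print(spectrum_peak_bins(am))  # -> [14, 16, 18]
```

The carrier-to-sideband spacing equals the modulation rate, exactly as the practical asks you to confirm in Raven.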

Software packages for sound analysis

There are a variety of software packages available for animal sound analysis. They differ in whether they are free or must be purchased, which operating systems they support (Linux, Macintosh, and Windows), which measurements and features they provide, and how they look and feel. Below, we list some software options, where to download or order each, and whether or not we have provided a primer on using the package on this site. Note that if we have not provided a primer, the programs themselves often come with detailed instructions and tutorials.

  • Raven Lite: This is a free sound analysis package made available by the Bioacoustics Research Program at the Cornell Lab of Ornithology. It works on Linux, Mac, and Windows platforms, and generates both waveforms and spectrograms in ways that facilitate accurate measurements. It does not currently provide power spectra.
  • Sound Analysis Pro: This program was written by Dr. Ofer Tchernichovski (City College of New York) and colleagues. To create spectrograms, this program uses a slightly different set of algorithms, which makes for improved measurements. It also provides a very sophisticated set of tools that are now widely used by neurobiologists to study animal sound generation and perception. It is free, but only works on Windows platforms.
  • Ishmael: This sound analysis package was written by Dr. David Mellinger of Oregon State University, and was initially aimed at marine animal sound analysis. However, it is completely effective for any animal sound and is free. It is only Windows-based.
  • Raven Pro: Raven Pro is the successor to the popular Canary program that was widely used in the 1990s. It runs on Linux, Windows, and Macintosh platforms, and creates waveforms, spectrograms, and power spectra that can be measured with a variety of useful tools. There is a charge to acquire the software, but this varies depending on the client and context. Special educational discounts are available. Check the URL below for details.
  • PRAAT: This program was written by Paul Boersma and David Weenink at the University of Amsterdam. It was largely designed for analyses of human speech, but many of its tools can be used on any animal sound, especially those of the many species whose signals are periodic nonsinusoidal sounds. The program is free and runs on all common computer platforms.
  • AviSoft: This is a very sophisticated and powerful sound analysis package that is now widely used by researchers in Europe. It runs only in Windows and is fairly expensive. However, new tools and measures are continually being added, and the designers are very responsive to new needs of clients.
  • Signal/RTS: Engineering Design has been in the business of creating sound analysis software for some time and its packages are used by many American researchers. The software is sophisticated and provides waveforms, spectrograms, and power spectra, along with a diverse number of measurement tools. It also has tools for sound synthesis and manipulation that can be useful for playback studies. The program is Windows-based, but it can be run on a Macintosh using a Windows emulation mode. It must be purchased and is the most expensive of the packages that we list here.
  • Wildlife Acoustics: This company produces a variety of portable sound recorders and analysis software. Some examples:
    • Kaleidoscope Analysis Software: Extensive sound analysis software, including support for array recording systems. Available for all platforms. Cost is similar to AviSoft, but a free quick viewer is also provided. Website:
    • Echo Meter Touch: This system consists of a small box that plugs into the power port of an Apple iPad or iPod and contains a microphone and electronics for detecting ultrasonic bat calls. The calls are stored on the iPad and can be visualized in real time as waveforms or spectrograms. Data can then be downloaded to other devices. Website:
    • Dr. Steven Hopp (Emory and Henry College) maintains a web page that lists a wide variety of animal sound analysis software options. Check it out at:

2.5 Reflection and Refraction


The interaction of sound at a boundary between two media can be a fairly complicated process that depends on the angle of incidence of the propagating wave, the acoustic impedances of the two media, the speeds of sound in the two media, and any textural patterns on the boundary surface. In general, some of the incident sound wave (A) will be reflected back (B) into the initial medium, and some will propagate across the boundary (C), where its direction of travel is likely to be refracted (bent) rather than continuing on its initial trajectory (A´) in the second medium. CHigher and CLower refer to two different possible angles of refraction, as discussed below. There are four different cases to consider: two are well known and widely cited in most textbooks; the other two are less well known, but could easily be encountered while studying animal sound communication.

Basic definitions

Reflection coefficients

Consider a sound traveling in Medium 1 and encountering a boundary with Medium 2. The angle between the direction of propagation and the surface of the boundary is the grazing angle ϕ. The value of the reflection coefficient, R, at this boundary depends on the ratio of the acoustic impedances of the two media (Z2/Z1), the grazing angle ϕ, and the ratio of the velocities of sound in the two media (c1/c2):

R = [m sin ϕ – √(n² – cos² ϕ)] / [m sin ϕ + √(n² – cos² ϕ)], where n = c1/c2 and m = (Z2/Z1)(c1/c2) (the ratio of the densities of the two media)

Reflection coefficients vary from +1 to –1. When they are +1, all of the incident energy is reflected from the surface and the reflected wave undergoes no phase shift. Except for a change in propagation direction, it is as if the boundary were not even there. When R = –1, all of the energy is reflected, but the reflected wave is phase-shifted 180° (or one half wavelength): that is, it begins a half-cycle behind that which one would have expected had there been no reflection. If the incident wave were at a maximum when it hit a boundary with R = –1, then it would begin as a minimum in the reflected wave. The smaller the absolute value of R, the less energy is reflected and the more energy passes into the second medium.

Angles of incidence

The grazing angle of the sound, ϕ, can greatly affect the reflection coefficient at a boundary. It (ϕ) varies between 0, when the sound is parallel to the boundary, and 90°, when the sound is traveling in a direction perpendicular to the boundary. Between these two extremes, there will be an important threshold value: for angles of incidence below the threshold, R will vary one way with ϕ, and above the threshold, it will vary in another way. When the medium with the incident sound has the lower velocity, i.e., c1 < c2, the threshold value is called the critical angle. It is denoted by ϕc, and is computed as cos ϕc = c1/c2. When the incident medium has the higher velocity, i.e., c1 > c2, then the threshold value is called the angle of intromission, and is denoted by ϕi. It is computed as:

sin ϕi = √[(n² – 1) / (m² – 1)], where n = c1/c2 and m = (Z2/Z1)(c1/c2)

The four cases

We can divide the possible relationships between R and ϕ into four cases depending upon whether Z1 > Z2 or Z1 < Z2, and whether c1 < c2 or c1 > c2. Many of the situations one encounters fit either Case I or II; however, the other two cases are not uncommon and the reader should be aware of them. In the following plots, orange zones are ones with a full 180° phase shift at reflection, white zones have no phase shift, and blue zones show a continuous change in phase shift with increasing grazing angle.

Case I: Z1 > Z2 and c1 > c2

If Z1 > Z2 and c1 > c2, then the value of R is always negative. This would be the case if the incident sound waves were in water and the sound hit the water’s surface. The result is a 180° phase shift (orange region) regardless of incident angle. The critical angle is irrelevant in this case. As the angle of incidence increases from 0° to 90°, the value of R increases from –1 toward its value for perpendicular incidence, R90.

Case II: Z1 < Z2 and c1 < c2

The opposite extreme occurs when Z1 < Z2 and c1 < c2. An example would be sound in air hitting the surface of a body of water. The relevant threshold in this case is the critical angle ϕc: if ϕ > ϕc, then R (solid black line) is always positive (no phase shift, white region) and decreases the closer ϕ is to 90°. For low enough incident angles, e.g., when ϕ < ϕc, all energy is reflected (|R| = 1), but the phase shift (indicated with a dashed line) decreases from a full 180° at a 0° grazing angle down to no phase shift at a grazing angle of ϕc (blue region).

Case III: Z1 < Z2 and c1 > c2

Here, the incident medium has a lower impedance but a higher velocity: Z1 < Z2 and c1 > c2. Examples include sound traveling in water and striking a muddy bottom, or sound traveling in air and hitting certain types of soils. The important threshold angle of incidence here is the angle of intromission, ϕi. For ϕ < ϕi, reflected waves always experience a 180° phase shift, and the value of R will increase from –1 to 0 as ϕ increases. When ϕ = ϕi, no energy is reflected: it all passes into the second medium! For ϕ > ϕi, there is no phase shift and the fraction of energy reflected increases with the incident angle.

Case IV: Z1 > Z2 and c1 < c2

This is the opposite of Case III, since Z1 > Z2 and c1 < c2. It can occur when sounds propagated in a muddy or soil substrate reach the interface with the overlying medium. Both thresholds must be invoked in this example. At incident angles less than ϕi, all energy is reflected, but the phase shift varies from 180° when ϕ = 0° to none at ϕc. Further increases in incident angle decrease R without any phase shift until ϕ = ϕi when no energy is reflected. Higher incident angles result in a 180° phase shift and variation in R from 0 to R90.
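The four cases can be explored numerically with the standard Rayleigh reflection coefficient for a plane wave at a fluid-fluid boundary (the relationship underlying the discussion above; see, e.g., Caruthers 1977). The impedance and sound-speed values below are rough textbook figures for water and air, assumed here purely for illustration:

```python
import math, cmath

def reflection_coefficient(z1, z2, c1, c2, grazing_deg):
    """Rayleigh reflection coefficient at a fluid-fluid boundary,
    written with grazing angles.  Returns a complex number: its
    magnitude is |R| and its phase is the phase shift on reflection.
    A lossless-media sketch, not a full substrate model."""
    phi1 = math.radians(grazing_deg)
    # Snell's law in grazing-angle form: cos(phi2) = (c2/c1) cos(phi1)
    cos2 = (c2 / c1) * math.cos(phi1)
    sin2 = cmath.sqrt(1 - cos2 * cos2)  # imaginary beyond the critical angle
    return ((z2 * math.sin(phi1) - z1 * sin2)
            / (z2 * math.sin(phi1) + z1 * sin2))

# Case I (water into air): R is close to -1 at any grazing angle,
# i.e. near-total reflection with a 180 degree phase shift
r1 = reflection_coefficient(1.5e6, 415.0, 1500.0, 343.0, 45.0)

# Case II (air into water) below the critical angle: total reflection,
# |R| = 1, with a continuously varying phase shift
r2 = reflection_coefficient(415.0, 1.5e6, 343.0, 1500.0, 30.0)
print(round(r1.real, 3), round(abs(r2), 3))
```

Sweeping the grazing angle from 0° to 90° for each of the four impedance/velocity combinations reproduces the qualitative behavior described above.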


Caruthers, J. W. 1977. Fundamentals of Marine Acoustics. Amsterdam: Elsevier Publishing.

2.6 Sample Animal Sounds


It is often difficult to imagine what a particular animal sounds like from a text treatment. In these pages, we provide examples of most of the types of vibration production known in animals. Note that the sounds that were recorded—which you will hear—are rarely the original vibrations. Instead, the initial vibrations have been modified in various ways before being emitted, often in a frequency-dependent fashion, into the surrounding medium. Note also that some of the recordings below were recorded in air, some under water, and some by placing a special sensor against a plant stem or leaf. Most of the recordings listed below are currently archived at the Macaulay Library of the Cornell University Laboratory of Ornithology (Ithaca, New York). This is currently the world’s largest archive of animal sounds and access is free. Other sources are listed as needed.

Sources and tools

Before accessing some of the sounds in the table below, you will need to install a player plugin on your computer. There are two options:

  • Direct play: Most browsers on regular computers come equipped with plugins to play movies and sounds. Websites written in HTML 5 or later can play rich media without additional plugins. Older sites may require you to install Flash or Shockwave. Try some first, and if they play, then don’t worry about plugins. If they don’t play, you can get either plugin player free: Flash is available at, and Shockwave at
  • To see spectrograms of the sounds, you need to use a website that provides them, or you need to download the sound file and then use one of the sound analysis packages listed in Online 2.5 to create the spectrograms.

The samples:

  1. Solid body part moved against other solid or reshaped
    1. Percussion:
  2. A body part moved against a fluid (air or water)
    1. Pulsation:
    2. Fanning:
    3. Fluid compression:
    4. Streaming:
  3. A fluid moved over a body part
    1. Vocalization:
    2. Aerodynamic sounds:

2.7 Animations of Vocalizing Birds

Dr. Roderick Suthers and his colleagues at Indiana University have played key roles in elucidating how birds and mammals produce vibrations and modify them before emission. His group has produced several short video clips that demonstrate important components of vocalization in several songbirds. These videos show many of the basic points discussed in the text. We provide a few introductory notes below for three of these clips. Select the format according to your computer operating system. You can stop the movie at any frame, and use the cursor or the arrows to move it back and forth a frame at a time to see the details of any stage in the process.

Production of vibrations in syrinx

The first two clips show the production of sound vibrations in the syrinx of two common North American songbirds. The website provides some information on each clip, and we provide additional commentary below:

  • Northern Cardinal (Cardinalis cardinalis) singing: This clip begins with a photo of a male cardinal and zooms in to show the location of the trachea, syrinx, and bronchi in the animal. It then moves in further to show a longitudinal section through the syrinx. Note the positions of the lateral and medial labia on each side where the bronchi join the trachea. Note also the position of the cartilage (yellow) just behind each lateral labium. The bird breathes in and out several times (blue) without any vocalizing. Then just before an exhalation, the cartilages on each side are rotated, forcing the lateral labium into the cavity. The cardinal is beginning to sing one of the song syllables shown in the spectrogram in the upper right. The first half of this frequency “down-swept” syllable is produced by the right side of the syrinx; the left side is closed off to any airflow. About halfway through the syllable, the right labium closes off its channel completely, and the left lateral labium opens just enough to vibrate and produce the second half of the syllable. It takes enormous coordination for the bird to produce the beginning of this second section of the syllable at just the right moment and at the exact frequency at which the right side ended its contribution. Inexperienced birds do not always make a perfect union and you can sometimes see the gap between the two parts in spectrograms of their songs. The video continues with several more of these down-swept syllables, the bird inhaling before each, and ends with several final normal breaths without vocalizing. Cardinals nearly always produce high frequencies on the right side of the syrinx, and low frequencies on the left.
  • Brown-headed cowbird (Molothrus ater) singing: This video assumes you saw in the prior video where the syrinx is located. It thus begins with a glimpse of the whole bird and plays a short song at normal speed. The video then moves directly to the longitudinal section of the syrinx as the bird breathes in and out once and inhales a second time. During the subsequent movie, the sounds heard are at normal frequencies but the time scale has been expanded to show details. On the next inhalation, the bird closes off the right hand side of the syrinx and uses the left lateral labium to produce the first syllable in the song. This is a low relatively constant frequency. The bird inhales again and then produces four notes in rapid succession: the first and third are lower frequencies and are produced on the left side. The second and fourth are higher frequencies and produced on the right side. The bird finishes exhaling, and inhales again. It then produces 5 successive notes, again alternating so that low frequencies are produced by the left side and high frequencies by the right.

Modification of the sounds after production by syringeal vibrations


This clip shows a male northern cardinal singing a song with upswept syllables. Spectrograms of the radiated song (outside the bird) show a single frequency modulated component and no higher harmonics. However, we know that the signals at the syrinx are periodic but not sinusoidal and thus should contain significant energy in the higher harmonics. As we can see in this x-ray movie, a singing bird amplifies the fundamental and filters out the higher harmonics by inserting a resonant cavity between the vibrating syrinx and its mouth. This cavity is created by muscular expansion of a pharyngeal cavity and the upper part of the esophagus. Because each syllable’s fundamental frequency changes rapidly, the bird must keep changing the shape and volume of these cavities so that its resonance tracks the changing frequency generated by the syrinx. Clearly, this requires a lot of coordination!

More background on the Suthers lab:

More information on the methods, publications, and approaches of the Suthers lab group can be found at:

2.8 Linear Versus Nonlinear Systems

A collection of interacting forces can operate as a linear system or as a nonlinear system (Strogatz 1994; Kaplan & Glass 1995). When it acts as a linear system, its responses are proportional to the amount of change in any constituent force, and if several forces are changed at the same time, the overall response is simply the sum of the responses that we might have seen had we changed each force separately. In general, a mathematical description of the dependence of a linear system’s response to changes in forces will contain no higher order terms like squares or cubes of forces. If our collection of forces acts as a nonlinear system, any or all of these conditions can be violated. Instead of producing a response that is in proportion to a change in one of the forces (the linear expectation), we might suddenly see a response that is totally unrelated to the magnitude of that change. Alternatively, the total response when we change several forces at once might not be the simple sum expected when each force is changed by itself but some complex interaction between the forces. Finally, the mathematical descriptions of nonlinear systems will contain terms that include higher order exponents. Both linear and nonlinear systems can produce sustained oscillations (periodic orbits and limit cycles, respectively): they differ in how they respond to changes in the relative forces and their temporal alignment.
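The superposition criterion described above can be checked numerically. The sketch below (Python, using arbitrary hypothetical response functions, not a model of any particular sound source) tests whether the response to two forces applied together equals the sum of the responses to each force applied alone:

```python
def linear_system(f1, f2):
    # A linear "system": the response is a weighted sum of the input forces.
    return 2.0 * f1 + 0.5 * f2

def nonlinear_system(f1, f2):
    # A nonlinear "system": a squared term and an interaction term appear.
    return 2.0 * f1 + 0.5 * f2 + 0.3 * f1**2 + 0.8 * f1 * f2

def superposition_holds(system, f1, f2, tol=1e-9):
    """Check the superposition criterion: the response to both forces
    together equals the sum of the responses to each force alone."""
    combined = system(f1, f2)
    separate = system(f1, 0.0) + system(0.0, f2)
    return abs(combined - separate) < tol

print(superposition_holds(linear_system, 1.5, 2.5))     # True
print(superposition_holds(nonlinear_system, 1.5, 2.5))  # False
```

The squared and interaction terms in the second function are exactly the "higher order terms" that the text notes are absent from the mathematical description of a linear system.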

In all animal vibration sources, a number of forces are brought into play at the same time. A stridulating insect uses muscle forces to drag the comb over a sharp edge, and the tensile forces in the teeth of the comb resist being bent until the muscle forces bend the tooth enough for it to escape the edge and snap back to normal shape. Similarly opposed forces are used by terrestrial vertebrates to create vibrations in their respiratory valves. As long as these forces exert their effects out-of-phase, the system will vibrate and produce a sound. As we have seen earlier, most vibrations in animal sound sources are likely to be periodic but nonsinusoidal.

Most animal sound sources have evolved properties that cause them to act as quasi-linear systems over intermediate ranges of force magnitudes. The stable periodic oscillations that result are called modal sound production. However, at extreme values, the same systems reveal their underlying nonlinearities. As an analogy, consider a child’s swing. If the parent pushes the swing gently, it will oscillate back and forth at a steady rate. If they push a bit harder, the amplitude of the swing’s motion will increase proportionately and the system will be responding linearly. However, if an unwary parent pushes too hard, the response of the swing is to rotate higher than its attachment point and then suddenly drop straight down (perhaps dumping the child from his or her seat). With an even stronger push, the swing (and child) might rotate right up and around the attachment point to complete a full circle. In either case, the sudden transition from one pattern of response to another as some force is varied is called a bifurcation.

Bifurcations can easily occur in the dynamic behavior of animal sound sources (Strogatz 1994; Herzel et al. 1995). A common bifurcation arises when the air pressure in a terrestrial vertebrate’s respiratory tract is altered while the valve is oscillating. At normal pressures and airflows, the two sides of the valve feed back on each other’s movement until they both move at the same frequency. This is called entrainment. However, if the air pressure or muscle tensions on the valves are changed sufficiently from normal levels, the two sides of the valve may begin to oscillate at different frequencies that may not be harmonically related. This is known as biphonation. Alternatively, the changes in forces may retain the entrainment but cause both sides of the valve to begin moving in a more complex trajectory. If each cycle takes twice as long, the fundamental frequency in the resulting signal will be half of what it was before the bifurcation and one will see twice as many harmonic bands in the spectrogram. This type of effect is called the generation of subharmonics. Finally, even more extreme changes in air pressure or muscle tension can cause the valve movements to become completely non-repetitive (aperiodic). The resulting spectrogram will show a broad smear of energy over a wide range of frequencies. This is known as deterministic chaos.
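The progression from stable periodic behavior through subharmonics to deterministic chaos can be illustrated with the simplest textbook nonlinear system, the logistic map (a standard example in Strogatz 1994). This is a generic illustration of a period-doubling route to chaos, not a model of a vibrating syringeal or laryngeal valve:

```python
def logistic_orbit(r, n_transient=500, n_keep=64, x0=0.5):
    """Iterate the logistic map x -> r*x*(1-x), discard the transient,
    and return the distinct values the orbit settles onto (its period)."""
    x = x0
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    orbit = set()
    for _ in range(n_keep):
        x = r * x * (1.0 - x)
        orbit.add(round(x, 6))
    return sorted(orbit)

print(len(logistic_orbit(2.8)))  # 1: a single repeating value (periodic)
print(len(logistic_orbit(3.2)))  # 2: period has doubled ("subharmonic")
print(len(logistic_orbit(3.5)))  # 4: doubled again
print(len(logistic_orbit(3.9)))  # many: aperiodic (deterministic chaos)
```

As the control parameter r is increased, the orbit passes through sudden bifurcations: each period doubling halves the fundamental repetition frequency, and beyond the cascade the orbit becomes aperiodic, just as spectrograms of non-modal vocalizations show subharmonic bands and then broadband noise.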

Figure 1: A spectrogram of three consecutive “peow” calls of a male White-fronted Amazon Parrot (Amazona albifrons). While the first call is largely linear with clear harmonics, the last quarter of the second call shows a nonlinear bifurcation with the appearance of subharmonics. The final two thirds of the last call consists of deterministic chaos. This sequence is regularly seen in peow call sequences in this species. Nonlinear components like this appear to be a common feature of wild parrot vocalizations (© Jack Bradbury).

Animals can either adjust their sound producing forces to generate periodic (but usually nonsinusoidal) sound waves, or they can push the sound source out of the modal range and trigger one or more bifurcations. Most songbirds appear to go out of their way to keep sound production in the modal range. In contrast, parrots routinely include bifurcations in their calls (Fletcher 2000). Non-modal vocalizations have been described in a variety of mammals (Riede et al. 1997; Riede et al. 2000; Riede et al. 2001; Fitch et al. 2002; Riede et al. 2004; Riede et al. 2005). Human speech is usually modal, but humans can, and often do, push their sound producing systems out of modal patterns and into nonlinear states (Berry et al. 1994; Herzel et al. 1994; Herzel et al. 1995; Herzel and Knudsen 1995; Steinecke and Herzel 1995; Berry et al. 1996; Fletcher 1996; Mergell and Herzel 1997; Mergell et al. 1998; Mergell et al. 1999; Mergell et al. 2000; Berry et al. 2001; Gerratt and Kreiman 2001; Hanson et al. 2001; Berry et al. 2006). Canids and primates often push their laryngeal vibrations into biphonation states, creating more complex sounds with two apparently independent fundamentals (Volodina et al. 2006). Certainly more examples of these shifts will be found in other taxa as researchers become aware that they should look for them.

References Cited

Berry, D. A., H. Herzel, I. R. Titze, and K. Krischer. 1994. Interpretation of biomechanical simulations of normal and chaotic vocal fold oscillations with empirical eigenfunctions. Journal of the Acoustical Society of America 95: 3595–3604.

Berry, D. A., H. Herzel, I. R. Titze, and B. H. Story. 1996. Bifurcations in excised larynx experiments. Journal of Voice 10: 129–138.

Berry, D. A., D. W. Montequin, and N. Tayama. 2001. High-speed digital imaging of the medial surface of the vocal folds. Journal of the Acoustical Society of America 110: 2539–2547.

Berry, D. A., Z. Y. Zhang, and J. Neubauer. 2006. Mechanisms of irregular vibration in a physical model of the vocal folds. Journal of the Acoustical Society of America 120: EL36–EL42.

Fitch, W. T., J. Neubauer, and H. Herzel. 2002. Calls out of chaos: the adaptive significance of nonlinear phenomena in mammalian vocal production. Animal Behaviour 63: 407–418.

Fletcher, N. H. 1996. Nonlinearity, complexity, and control in vocal systems. In Vocal Fold Physiology: Controlling Complexity and Chaos (Davis, P. J. and N. H. Fletcher, eds.), pp. 3–16. San Diego, CA: Singular Publishing Group.

Fletcher, N. H. 2000. A class of chaotic bird calls? Journal of the Acoustical Society of America 108: 821–826.

Gerratt, B. R. and J. Kreiman. 2001. Toward a taxonomy of nonmodal phonation. Journal of Phonetics 29: 365–381.

Hanson, H. M., K. N. Stevens, H. K. J. Kuo, M. Y. Chen, and J. Slifka. 2001. Towards models of phonation. Journal of Phonetics 29: 451–480.

Herzel, H., D. Berry, I. Titze, and I. Steinecke. 1995. Nonlinear Dynamics of the Voice—Signal Analysis and Biomechanical Modeling. Chaos 5: 30–34.

Herzel, H., D. Berry, I. R. Titze, and M. Saleh. 1994. Analysis of vocal disorders with methods from nonlinear dynamics. Journal of Speech and Hearing Research 37: 1008–1019.

Herzel, H. and C. Knudsen. 1995. Bifurcations in a vocal fold model. Nonlinear Dynamics 7: 53–64.

Kaplan, D. and L. Glass. 1995. Understanding Nonlinear Dynamics. New York: Springer-Verlag.

Mergell, P., W. T. Fitch, and H. Herzel. 1999. Modeling the role of nonhuman vocal membranes in phonation. Journal of the Acoustical Society of America 105: 2020–2028.

Mergell, P. and H. Herzel. 1997. Modelling biphonation—The role of the vocal tract. Speech Communication 22: 141–154.

Mergell, P., H. Herzel, and I. R. Titze. 2000. Irregular vocal-fold vibration—High-speed observation and modeling. Journal of the Acoustical Society of America 108: 2996–3002.

Mergell, P., H. Herzel, T. Wittenberg, M. Tigges, and U. Eysholdt. 1998. Phonation onset: Vocal fold modeling and high-speed glottography. Journal of the Acoustical Society of America 104: 464–470.

Riede, T., H. Herzel, K. Hammerschmidt, L. Brunnberg, and G. Tembrock. 2001. The harmonic-to-noise ratio applied to dog barks. Journal of the Acoustical Society of America 110: 2191–2197.

Riede, T., H. Herzel, D. Mehwald, W. Seidner, E. Trumler, G. Bohme, and G. Tembrock. 2000. Nonlinear phenomena in the natural howling of a dog-wolf mix. Journal of the Acoustical Society of America 108: 1435–1442.

Riede, T., B. R. Mitchell, I. Tokuda, and M. J. Owren. 2005. Characterizing noise in nonhuman vocalizations: Acoustic analysis and human perception of barks by coyotes and dogs. Journal of the Acoustical Society of America 118: 514–522.

Riede, T., M. J. Owren, and A. C. Arcadi. 2004. Nonlinear acoustics in pant hoots of common chimpanzees (Pan troglodytes): Frequency jumps, subharmonics, biphonation, and deterministic chaos. American Journal of Primatology 64: 277–291.

Riede, T., I. Wilden, and G. Tembrock. 1997. Subharmonics, biphonations, and frequency jumps—common components of mammalian vocalization or indicators for disorders? Zeitschrift für Säugetierkunde–International Journal of Mammalian Biology 62: 198–203.

Steinecke, I. and H. Herzel. 1995. Bifurcations in an asymmetric vocal-fold model. Journal of the Acoustical Society of America 97: 1874–1884.

Strogatz, S. H. 1994. Nonlinear Dynamics and Chaos. Cambridge, MA: Westview Press.

Volodina, E. V., I. A. Volodin, I. V. Isaeva, and C. Unck. 2006. Biphonation may function to enhance individual recognition in the dhole, Cuon alpinus. Ethology 112: 815–825.

2.9 Radiation Efficiency and Sound Radiator Size


The size of an animal’s sound-emitting organs can seriously limit the wavelengths it can radiate efficiently: small animals cannot radiate high-amplitude sounds with wavelengths much larger than their own bodies. Here we provide a more detailed explanation for this general finding.

Frequency scaling and ka

The sound pressure generated by a sound source typically increases as either the frequency being produced or the size of the sound source is increased. Since the two parameters have similar effects, acousticians use a scaled version of their product to characterize the dependence of sound pressure on their values. This scaling involves converting the frequency, f, in Hz (cycles/sec) into a spatial measure, the wave number k, which is the number of cycles at the given frequency found in 2π meters of the relevant medium. This can be computed by dividing 2π by the wavelength in meters, λ, of the frequency of interest. The product we need is that between k and some appropriate measure of the size of the sound source. For a spherical monopole, this would be the radius of the sphere, a, in meters. For a dipole such as a flat disk vibrating back and forth along a line perpendicular to its surface, it would be the radius, a, of the disk.

In general, ka is a useful measure of the ratio between the size of a sound source and the wavelength of the sound that it is generating. For example, ka for a spherical monopole is equal to 2πa/λ or πd/λ where d is the diameter of the sphere. When ka = 1, the wavelength is then about 3 times larger than the diameter of the sphere; when ka = 3, then the sphere’s diameter and the wavelength are approximately equal.
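These definitions are easy to put into numbers. The sketch below (Python; the 1 cm source radius and the sound speed in air are hypothetical illustration values) computes k and ka and confirms that ka = 1 corresponds to a wavelength of about three source diameters:

```python
import math

def wave_number(frequency_hz, c=343.0):
    """k = 2*pi/lambda, where lambda = c/f (c defaults to the
    approximate speed of sound in air, in m/s)."""
    wavelength = c / frequency_hz
    return 2.0 * math.pi / wavelength

def ka(frequency_hz, radius_m, c=343.0):
    """Scaled product of wave number and source radius."""
    return wave_number(frequency_hz, c) * radius_m

# A hypothetical spherical source of radius 1 cm in air:
a = 0.01
f_unity = 343.0 / (2.0 * math.pi * a)  # frequency at which ka = 1
print(round(f_unity))                   # ~5459 Hz
print(round(ka(f_unity, a), 3))         # 1.0
# At ka = 1 the wavelength is pi*d, i.e. about 3 diameters:
print(round((343.0 / f_unity) / (2 * a), 2))  # ~3.14
```

Note how quickly the constraint bites: even a source a full centimeter in radius must vibrate at roughly 5.5 kHz before ka reaches 1.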

Sound pressure and ka

As we shall see below, there are two concurrent processes that can limit the amplitude of a sound wave as a function of ka. One is simple and applies to all basic sound types: as we increase either frequency or the size of a sound source, all other factors being equal, we will increase the potential pressure of the sound radiating from the source. For ka > 1, this potential pressure is what is realized. However, when ka < 1, a second process comes into play, which reduces the efficiency of radiation and thus the resulting sound pressure. We outline this secondary process for different radiating geometries in subsequent sections.

It should be noted at this point that all other factors may not be equal. For example, the amplitude of the sound pressure produced by a sound source will also depend on the magnitude of the sound source’s movement. This magnitude can be measured as the maximum change per half cycle in the radius of a pulsing spherical monopole or the maximum distance moved by a central point on a dipole per half cycle. In addition to increasing the resulting sound pressure, increasing the distance moved will also increase the costs of sound production. If the sound producer is energy-limited, it might reduce the amplitude of movement of its sound-producing structure whenever it increases the frequency, to keep costs constant. The predicted increase in sound pressure when an animal increases its frequency might thus be offset by a concurrent decrease in oscillator amplitude. All things would then not be equal when compared to the initial condition, and the sound source might not show the increase in amplitude predicted as ka increases. In the discussions of how ka affects radiation efficiency below, we shall assume that all other parameters remain constant as ka is varied.

Efficiency of monopoles with varying ka

Consider a completely spherical monopole that is expanding and contracting to produce sinusoidal waves at a single frequency. As the sphere expands, it pushes against the layer of medium immediately surrounding it. This layer of medium responds to the sound source force in two ways: it will be compressed, raising the local pressure inside the layer, and at the same time, the entire compressed layer may begin to move outwards (a mass flow). This expanding first layer will collide with the next closest layer of medium and the same process will be repeated: the higher pressure of the first layer will force molecules in the second layer to move, resulting in both compression and mass flow away from the sound source. As the sphere completes the expansion part of the cycle and begins to contract, a similar process occurs but in reverse with the mass flow now moving towards the sphere’s retreating surface.

As successive layers are compressed (or rarefied), three properties of the medium resist changes in molecular velocities and thus contribute to the local acoustic impedance. All media are viscous to some degree and this exerts a frictional drag on molecule motion. This resistive acoustic impedance hinders mass flow and results in compression and the local build-up of pressure. Because molecules must move faster at higher frequencies, resistive acoustic impedance is higher at higher frequencies. A second relevant property is the stiffness of the medium: as force is exerted on a given layer of molecules, the stiffness exerts a counter-force that resists both the compression and the mass flow of the medium. The longer that a given force is exerted, the greater the counter-force. As a result, the hindering effects of stiffness are greater for low sound frequencies than for high ones. Finally, all molecules have inertia that resists changes in their velocities. This inertia will hinder frequent changes in molecular direction more than infrequent ones; inertial effects thus become greater as the sound frequency is increased. Close to a sound source, stiffness and inertial effects rise and fall in phase with the mass flow but out-of-phase with the resistance effects. In addition, only the frictional effects contribute to molecular concentration and rarefaction, and thus sound pressure depends only on the resistive impedance. The effects of the medium stiffness and molecular inertia are thus lumped into a single component called the reactive acoustic impedance. (Note: The terms resistive and reactive impedance initially came from analyses of electrical circuits. It turns out that the resistance to flow in a liquid has similar behavior to the resistance to electrical current in a wire. Similarly, stiffness in a fluid behaves similarly to a capacitance in an electrical circuit, and inertia produces fluid behaviors analogous to inductances. Capacitance and inductance together produce the overall reactive impedance in an electrical circuit and the overall term is used by analogy in this more mechanical situation. See Fletcher 1992 for a good discussion of the utility of electrical terms in analyzing the behavior of mechanical systems.)

The values of the resistive and reactive impedances in an acoustic system depend on the relative sizes of the sound source and the wavelength of sound that it is generating, and thus on ka. To see why, we note that molecules contributing to the mass flow are increasingly diluted with non-contributors at greater distances from the source. As a result, the mass flow velocity falls off with the square of the distance from the source. In practice, most of the mass flow is limited to a blanket of medium around the sound source that is about one wavelength in thickness. The volume of medium participating in the mass flow thus depends on the thickness of this blanket and the surface area of the sphere (a larger sphere with its larger surface area moves more total medium per cycle).

When the diameter of the monopole is small relative to the wavelength of the sound being generated (ka << 1), the volume of air moved per unit of surface area of the sphere is large. Most of the force exerted by the sphere’s small surface will be needed simply to overcome the inertia and stiffness effects and get the large mass of medium moving. This leaves little additional force to condense or rarefy adjacent layers of medium. As a result, sound pressures generated when ka << 1 will be small.

There are two ways to increase ka. Let us first hold the diameter of the sphere constant and gradually increase the sound frequency being produced. This will decrease the wavelength of the sound and thus the thickness of the blanket of medium that must be moved. The volume of medium that must be moved per unit surface area on the sphere will accordingly decrease. Initially, stiffness effects will be high (due to the low frequencies) and increasing the frequency will increase inertial effects. Overall reactive impedance will thus initially rise with increasing frequency. As the frequency is increased further, stiffness effects decrease faster than inertial effects continue to increase. The result is that the reactive impedance first increases and then decreases as we increase the frequency of the sound being generated. At the same time, the decreasing thickness of the medium blanket will reduce the fraction of the force exerted by the sphere that is needed to generate the mass flow and allow for greater compression and rarefaction. This will generate a higher sound pressure variation at the source. Eventually, at high enough frequencies (ka > 1), it will take so little force to move the thin blanket of medium that all of the sphere’s force will go into compression and rarefaction and the efficiency of radiation will approach 100%. Further increases in frequency will not affect the efficiency of the radiator significantly.

What if we hold frequency constant and vary the diameter of the sphere? When the wavelength is much larger than the sphere’s diameter, any increase in that diameter will reduce the volume of medium that must be set into motion per unit area of sphere surface. This will allow a larger fraction of the applied force to be used for compression and rarefaction and thus create a larger pressure amplitude of the sound. Again, as ka is increased by increasing the sphere size, efficiency of radiation gradually rises and asymptotes towards 100% as ka becomes larger than 1.

We can look at this process in a different way by noting that when ka << 1, the resistive impedance of the blanket of medium surrounding the sphere (which is the only part of the impedance that can increase sound pressures), is very small when compared to the characteristic acoustic impedance (which is entirely resistive) of the medium far from the source. Because of the difference in resistive acoustic impedances between this blanket and layers of medium further from the source, we would not expect much sound energy to be transferred into the more distant medium. As ka is increased, the resistive impedance of the blanket increases until it is essentially identical to the characteristic acoustic impedance of the medium. At this point, efficiency of radiation from the blanket to more outlying layers is nearly 100%.

It is useful to consider some quantitative limits on these efficiency effects. When the ratio between the diameter of the monopole and the sound wavelength is less than 1/3 (i.e., ka < 1), radiation efficiency will increase monotonically with increases in the ratio. When the ratio equals 1/3, the resistive and reactive impedances around the monopole are roughly equal and efficiency is 50% of what it could be. To achieve an efficiency of 90% or more, the wavelength must be the same size as or smaller than the diameter of the sound source.

Figure 1: Change in impedances (relative to the characteristic impedance of the medium) and efficiency of a spherical monopole sound source as a function of ka. Red line shows the resistive impedance of the medium surrounding the monopole, which is essentially equal to the efficiency of the sound pressure output. Blue line shows the reactive impedance, which governs mass flow around the sound source. Note that both axes are logarithmic scales. (Computed from equations in Fletcher 1992.)
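The behavior in Figure 1 can be approximated from the standard pulsating-sphere result (a sketch consistent with the framework in Fletcher 1992): relative to the characteristic impedance of the medium, the resistive component is (ka)²/(1 + (ka)²) and the reactive component is ka/(1 + (ka)²). A minimal Python sketch:

```python
def monopole_impedance(ka):
    """Normalized radiation impedance at the surface of a pulsating
    sphere (relative to the characteristic impedance rho*c):
    resistive part (ka)^2/(1+(ka)^2), reactive part ka/(1+(ka)^2)."""
    denom = 1.0 + ka * ka
    resistive = ka * ka / denom
    reactive = ka / denom
    return resistive, reactive

# At ka = 1 the two components are equal and the resistive part
# (~ radiation efficiency) is 50% of its asymptotic value:
r, x = monopole_impedance(1.0)
print(r, x)  # 0.5 0.5

# Well below ka = 1, resistance falls off as (ka)^2 and reaction dominates:
r, x = monopole_impedance(0.1)
print(round(r, 4), round(x, 4))  # 0.0099 0.099

# Well above ka = 1, the source radiates at nearly 100% efficiency:
r, _ = monopole_impedance(10.0)
print(round(r, 3))  # 0.99
```

This reproduces the quantitative limits stated above: the crossing point of the red and blue curves at ka = 1, and the asymptote towards 100% efficiency for ka >> 1.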

Efficiency of dipoles and quadrupoles with varying ka

Dipoles (which oscillate back and forth along a single dimension) and higher order sound sources (which oscillate with trajectories using two or more dimensions) experience losses in efficiency in part due to the reasons outlined above for monopoles. However, an additional factor reduces efficiency in these sound sources even further. This is called acoustic short-circuiting.

Because of the way a dipole operates, it generates a condensation on one side of the sound source at the same time that it creates a rarefaction on the other. If the dipole is oscillating slowly enough (e.g., at a low enough frequency), a condensation generated at one end might have sufficient time to propagate to the opposite end of the dipole and interfere negatively. One way to avoid this short-circuiting is to use sufficiently high frequencies that a condensation cannot get to the rarefaction before it is complete and already radiating into the medium. Another way is to insert the dipole in a baffle so that condensations have to travel all the way to the edge of the baffle and back before they can reach a rarefaction. The further a condensation has to travel to reach a rarefaction, the weaker its amplitude; it will thus have only a minor effect on the rarefaction and short-circuiting will be minimal.

Suppose that the shortest distance through the medium between the radiation site for a condensation and that for a rarefaction is D. It will take D/c seconds for a condensation to reach a rarefaction site (where c = the speed of sound in the medium). A sinusoidal frequency f takes T = 1/f seconds to complete one cycle, and therefore T/2 seconds to complete producing a condensation on one side and a rarefaction on the other. To minimize short-circuiting, it must be the case that the time required to create and radiate a condensation or rarefaction (T/2) is less than the time required to travel between the sites (D/c). If T/2 < D/c, it follows that Tc < 2D. Since by definition the wavelength of the sound λ = Tc, short-circuiting can be minimized when λ < 2D, or rewriting, when 2D/λ > 1. If the dipole is an insect wing that is vibrating up and down, D is half the diameter (to traverse the distance from the center of the wing to its outside margin) plus another half diameter (to traverse the opposite side of the wing). D is thus roughly equal to one diameter of the wing. To minimize short-circuiting, twice the ratio between diameter and wavelength must be greater than one. Using the measure ka, where a is the wing radius, this is equivalent to requiring that ka > π/2 ≈ 1.57. Other geometries of the dipole will give slightly different values, but the general result is the same.
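The derivation above reduces to a quick calculation. Assuming the wing geometry described in the text (D equal to one wing diameter) and the speed of sound in air, the sketch below computes the minimum frequency at which short-circuiting is minimized for a hypothetical wing of 1 cm radius:

```python
import math

def dipole_cutoff_frequency(wing_radius_m, c=343.0):
    """Minimum frequency for which lambda < 2*D, with D ~ one wing
    diameter (2*a), so the condition becomes lambda < 4*a, i.e.
    ka > pi/2. Short-circuiting is minimized above this frequency."""
    return c / (4.0 * wing_radius_m)

# Hypothetical insect wing with a 1 cm radius:
a = 0.01
f_min = dipole_cutoff_frequency(a)
print(round(f_min))  # 8575 Hz

# Check the equivalent ka condition at the cutoff:
k = 2.0 * math.pi * f_min / 343.0
print(round(k * a, 3))  # 1.571 (= pi/2)
```

The inverse dependence on wing radius shows why small dipole radiators are pushed towards high frequencies: halving the radius doubles the cutoff frequency.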

When short-circuiting is combined with other sources of radiation inefficiency, it takes a higher value of ka to produce a given efficiency of sound radiation with a dipole than it would to achieve that same efficiency with a monopole. Put another way, the reduction in sound pressure amplitude when the wavelength being generated is much larger than the sound source is much higher for dipoles than for monopoles: monopoles are more efficient sound radiators than dipoles when ka < 1. At the same time, the fraction of the force exerted on the medium that goes into mass flow is much higher for dipoles than for monopoles at any given ka < 1. When ka > 3, a dipole produces twice the sound pressure that a similarly sized monopole would produce at the same frequency. This is because the two sides of the dipole effectively act as individual monopole sound sources once the short-circuiting is minimized.

Figure 2: Change in impedances (relative to the characteristic impedance of the medium) and efficiency of a dipole sound source as a function of ka. Solid red line shows resistive impedance of medium surrounding dipole incorporating effects of short-circuiting. Solid blue line shows reactive impedance, which is similar to that of monopole but somewhat frequency dependent for ka > 1. Note larger gap between reactive and resistive impedances for dipole when compared to the monopole in Fig. 1 for ka < 1, and higher final values for ka just >1. (After Fletcher 1992.)

Putting the frequency dependences together

We now want to combine the effects of increasing the potential sound pressure as ka increases with the efficiency costs when ka < 1. All other factors being equal, sound pressure from a monopole increases with the square of ka when ka < 1, and as a linear function of ka when ka > 1. For a dipole, the exponents all increase by one: for ka < 1, pressure increases with ka to the third power, and for ka > 2, it increases with the square of ka. For quadrupoles, just add one to each exponent again.
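These rules of thumb can be encoded as a small lookup. This is a loose sketch that treats each crossover as occurring at ka ≈ 1, although as noted above the dipole transition is closer to ka ≈ 2:

```python
def pressure_scaling_exponent(ka, source_order=0):
    """Approximate exponent n such that radiated pressure scales as
    (ka)^n, per the rule of thumb in the text: a monopole (order 0)
    goes as (ka)^2 below the crossover and (ka)^1 above it; each
    higher source order (dipole = 1, quadrupole = 2) adds one to
    both exponents. The crossover is loosely placed at ka = 1."""
    base = 2 if ka < 1 else 1
    return base + source_order

print(pressure_scaling_exponent(0.5, 0))  # 2: monopole, small ka
print(pressure_scaling_exponent(5.0, 0))  # 1: monopole, large ka
print(pressure_scaling_exponent(0.5, 1))  # 3: dipole, small ka
print(pressure_scaling_exponent(5.0, 2))  # 3: quadrupole, large ka
```

The steepening of the exponent with source order in the small-ka regime is why a small dipole or quadrupole pays an even greater pressure penalty than a small monopole.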

Directionality, type of sound source, and frequency

A spherical monopole radiates equally in all directions: in the absence of any nearby boundaries, sound pressures at all points equidistant from the monopole should be equal. A dipole has a much more complex sound field. All points equidistant from the two ends of the dipole axis will exhibit little if any sound pressure because of negative interference by waves from the two ends that are out-of-phase and of roughly equal amplitude. On the other hand, sound pressures will be maximal as one moves away from the dipole along the axis of its motion. The resulting sound field can be described as having two large lobes within which are all points with sound pressures above some minimal value. Each lobe is anchored at one end of the dipole and has its long axis parallel to the axis of the dipole. At low values of ka, the lobes are wide and nearly circular. As ka is increased, the lobes become much narrower and additional lobes at other angles relative to the dipole may appear. Quadrupoles also generate sound fields with lobes but usually begin with more than two lobes at low ka, and add more and narrower lobes as ka is increased. Fletcher (1992) provides illustrations showing lobe patterns at various ka values for different types of sound radiators.

Implications for animal sound production

While the pressure wave generated by a sound source falls off with the reciprocal of distance traveled, the mass flow around the sound source decreases with the reciprocal of the square of the distance. In practical terms, detectable mass flow is limited to about one wavelength from the sound source, whereas the pressure wave will be detectable much further away. Animals trying to communicate at distances many times the size of their own bodies must therefore rely on detecting the pressure waves and not the mass flow. As we have seen, this imposes serious constraints because sound pressures will be produced very inefficiently unless the wavelength is the same size or smaller than the sound-producing organ. The smaller the animal, the smaller any sound producing organs will be, and this limits small animals communicating over significant distances to using high frequencies. If the animals need to produce sounds with ka < 1 for other reasons, a monopole design would be better than a dipole or quadrupole design.

Animals that use sound for close-range communication could use either the pressure wave or the mass flow to detect and identify the signal. If they use the latter, lower frequencies would give greater range, and a dipole or quadrupole would be a better choice of sound source than would a monopole.

Further reading

Bennet-Clark, H. C. 1971. Acoustics of insect song. Nature 234: 255–259.

Bennet-Clark, H. C. 1995. Insect sound production: transduction mechanisms and impedance matching. In Biological Fluid Dynamics (C. P. Ellington and T. J. Pedley, eds.), pp. 199–218. Cambridge, UK: Company of Biologists.

Bennet-Clark, H. C. 1998. Size and scale effects as constraints in insect sound communication. Philosophical Transactions of the Royal Society of London, Series B 353: 407–419.

Fletcher, N. H. 1992. Acoustic Systems In Biology. New York: Oxford University Press.

Fletcher, N. H. and T. D. Rossing. 1991. The Physics of Musical Instruments. New York: Springer-Verlag.

Kalmijn, A. J. 1988. Hydrodynamic and acoustic field detection. In Sensory Biology of Aquatic Animals (J. Atema, R. R. Fay, A. N. Popper, and W. N. Tavolga, eds.), pp. 83–130. New York: Springer-Verlag.

Michelsen, A. 1983. Biophysical basis of sound communication. In Bioacoustics: A Comparative Approach (B. Lewis, ed.), pp. 3–38. New York: Academic Press.