Instructor: Dr Ho Chee Kong
Stereophonics and its effects on music
Gan Tiaw Leong, USAR02 ["Music And Technology"], University Scholars Programme, National University of Singapore.
Introduction
In general, stereophonics refers to the technology of creating sounds that have a three-dimensional feel. According to Eargle, it is specifically a form of audio reproduction that records, transmits and reproduces the original sound over two channels, regardless of the number of loudspeakers used. (44) In this paper, I will examine certain aspects of hearing, acoustics and stereophonic technology, and use them to illustrate the effects of stereophonics on listening to and creating music.
An understanding of how human hearing works and of the science of acoustics is fundamental to learning about stereophonics. Everest quotes Richard Heyser: "Stereo is merely an attempt to create the illusion of reality through the willing suspension of disbelief." (1) In simple terms, this technology takes what we know about hearing and acoustics and applies that knowledge to the goal of tricking the brain.
Hearing
I will not touch on the details of the physics of sound or the physical structure of the ear. Both Eargle and Everest provide in-depth descriptions of these topics. For our purposes, all we need to know is that sound propagates as waves, that these waves vibrate the eardrums, and that the brain interprets the resulting auditory signals to create the phenomenon of heard sound. The role that the brain plays is very important to hearing.
One of the functions of hearing is our ability to determine the general location of the sounds we hear. This ability is known as localization. The human auditory system is capable of perceiving sounds arriving from all directions and identifying the direction from which each sound propagates. According to Everest, this surround or stereo effect relies on certain cues that our ear-brain system can interpret. He illustrates this with a recording array of two microphones spaced apart:
"The trumpet is closer to the right microphone than the left, so the trumpet sound is of higher intensity in the right channel. Also, because it is closer, the sound of the trumpet reaches the right microphone a few thousandths of a second earlier than the left. When listened to separately, the right and left channel sounds are so similar that little, if any, difference could be detected." (14)
The ear-brain system "knows" about and in fact expects these differences when interpreting sounds. Since the actual location of a sound source causes the intensity and time differences between our ears, the brain, on detecting such differences, places the sound source in three-dimensional space accordingly.
The brain needs to rely on both types of cues because intensity effects are not consistent across frequencies. Everest explains, "At higher frequencies, the different intensities perceived at the two ears account for most of our ability to differentiate stereo-directionality. This is not true however, at lower frequencies where the intensity of sound at the two ears is almost the same." (23) The lack of intensity difference at low frequencies is due to the way propagating sound waves behave when they meet an obstacle.
The behaviour of sound waves in a free field (no environmental reverberation) when there is an obstacle in the path of radiation depends on the wavelength, and hence the frequency, of the sound. According to Eargle, an object that is small relative to the radiated wavelength is virtually invisible to the propagation of sound. Sound waves are said to diffract, or bend, around the obstacle and continue as before. As the wavelength gets shorter relative to the object, more of the sound is reradiated by reflection from the obstacle and a shadow zone is progressively created behind the obstacle. (14)
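The dependence on wavelength can be made concrete with a rough calculation. The sketch below is only an illustration: the speed of sound (343 m/s), the 0.2 m obstacle and the function names are my own assumptions, not figures taken from Eargle.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def wavelength_m(frequency_hz):
    """Wavelength of a sound wave in air, in metres."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def size_to_wavelength_ratio(obstacle_size_m, frequency_hz):
    """Obstacle size divided by wavelength.

    Well below 1, the wave mostly diffracts around the obstacle;
    well above 1, the obstacle casts a noticeable shadow.
    """
    return obstacle_size_m / wavelength_m(frequency_hz)


if __name__ == "__main__":
    obstacle_m = 0.2  # hypothetical obstacle size, chosen for illustration
    for frequency_hz in (100, 1000, 10000):
        ratio = size_to_wavelength_ratio(obstacle_m, frequency_hz)
        print(f"{frequency_hz} Hz: wavelength "
              f"{wavelength_m(frequency_hz):.3f} m, size/wavelength {ratio:.2f}")
```

At 100 Hz the wavelength is several metres, so an obstacle of this size is far smaller than the wave; at 10 kHz the wavelength is only a few centimetres and the same obstacle is several wavelengths across.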
This effect also occurs around the head, which is an obstacle to sounds reaching the ears. The head provides significant shadowing of high frequencies, but at low frequencies the sound pressure at the two ears will not be significantly different. Eargle states that "in the low frequency region, phase or time relationships at the two ears are significant in making lateral localization judgments." (42)
However, there is still the problem of identifying the vertical direction of the sound source, as two vertically displaced sound sources will have the same intensity and time differences. This is where the pinnae, or outer ears, are important, since they reflect incoming sound in ways that depend on the angle of the source. According to Everest, pinnae reflections and head diffractions affect the perception of height by emphasizing or depressing certain frequency ranges of the incoming sound. (52)
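To give a feel for the size of the time cue, the sketch below estimates the arrival-time difference between the two ears with a deliberately simple model: a distant source, two ears treated as points 0.18 m apart, and no diffraction around the head. The model, the ear spacing, the speed of sound and the function names are all my own assumptions and are not taken from Eargle or Everest.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature
EAR_SPACING_M = 0.18        # assumed distance between the ears


def interaural_time_difference_s(azimuth_deg, ear_spacing_m=EAR_SPACING_M):
    """Extra travel time to the far ear for a distant source.

    azimuth_deg is measured from straight ahead (0) towards one side (90).
    Uses the simple path-difference model d * sin(angle) / c.
    """
    path_difference_m = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S


def interaural_phase_difference_rad(frequency_hz, azimuth_deg):
    """Phase lag at the far ear for a pure tone of the given frequency."""
    return 2.0 * math.pi * frequency_hz * interaural_time_difference_s(azimuth_deg)


if __name__ == "__main__":
    for azimuth_deg in (0, 30, 90):
        itd_ms = 1000.0 * interaural_time_difference_s(azimuth_deg)
        print(f"azimuth {azimuth_deg} degrees: time difference {itd_ms:.3f} ms")
```

Under these assumptions the time difference is zero for a source straight ahead and grows to roughly half a millisecond for a source directly to one side.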
How does the ability to recreate stereo effects affect music? Recreating localization effects in recordings or electronic music performance opens up a whole realm of possibilities for musicians. For example, a musician can create spatial effects in a recording that would be difficult, if not impossible, to produce live. Imagine a musical performance where the listener can hear the instruments shift around and interact as the music plays. This effect would probably give the piece a greater sense of motion and interaction, but instruments like the piano or the drums cannot easily be moved about in live performances. For electronic music, the benefits are obvious. The major complaints about electronically generated music are its lack of realism, its flatness and its "clean tones". What are really missing are the spatial qualities that our brain expects when we hear sounds. Stereophonics can be used to inject some of the spatial relationships that should be present between instruments.
Acoustics
Sound fields are an important concept in acoustics and they refer to the environment in which sound waves propagate. The types of sound fields in which sounds exist directly affect how we hear them. For example, the distance to acoustic power relationship differs when a listener moves from the near field to the far field. Openness of the environment has great impact on the propagation of sounds. In the indoor environment, reflective surfaces can drastically change the sounds produced by sound radiators. For example, when a plane wave front hits an uneven surface, the reflected wave fronts are displaced and the resulting sound is distorted.
Enclosed environments have another effect on sounds. The reverberation effect comes from the repeated reflections of a sound by the walls. Everest explains, "Sound travels from a source to a human ear over a direct path. Reflections of the same sound could arrive at the ear over numerous reflected paths. Each reflection is delayed by the amount of time needed to travel the distance between the source and the listener's ear. Sometimes the reflections are helpful, but other times they are disastrous." (45) The effect of reverberation tends to manifest as a sense of spaciousness to the listener. The sense of spaciousness imparted by large music halls comes from the lateral reflections of sounds off the walls.
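The delay Everest describes follows directly from path length. The sketch below computes how much later a reflection arrives than the direct sound; the speed of sound, the path lengths and the function name are assumptions chosen purely for illustration.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def reflection_delay_s(direct_path_m, reflected_path_m):
    """Time by which a reflection lags the direct sound, in seconds."""
    if reflected_path_m < direct_path_m:
        raise ValueError("a reflected path cannot be shorter than the direct path")
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    # Hypothetical hall: listener 10 m from the source, with one wall
    # reflection travelling 17 m in total.
    delay_ms = 1000.0 * reflection_delay_s(10.0, 17.0)
    print(f"reflection arrives {delay_ms:.1f} ms after the direct sound")
```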
As mentioned before, obstacles in the path of sound radiation can create a shadow effect for high frequencies. Since musical instruments are themselves obstacles to omnidirectional sound radiation, different musical instruments have very different directional properties for sound radiation depending on the frequency range and size of the instruments.
Therefore, sound engineers need to take note of the characteristics of the sound field, the effects of reverberation in the environment and the directional differences between musical instruments in order to capture and reproduce musical performances accurately. With this knowledge of acoustics, both musicians and listeners can improve the musical experience by modifying the performance or listening environment to obtain the best results.
Stereophonic recording
Now we explore the art of stereophonic recording and its effects on music. The first demonstration of stereophonic sound transmission was the simple set-up of a two-microphone system with two corresponding loudspeakers. Eargle describes it in detail:
"The spacing of the two omnidirectional microphones was roughly the same as that of the ears, and the padded baffle between them provided shadowing just as the head does. By matrixing and equalizing the microphones at low frequencies, he synthesized a pair of side-facing directional microphones.
On playback over a pair of loudspeakers, accurate stereo perspectives were presented. Those sources of sound to the left of the microphone array were heard predominantly from the left loudspeaker, and the same held for the right. And for those listeners equidistant from the loudspeakers, sources of sound located towards the middle of the microphone array were heard along the median plane.
The sound sources originating from positions where there are no loudspeakers are called phantom images. Note that a real sound source located directly in front of the listener will produce equal phasors at both ears, and this is interpreted as a source directly ahead. A source displaced laterally will cause different phasors at the ears, with a leading angle at the nearer ear and a slightly louder signal at that ear." (45)
This simple system is successful because of the way the microphones are set up. As mentioned above, the recording array bears resemblance to a head and ears. Therefore, the two microphones more or less pick up the same vibrations that the ears would have in the same position. Those vibrations are then transmitted and played back on the loudspeakers. The brain is tricked into localizing sound in between the two loudspeakers due to the intensity and timing cues captured. Eargle explains this virtual localization or precedence effect:
"If two loudspeakers produce the same signal at the same time, a listener equidistant from them will localize the apparent source of sound between the two. If one of the loudspeakers is delayed with respect to the other, then the listener will immediately localize at the earlier of the two loudspeakers. For delays up to about 5 ms, a 10-dB increase in the level of the delayed loudspeaker may be accommodated, with localization still tending towards the earlier loudspeaker.
Beyond about 5 ms, a constant 10-dB level differential can be accommodated, and this range extends to about 25 or 30 ms." (42)
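The trade-off Eargle describes between delay and level can be explored by building a two-channel test signal in which one channel is both delayed and made louder. The sketch below does only that; it does not model perception, and the function name and parameters are my own assumptions.

```python
def delayed_stereo_pair(mono, sample_rate_hz, delay_ms, gain_db):
    """Return (left, right) sample lists from a mono signal.

    The left channel is the original signal. The right channel is the
    same signal delayed by delay_ms and scaled by gain_db decibels.
    Both channels are padded with silence to the same length.
    """
    delay_samples = round(sample_rate_hz * delay_ms / 1000.0)
    gain = 10.0 ** (gain_db / 20.0)  # decibels to linear amplitude
    left = list(mono) + [0.0] * delay_samples
    right = [0.0] * delay_samples + [gain * sample for sample in mono]
    return left, right


if __name__ == "__main__":
    click = [1.0, 0.5, 0.25]  # a short test signal
    left, right = delayed_stereo_pair(click, sample_rate_hz=44100,
                                      delay_ms=5.0, gain_db=6.0)
    print(len(left), len(right))  # both channels have the same length
```

Played over two loudspeakers, a pair like this lets a listener test how much extra level the delayed channel needs before the image stops being drawn to the earlier one.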
The ability of the precedence effect to trick the brain into creating phantom images of sound sources is quite useful. In sound reinforcement, for example, loudspeakers located under balconies are normally delayed so that their sound arrives at the same time as (or slightly later than) that from the front loudspeaker. In that way, natural localization is preserved for those listeners. (43) Another useful application of the precedence effect is in echo suppression. Eargle explains:
"We have an undesirable echo from the back of a large room caused by a concave surface. Since the echo is some 60 ms behind the sound arriving from the front, it will be noticed as such and will be disturbing to listeners. Now, if a directional loudspeaker is placed over the echo interference area and is delayed at the listeners 30 ms with respect to sound arriving acoustically from the front, then we will have masked the echo from the back of the room." (43)
What has happened is that sound from the front of the room masks the sound from the overhead loudspeaker, because the delay is just within the allowable range of 30 ms. In the same manner, the sound from the overhead loudspeaker masks the echo from the rear.
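The arithmetic of this example can be checked with a very crude model in which each arrival is masked by the one before it whenever the gap between them is no more than 30 ms. This is my own simplification of the explanation above, not a model given by Eargle, and the function name is hypothetical.

```python
def chain_is_masked(arrival_times_ms, window_ms=30.0):
    """True if every arrival follows the previous one within window_ms."""
    ordered = sorted(arrival_times_ms)
    return all(later - earlier <= window_ms
               for earlier, later in zip(ordered, ordered[1:]))


if __name__ == "__main__":
    # Front sound at 0 ms and the echo at 60 ms: the gap is too large.
    print(chain_is_masked([0.0, 60.0]))
    # Adding the overhead loudspeaker at 30 ms bridges the gap.
    print(chain_is_masked([0.0, 30.0, 60.0]))
```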
Using stereophonic recording arrays similar to the one described above, sound engineers and musicians can record live performances with high accuracy, preserving not only the sounds of the singers and instruments but also the acoustic effects of the environment itself. The whole performance is virtually brought into the listener's home. However, Eargle also notes that stereophonics is a limited medium. Two channels cannot fully capture the whole sound environment, especially over loudspeaker playback. In such cases, sound engineers need to employ pseudostereo techniques to make up for the imperfections in the recording. (245) Through sound reinforcement and echo suppression, the quality of live performances in large venues can be improved as well.
Binaural audio
Normal hearing through the two ears is termed binaural. Binaural audio takes the idea behind the microphone array in the earlier stereophonic example a step further: to achieve even greater realism in sound recording, the whole head is modelled. As mentioned before, the pinnae and the head have distinguishable effects on hearing, so a recording will never be as good as the real experience unless those effects are captured as well. Eargle describes binaural audio as the aural analog of the old parlour stereopticon, in which each eye saw its own picture, the two having been taken by slightly separated cameras. If extreme care is taken in modelling the artificial head, then binaural listening over headphones can reproduce many of the fore-aft and height localization effects that we take for granted in normal listening. (42) Sunier describes the binaural experience:
"The binaural experience places the listener sonically where the sounds on the recording or broadcast originated, and requires no special equipment of any sort other than the binaural source and a pair of stereo headphones. The listener experiences sounds quite accurately localized in a complete 360-degree sphere- a true virtual audio environment." (1)
How does it create this virtual audio experience? It does it via two tiny omnidirectional mikes placed at the entrance of the ear canals on a dummy head. The two signals are kept entirely separate all the way from this artificial head mike system to the corresponding left and right drivers of the headphones worn by listeners. In this way, the ears of the listener are brought into the position of the ears of the dummy head.
Even though there are similarities between stereophonic and binaural recording, they are essentially different. Stereophonic recordings contain two channels and are produced for loudspeaker reproduction, with no restriction on the number of loudspeakers. Binaural recordings are designed specifically for headphone reproduction. They thus require two channels fed by microphones spaced about seven inches apart.
Sunier relates an example of the kinds of experience that binaural recording can provide:
"Instead of sitting out in the front row of the audience to tape an early music ensemble, one recordist set up his dummy head with microphones in a chair right in the middle of the group onstage - creating an effect as though the listener is one of the musicians performing! The surrounding spatiality adds great interest to the music. Another recordist taped his taking an elevator, walking into the concert hall and settling in his seat at the beginning of a concert and then the reverse at the end to make it a more complete binaural experience for listeners." (1)
Binaural audio has the potential to revolutionize the way people listen to music. Binaural recordings of important musical events would allow an audience of unlimited size to relive the events years later. Binaural recordists might also become musicians in their own right, creating unique sound performances from life. For example, a famous singer could record himself singing, and listeners could then imagine themselves to be the ones singing the lovely song.
Pseudostereo
This is the part of stereophonics that deals with creating virtual sound fields. Unlike stereophonic recording, which seeks to capture the natural soundscape, pseudostereo reverses the process of recording. Working with monaural recordings, electronically created works or musicians playing in an acoustically undesirable sound field, sound engineers produce stereophonic output through sound field synthesis. Sound field synthesis takes what we know about acoustics and uses digital technology to create a virtual sound field that has the characteristics of natural sound fields. Eargle explains in detail:
"This category of enhancement covers a variety of approaches, all making use of digital delay and reverberation devices. The general approach is to sample the sound field in the area over the performers and feed the stereophonic signals to a network of delays and reverberation simulation. Discrete delays can simulate early reflections, and the overall reverberation time setting is modeled after some larger acoustical space. Through careful adjustment of all parameters, a convincing impression of large room size can be created." (25)
However, according to Eargle, certain musical forms, such as symphony orchestras, are not suited to sound field synthesis. (255) This is because the acoustics involved in such performances are very complex and sometimes impossible to model accurately.
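The role of discrete delays in sound field synthesis can be sketched in a few lines. The example below mixes delayed, attenuated copies of a dry signal back into it to imitate early reflections. It leaves out the reverberation simulation entirely, and the function name, delay times and gains are my own assumptions rather than settings described by Eargle.

```python
def add_early_reflections(dry, sample_rate_hz, reflections):
    """Mix delayed, scaled copies of a signal into the original.

    reflections is a list of (delay_ms, gain) pairs, one per simulated
    early reflection. The output is longer than the input by the
    longest delay.
    """
    offsets = [round(sample_rate_hz * delay_ms / 1000.0)
               for delay_ms, _ in reflections]
    wet = list(dry) + [0.0] * max(offsets, default=0)
    for offset, (_, gain) in zip(offsets, reflections):
        for index, sample in enumerate(dry):
            wet[offset + index] += gain * sample
    return wet


if __name__ == "__main__":
    impulse = [1.0]  # a single click makes the reflections easy to see
    # Hypothetical reflections at 12 ms and 23 ms, each quieter than the last.
    output = add_early_reflections(impulse, 44100, [(12.0, 0.5), (23.0, 0.3)])
    print(len(output))
```

Lengthening the delays in such a network corresponds to reflections that have travelled further, which is one way a small room can be made to suggest a larger one.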
Conclusion
Stereophonics is the result of applying technology to the problem of accurately capturing the fullness of natural sounds. In the process, we have learned a great deal about the nature of sound and hearing. Modern digital audio capabilities are prime examples of a wonderful marriage between music and technology. These capabilities now go beyond the mere recording and reproduction of naturally occurring sounds.
Synthesizers have opened up the vocabulary of music to virtually all possible sounds, but the resulting compositions usually sound flat. Using digital audio techniques such as time delays and reverberation units, we can endow electronically produced works with spatial qualities. Virtual reality environments can be made more believable with such immersion techniques. Stereophonic technology is a step towards enhancing the listening experience, and it forces us to rethink the differences and the relationship between live performances and recordings. Music and the way we think of music have been and will be changed by technologies like stereophonics.
References
- Eargle, John. Music, sound and technology. New York: Van Nostrand Reinhold, 1990.
- Everest, F. Alton. The new stereo soundbook. Blue Ridge Summit, PA: TAB Books, 1992.
- Sunier, John. Binaural in depth. Jun. 1999. 1 Apr. 2001.