Chapter 23

Creating Online Audio

by Dick Oliver and John J. Kottler


Sounds have the power to completely alter the perceived message of the visuals they accompany. If you need to create your own source audio, there are a number of things you should keep in mind. Working with audio can be tricky, and it is made even more confounding by the limits placed on Web content so that it can be viewed or heard reasonably over slow connections. The following quick introduction to digital audio should help you better understand how to add sound to your Web pages.

Working with Digital Audio

Just as with moving pictures, the quality and realism of a digital audio file depend largely on its speed. More precisely, they depend on how many samples, or frames, are taken each second. In digital audio, this figure is known as the sampling rate and is generally expressed in kilohertz (KHz). A digital audio device that records or plays back 11,000 samples per second (a fairly poor rate, but adequate for speech and simple multimedia sound) is said to operate at 11 KHz. If the sampling rate during recording or conversion is too low, high-pitched detail is lost and the result sounds muffled and distorted.

If sampling rate can be likened to the frames of a movie, sample resolution might represent the number of colors available for use on the screen. Whereas the sampling rate indicates how often a sample is taken, the sample resolution determines the degree of detail with which each sample is recorded; that is, the number of bits dedicated to each sample.

Take, for instance, the ruler that you use to make measurements. Typically you will find inches on the ruler as well as halves, quarters, eighths, and sixteenths of an inch. If you measured the length of a pencil using inches only, you wouldn't get a very accurate measurement. However, if you measured that same pencil using sixteenths of an inch, you would get a much more accurate measurement. It is the same with sampling resolution; the finer the measurements you make (using more bits per sample), the more accurate the sound recording will be.
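The ruler analogy can be tried out in a few lines of Python. The following is only a rough sketch: the function names, the 440 Hz test tone, and the 8 KHz rate are assumptions chosen for illustration, and samples are treated as values between -1.0 and 1.0. It rounds each sample to the nearest level available at a given bit depth and reports the largest rounding error.

```python
import math

def quantize(value, bits):
    """Round a value in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)            # spacing between adjacent levels
    index = round((value + 1.0) / step)  # nearest level, counted from the bottom
    return -1.0 + index * step

def max_error(bits, rate=8000, freq=440.0, seconds=0.01):
    """Largest rounding error when sampling a sine tone at the given bit depth."""
    count = int(rate * seconds)
    worst = 0.0
    for n in range(count):
        exact = math.sin(2 * math.pi * freq * n / rate)
        worst = max(worst, abs(exact - quantize(exact, bits)))
    return worst

# The 16-bit error comes out far smaller than the 8-bit error.
print(max_error(8), max_error(16))
```

Like the sixteenths on the ruler, the extra bits leave less room for each measurement to be off.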

Quick Lesson on Digital Audio
A sound wave is much like a water wave; it contains highs and lows, crests and troughs. A sound wave can be recorded by measuring its highs and lows (amplitude) over time. In order to capture sound digitally, the continuous flow of time must be broken into a finite number of points; the number of points taken each second is referred to as the sampling frequency or rate. The amplitude of the sound wave at each point in time may then be measured and stored as a digital code. There are additional compression techniques, in which the differences between actual points on a sound wave and data points predicted mathematically by the compression algorithm are stored. By storing only these differences and not every data point, audio information can be stored using less space.
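To make the idea of storing differences concrete, here is a minimal sketch that assumes the simplest possible prediction: each sample is predicted to equal the one before it. The function names and sample values are invented for illustration; real schemes use smarter predictors and also round the differences to save more space.

```python
def delta_encode(samples):
    """Store the first sample, then only each sample's difference from the previous one."""
    if not samples:
        return []
    deltas = [samples[0]]
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous)
    return deltas

def delta_decode(deltas):
    """Rebuild the original samples by adding the differences back up."""
    samples = []
    total = 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples

wave_points = [100, 104, 109, 113, 114, 112, 107]
encoded = delta_encode(wave_points)
print(encoded)  # [100, 4, 5, 4, 1, -2, -5]
```

Because neighboring points on a smooth wave are close together, the differences are small numbers that need fewer bits than the points themselves.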

Sample size is also referred to as bit depth. The bit depth of an audio file indicates the number of bits that are grouped together in order to represent the "shape" of each sample. The more bits you use to record something, the more detail and subtlety are caught by your recording. The sample sizes of digital audio files are almost always either 8-bit (as in most multimedia files) or 16-bit (as in CD audio).

Sampling rate and sample size determine both the subjective quality of the sound and the size of the sound file. For Internet audio, you will almost always want to use the lowest sample rate (8 KHz) and sample size (8 bits) available. A CD-quality, 44 KHz sample may sound terrific, but just five seconds of monophonic speech at this quality (16-bit samples) takes up over 400 K. Even at the full rated speed of a 14.4 Kbps modem, about 1.8 K per second, this little track would take around four minutes to download!

Tip
Here's a general rule of thumb to help you quickly make a ballpark estimate of how big an 8-bit sound sample is. Each second of most music or sound effects produces a file size (in kilobytes) approximately equal to the sample rate (in KiloHertz). For example, a one-second sound sampled at 8 KHz is about 8 K. Ten seconds at 16 KHz is about 160 K.
For uncompressed sound, this estimate does not depend on what you record; a second of silence takes up as much space as a second of music. Double the estimated file sizes if you use 16-bit samples instead of 8-bit. This is a general rule for monophonic sound. If you plan on using stereo sound, the size of the file is double what it would be for mono, because digital stereo recordings require that both the left and right audio channels be stored.
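The rule of thumb is easy to check with a little arithmetic. The sketch below uses made-up function names and assumes uncompressed sample data with no file header, plus a modem that achieves its full rated speed (real transfers are usually slower).

```python
def sound_size_bytes(seconds, rate_hz, bits=8, channels=1):
    """Size of uncompressed sample data, ignoring any file header."""
    return int(seconds * rate_hz * (bits // 8) * channels)

def download_seconds(size_bytes, modem_bps=14400):
    """Best-case transfer time, assuming the modem's full rated speed."""
    return size_bytes * 8 / modem_bps

size = sound_size_bytes(10, 16000)   # ten seconds, 8-bit mono at 16 KHz
print(size, "bytes")                 # 160000 bytes, about 160 K
print(round(download_seconds(size)), "seconds to download at best")
```

Passing `bits=16` or `channels=2` doubles the result each time, just as the tip describes.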

Audio File Formats

In the early days of multimedia computers, most machines (such as Apple, IBM, Tandy, and IBM clones) possessed their own, native audio file formats. Moving audio from one platform to another, especially without loss of sound quality, was a difficult procedure. Soon enough, audio card manufacturers began to develop audio file formats of their own, each with its own specific strengths and weaknesses. As you might expect, this confused the issue even further.

However, in the intervening years, the audio side of multimedia development has stabilized considerably, and a number of cross-platform solutions have found their way into popular acceptance. Today, those working with digital audio enjoy a somewhat more stable and standardized set of file formats and protocols than their counterparts in video. It's also fairly easy to locate software utilities that enable you to record, edit, or convert audio files from one format to another. Sometimes, however, converting a file noticeably degrades the quality of the sound.

The most common file formats for digital audio on the Web are .aiff, .au, .mid, .snd, and .wav. Table 23.1 provides an overview of the strong and weak points of all these common audio formats.

Table 23.1. The five most common audio formats on the Internet.

Audio Interchange File Format (.aif, .aiff): The audio file format of choice in the Mac world
Media type: audio/x-aiff
Native platform: Apple, SGI
Sample size: Typically 16-bit (variable from 1-bit to 32-bit)
Sampling rate: 16 KHz
File size per minute: 960 K
Sound quality: Very good; close to FM radio

Sun Audio (.au): The native UNIX audio file format
Media type: audio/basic
Native platform: NeXT, Sun
Sample size: 8-bit
Sampling rate: 8 KHz
File size per minute: 480 K
Sound quality: Fairly good; television/telephone quality

Musical Instrument Digital Interface (.mid, .midi): Protocol for exchanging music notation electronically (known as MIDI)
Media type: audio/midi, audio/x-midi
Native platform: N/A
Sample size: N/A
Sampling rate: N/A
File size per minute: 50 K
Sound quality: Unpredictable; depends on listener's sound card

Macintosh Sound Files (.snd): Nearly identical to .au files in performance features
Media type: audio/basic
Native platform: Mac, NeXT, Tandy
Sample size: 8-bit
Sampling rate: 8 KHz
File size per minute: 480 K
Sound quality: Fairly good; television/telephone quality

Waveform Audio (.wav): Microsoft Windows native audio format
Media type: audio/x-wav
Native platform: PC
Sample size: 8-bit or 16-bit
Sampling rate: 8 to 22 KHz
File size per minute: 480 to 1024 K
Sound quality: Good to excellent; AM radio to CD quality

What Is MIDI?
MIDI (Musical Instrument Digital Interface) files are not waveform files and therefore operate differently from any of the other file types listed in Table 23.1. Created and maintained by the MIDI Manufacturers Association, a coalition of musical instrument manufacturers and electronics firms, MIDI is a protocol that allows information to be exchanged between synthesizers, tone boxes, computers, and other compatible devices. MIDI files don't contain any sounds at all, in the sense that a digital sample such as an .aiff or .wav file can be said to contain a sound.
Instead, a MIDI file consists of a stream of numbers, each of which corresponds to a specific audio event or attribute, such as voice, pitch, volume, sustain, decay, bend, and so on. Like the screen layout of an HTML document that may be viewed through various browsers, the audible result of a MIDI file's execution is informed, but not dictated, by its content. For example, some sound cards use extremely high-quality digital samples of a real piano, while others synthesize a "piano" sound that resembles a baby duck as much as a baby grand.
Furthermore, some MIDI devices may have a tuba or trombone sound loaded into the channel usually used for piano, so your New Age keyboard improvisation may sound like a brass band tuning up before a parade. Though MIDI sound cards are getting better every year and do generally follow the standard instrument mappings, it can still be difficult to predict exactly how a MIDI file will sound when played on someone else's machine.
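To give a feel for the "stream of numbers," the sketch below builds the three bytes of a MIDI note-on message and a matching note-off message. The function names are invented for illustration, and the example covers only these two channel messages; a complete MIDI file also carries headers and timing information that are not shown here.

```python
def note_on(channel, note, velocity):
    """Build the three bytes of a MIDI note-on message.

    channel is 0-15; note and velocity are 0-127 (note 60 is middle C).
    """
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Build a note-off message (status 0x80) with a release velocity of 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("value out of MIDI range")
    return bytes([0x80 | channel, note, 0])

message = note_on(0, 60, 100)   # middle C, fairly loud, on the first channel
print(message.hex())            # 903c64
```

Notice that nothing in these bytes says what the note should sound like; that decision is left to whatever device plays it back.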

Recording and Mixing Internet Audio

Now that you have some familiarity with how digital audio is stored, let's look at how to create some new audio files of your own. To create and edit audio content, you must use an appropriate tool. GoldWave is a program included on the CD-ROM with this book. This software enables you to record, edit, convert, and play digital audio on your Windows 3.x or Windows 95 computer. Figure 23.1 shows GoldWave in action.

Figure 23.1 : GoldWave is an audio-editing program you can use to record, play, convert, and apply numerous special effects to sound files.

After you start GoldWave, you will notice that the program produces two windows. The first window is the Editor window, where graphical representations of sound waves are drawn. In the lower-right corner of your screen, a second Device Controls window presents you with some familiar buttons for controlling the playback or recording process. To begin recording, you must first create a new audio file using the File | New menu option. Choosing this option presents you with a dialog box, prompting you to choose the qualities of the sound you are creating.

Tip
When creating Internet sound files, remember that the smaller a file is, the better. Unfortunately, as stated earlier, this requires a compromise with the quality of the sound. To minimize file size, consider creating your new sound files as 8-bit, 11 KHz, monophonic (not stereo) files. This will add more noise to your sounds, but will significantly decrease the amount of storage space required for the sound file.

After you have created the new sound file, a new window appears in GoldWave's Editor window. To record into this window, simply click on the red record button in the Device Controls window. The record button then changes into a purple stop button, and a vertical beam slowly progresses across the length of the new window you created. This beam is the position bar, which indicates how far along in the sound file you have recorded. When you click on the stop button, or when the position bar reaches the end of the file, the recording process stops. Whatever audio you have recorded is then converted to a graphical representation and displayed in the Editor window.

Tip
Remember to check the audio settings and connections for your sound card prior to recording. You should make certain that the source you are recording is connected appropriately to the sound card. If you are using a microphone, make sure it is connected to the MIC input. If you are recording from a device such as an external CD player or stereo, make certain that these devices are connected to the LINE input.
Most sound hardware in computers is accompanied by software for controlling the sound card. Usually, this software includes mixer software for controlling the input and output levels of the sound card. Before recording, make sure that these settings are correct so that the correct input device is captured.

You can play, pause, rewind, and forward the contents of the sound file in the Editor window at any time by clicking on the appropriate buttons in the Device Controls window. You will use the play button often to audition the sound that you are creating.

Trimming the Fat

Typically, after you have captured the sound file, you need to alter it in some fashion. Most likely, you will need to trim the sound file that you have just recorded. Unless you are extremely coordinated, you often must start the recording process on the computer before actually capturing the audio input. This is necessary to avoid missing the introduction of the audio clip you are attempting to record. However, depending on the length of the pause between clicking on the record button in GoldWave and beginning the audio clip, you may have a considerable amount of silence at the beginning of your audio file. Silence before or after an audio clip is annoying and wastes valuable space. Therefore, you should immediately trim the audio file you have captured.

In GoldWave, you may specify a region of the audio file by using the mouse. As you pass the mouse cursor over an audio file in GoldWave, the cursor changes into an icon of two arrows pointing toward a vertical bar between them. Place this icon at the front of the sound file, where the squiggly line that represents the sound first appears, and click the left mouse button.

You can then move the mouse cursor to where all of the squiggly lines stop in your audio file and click the right mouse button. After doing so, a range is defined that includes only the sound itself, not the silence before or after the sound. The selected range is indicated by a blue highlight in the window for the sound you are editing.

After you have selected just the sound from a sound file, choose the Edit | Trim command. GoldWave discards all information that is not within the region you defined.
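If you ever need to trim many files, the same idea can be automated. The following sketch is not how GoldWave works internally; it simply assumes unsigned 8-bit samples (where 128 means silence) held in a Python list, and a hypothetical threshold below which a sample counts as silent.

```python
def trim_silence(samples, threshold=2, center=128):
    """Return samples with quiet leading and trailing stretches removed.

    Assumes unsigned 8-bit samples, where the value 128 represents silence.
    """
    loud = [i for i, s in enumerate(samples) if abs(s - center) > threshold]
    if not loud:
        return []                        # nothing but silence
    return samples[loud[0]:loud[-1] + 1]

recording = [128, 128, 129, 140, 90, 128, 160, 127, 128, 128]
print(trim_silence(recording))  # [140, 90, 128, 160]
```

Quiet samples in the middle of the sound are kept; only the silence before the first loud sample and after the last one is discarded.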

Tip
You may wish to save your sound files periodically, using different file names for each version you create. As you experiment with multiple effects and edit the sound wave, you may wish to undo your changes. Although GoldWave supports the Undo command, it will undo only your most recent change. If you apply three different effects and wish to undo them all, the only choice you have is to start over or load a previous copy of the sound file.

Adding Audio Effects

Ranges that you define in GoldWave are important for more than simply trimming the sound file. You can think of the highlighted section of the sound file as being like several selected words in a word processing program. Any action you perform applies only to the selected region. Therefore, you can copy a section of the sound and paste it somewhere else later in the sound file, just as you can with text in a word processor. You may also apply a variety of special effects to the selected region of a sound.

Pump Up the Volume

No matter how hard you try, certain sounds that you record are going to be too quiet. You can tell this because the graphical representation of the sound does not occupy the entire height of the window in which it is drawn. To maximize the amplitude of the sound wave to fill the height of the window (or normalize the sound wave), open the Effects | Volume menu. Also included in this menu are volume commands for fading in, fading out, changing the overall volume of the sound, and creating custom volume controls over time.
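Normalizing is simple to express in code. This sketch assumes signed 16-bit samples in a Python list and uses an invented function name; it illustrates the general idea rather than GoldWave's own routine.

```python
def normalize(samples, peak=32767):
    """Scale signed samples so the loudest one just reaches the given peak."""
    loudest = max((abs(s) for s in samples), default=0)
    if loudest == 0:
        return list(samples)             # pure silence: nothing to scale
    gain = peak / loudest
    return [int(round(s * gain)) for s in samples]

quiet = [0, 1000, -2000, 500]
louder = normalize(quiet)
print(max(abs(s) for s in louder))  # 32767
```

Every sample is multiplied by the same gain, so the shape of the wave is unchanged; it is simply stretched to fill the available height.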

Mix Well

Another common function of sound-editing software such as GoldWave is the Mix command. This command enables you to take two entirely different sound files and mix them into one sound file. The result is a single sound file that plays both of the original sounds, at the same time. For instance, let's assume you would like to mix two sound files included with Windows: chimes.wav and chord.wav. To mix these two files together, first open the two files in GoldWave. Then select the chord.wav window, set the range for the sound file to include the entire sound file, and choose the Edit | Copy command from the menu. With the chord sound in the Clipboard buffer, choose the other window (chimes.wav) and then Edit | Mix from the menu. A dialog box appears, asking which volume to use when mixing the files together. A value of 100 indicates that both sounds will be the same volume when played together. Values higher than 100 instruct GoldWave to mix the sound file that is currently in the Clipboard so that it is louder than the original file. Likewise, lower values indicate that the content in the Clipboard should be quieter than the original sound file. The Mix command uses the contents of the Clipboard to mix with a sound file. After you have performed the mix, the Clipboard data will remain in the Clipboard for you to paste when desired, although it is not necessary to use the Paste command with the mix effect.
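Mixing amounts to adding two sounds together, sample by sample. The sketch below is an assumption about how such a mix might be computed, not a description of GoldWave's actual method: the clip is scaled by a volume percentage, added to the original, and the result is kept within the 16-bit range. The short lists stand in for the two sound files.

```python
def mix(original, clip, volume=100, limit=32767):
    """Add clip (scaled by a volume percentage) onto original, sample by sample."""
    scale = volume / 100.0
    length = max(len(original), len(clip))
    mixed = []
    for i in range(length):
        a = original[i] if i < len(original) else 0
        b = clip[i] * scale if i < len(clip) else 0
        total = int(round(a + b))
        mixed.append(max(-limit - 1, min(limit, total)))  # stay in 16-bit range
    return mixed

chimes = [1000, 2000, 3000]
chord = [500, -500, 500, 500]
print(mix(chimes, chord, volume=50))  # [1250, 1750, 3250, 250]
```

With a volume of 50, the copied sound is added at half strength, so it plays more quietly than the sound it is mixed into.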

Echo, Flange, and Doppler Effects

GoldWave supports numerous effects that cannot be covered in depth in this section of the book. However, the "Skiing" video example we create later in this book mixes the sound of a person yelling "Help!" into the video. Two digital audio files are included on the CD-ROM with this book: help.wav and help2.wav. The first is a simple recording of someone screaming "Help!" The second file is the same audio recording of the word "Help," but after some special effects have been applied. After playing each file, you should be able to quickly distinguish the difference.

The second audio file (help2.wav) applies the echo, flange, and Doppler effects to the original sound. The echo effect, as its name implies, adds an artificial echo to the sound. The flange effect, which can make your voice sound like a mechanical robot, was then applied to make the voice file sound more like two people. The Doppler effect was used to simulate the fall-off in the voice, as if the skier were moving away down the hill. Figure 23.2 shows the Doppler effect with a curve that falls off toward the end. This curve is then applied to the entire region selected in the sound file.

Figure 23.2 : You can create your own curves for the Doppler effect in GoldWave. In this example, the pitch of the sound drops quickly toward the end of the sound.
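Of the three effects, echo is the easiest to picture in code: a delayed, quieter copy of the sound is mixed back into the original. The sketch below is a bare-bones illustration with invented names and a single echo; it is not GoldWave's implementation, which offers more controls.

```python
def add_echo(samples, delay, decay=0.5, limit=32767):
    """Mix one delayed, quieter copy of the sound back into itself.

    delay is measured in samples; decay is the echo's relative loudness.
    """
    out = list(samples) + [0] * delay    # leave room for the echo's tail
    for i, s in enumerate(samples):
        echoed = out[i + delay] + int(round(s * decay))
        out[i + delay] = max(-limit - 1, min(limit, echoed))
    return out

print(add_echo([1000, 0, 0, 0], delay=2))  # [1000, 0, 500, 0, 0, 0]
```

At a sampling rate of 11 KHz, a delay of about 2,750 samples would place the echo roughly a quarter of a second after the original sound.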

Note
The Doppler effect is an effect that can be heard in everyday life. Take, for instance, a train that blares its horn as it passes you. The train's horn uses only one pitch. But as the train approaches, the horn sounds higher in pitch than it really is. As it passes by and moves away, the pitch drops. You can also hear this effect when an emergency vehicle with a siren passes by or someone drives by with a loud stereo.
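The size of the pitch change can be estimated with the standard Doppler formula for a moving source and a stationary listener. The numbers below (a 440 Hz horn, a train moving at 30 meters per second, sound traveling at 343 meters per second) are assumptions chosen only for illustration.

```python
SPEED_OF_SOUND = 343.0   # meters per second in air at roughly 20 degrees Celsius

def heard_pitch(source_hz, source_speed, approaching=True):
    """Pitch heard by a stationary listener as a source moves directly toward or away."""
    if approaching:
        return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - source_speed)
    return source_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND + source_speed)

horn = 440.0
print(round(heard_pitch(horn, 30.0, approaching=True)))    # about 482
print(round(heard_pitch(horn, 30.0, approaching=False)))   # about 405
```

The drop from the higher pitch to the lower one as the train goes by is the effect that GoldWave's Doppler curve imitates.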

Compressing and Saving Audio Files

After you have perfected the sound, you need to save it. As with almost anything else on the Internet, saving the sound is not quite as simple as it first appears. First, you must determine which file format is best for storing the audio file. If you plan on using this file with the greatest number of computers from around the world on the Internet, you may consider using the Sun/NeXT (.au) format or Mac/SGI (.aif) format. Netscape provides an audio player that can easily play sound files of either type on any computer platform. In any case, when you have decided on the format that best matches your needs, you must save your work using the File | Save option in GoldWave. A dialog box then appears, prompting you to specify the path and file name for the sound, as well as to determine what file format to use (see Figure 23.3).

Figure 23.3 : GoldWave enables you to save (and convert between) a wide variety of sound file formats and to choose attributes such as bit depth and compression type.

Note
The WAV (.wav) file format contains many additional file attributes that may not be found in the other audio file formats. One important attribute is MSADPCM. Using this attribute will effectively halve the size of the overall sound file. ADPCM (Adaptive Differential Pulse Code Modulation) is a compression technique that can save a considerable amount of space (and therefore time transferring the file across the Internet). As sound players that support the .wav format become available on non-Windows computers, you may decide to use this format for storing your audio content. The GoldWave audio editor included on the CD-ROM enables you to save your files using any of these compression techniques.
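If you are curious about what goes into a saved .wav file, the sketch below writes a short test tone as an 8-bit, monophonic, 11 KHz file using Python's standard wave module. The function name and the tone itself are invented for illustration, and the file is written uncompressed; this module does not apply ADPCM compression.

```python
import io
import math
import wave

def write_tone(target, seconds=1.0, rate=11025, freq=440.0):
    """Write an 8-bit, mono, uncompressed WAV tone to a file name or file object."""
    count = int(seconds * rate)
    # 8-bit WAV samples are unsigned, with 128 as the silent midpoint.
    data = bytes(
        128 + int(round(100 * math.sin(2 * math.pi * freq * n / rate)))
        for n in range(count)
    )
    with wave.open(target, "wb") as out:
        out.setnchannels(1)      # monophonic
        out.setsampwidth(1)      # 1 byte = 8 bits per sample
        out.setframerate(rate)
        out.writeframes(data)
    return count

buffer = io.BytesIO()            # an in-memory stand-in for a file on disk
frames = write_tone(buffer, seconds=0.5)
print(frames, "samples written")
```

Passing a file name such as "tone.wav" instead of the in-memory buffer would save the tone to disk.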

Audio Conversion

No matter what platform or operating system you use, sooner or later you'll come across an audio file that you'd love to have a copy of, except for one little problem: it's encoded in a file format you're unfamiliar with, or one for which you have no player. When this happens, you will need to use a conversion utility to translate the file into a format that your player software can understand.

In essence, all conversion utilities do exactly the same thing: They effectively perform a Save As… or Export command on the designated source file, creating a replica of the original data "as heard through the ears" of the target format. Because the various file formats all have different features and performance statistics, it helps to play around with them for a while, tweaking and comparing them until you get a feel for each format's particular strengths and weaknesses. Your trusty conversion utilities are your best tools for this sort of experimentation. You can convert between most common formats within GoldWave (and most other sound-editing programs) simply by opening the file you have and saving it in the format you want.

When you select a file format for output (that is, the target format), what you're really doing is telling the converter which algorithms and ranges or boundaries you want it to use in the upcoming translation process. The conversion utility then reads the file from end to end, applying a predetermined set of algorithms to the data in the file (multiplying, dividing, averaging, truncating, and rounding to the nearest equivalent), and finally saves the output of these calculations in the style of the target format.
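Two of the simplest calculations a converter might perform are shown in the sketch below: reducing 16-bit samples to 8-bit by truncation, and halving the sampling rate by averaging neighboring samples. These are deliberately crude stand-ins with invented names; real converters generally filter the sound more carefully to limit the loss of quality.

```python
def to_8_bit(samples_16):
    """Convert signed 16-bit samples (-32768..32767) to unsigned 8-bit (0..255)."""
    return [(s + 32768) >> 8 for s in samples_16]

def halve_rate(samples):
    """Halve the sampling rate by averaging each pair of neighboring samples."""
    return [(samples[i] + samples[i + 1]) // 2 for i in range(0, len(samples) - 1, 2)]

source = [-32768, -1, 0, 255, 256, 32767]
print(to_8_bit(source))              # [0, 127, 128, 128, 129, 255]
print(halve_rate([10, 20, 30, 50]))  # [15, 40]
```

Notice that 0 and 255 both become 128 in the 8-bit version. Detail discarded this way cannot be recovered by converting back, which is why a converted file rarely sounds as good as the original.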

Note
Conversion utilities can vary greatly in the amount of control they give to their users, as well as in the quality of their output. Generally speaking, they will produce translations that approach, but do not quite equal, the quality of the original. Of course, depending on the formats involved, the conversion software used, and the contents of the original (source) file, the differences between the two versions may sometimes turn out to be more drastic than you had expected. And in any case, the result file rarely (if ever) comes out sounding better than the original, unless you have some sort of rare and wondrous talent!
For this reason, it is a good idea to view or play a converted file immediately after performing the conversion. If the result file is unsatisfactory, be sure to delete it right away; this will protect you from the possibility of confusing the two files later on.

Audio Helper Apps and Plug-Ins

Helper apps are programs that are external to the browser, but that are called into action when the browser receives files of their designated media type. They are alternately known as players or viewers.

Plug-ins differ from helper apps in that they insert their associated media files inline, that is, directly within the body of the rendered HTML document. Unlike audio helper apps, audio plug-ins can generally be set up to begin playing immediately upon page load; the user doesn't even have to click on anything. The following is a sample of audio helper applications and plug-ins that are available:

Many of the newer helper apps and plug-ins utilize a technology known as streaming. They receive and use audio data so that they can begin playing the file before the entire download is completed. In the past, in order to hear an audio file, you first were required to download the audio file in its entirety from the Internet. Depending on the size of the audio file, this could take some time. Streaming audio (and video) enables the helper application or plug-in to download the beginning of the file. This portion is then interpreted by the plug-in or helper application, and the audio (or video) starts playing the little data that is available. As that content is playing, additional information is downloaded from the Internet in the background. This gives the appearance of much quicker connectivity to information-rich content, such as video and audio. Streaming technology is made possible by efficient data compression and the power of today's multithreading and multitasking operating systems.
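A quick calculation shows why compression matters so much for streaming: the sound can play without interruption only if the data arrives at least as fast as it is used up. The sketch below uses invented function names and a hypothetical compression ratio of 5 to 1, and it ignores connection overhead.

```python
def audio_bits_per_second(rate_hz, bits=8, channels=1):
    """Bits of uncompressed audio consumed for every second of playback."""
    return rate_hz * bits * channels

def can_stream(rate_hz, bits, connection_bps, compression_ratio=1.0, channels=1):
    """True if the (optionally compressed) audio arrives at least as fast as it plays."""
    needed = audio_bits_per_second(rate_hz, bits, channels) / compression_ratio
    return needed <= connection_bps

print(audio_bits_per_second(8000))        # 64000
print(can_stream(8000, 8, 14400))         # False
print(can_stream(8000, 8, 14400, 5.0))    # True
```

Even the lowest-quality uncompressed sound needs 64,000 bits per second, far more than a 14.4 Kbps modem can deliver, so a streaming player has to rely on compression to keep up.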

One nice feature of streaming audio and video is that it requires no special action from the user. The user simply clicks on a link to play the audio clip, and the appropriate helper application or plug-in is invoked automatically (assuming that the program is available on the user's computer). In most cases, the only way in which streaming audio is differentiated from standard audio files is by conscious efforts made by Web-page developers to place text or graphics indicating that streaming audio is available on a page.

It is important to recognize that not all streaming technologies are created equal. All are bound to have strong points and weak points (and in these early days of Web-based multimedia, there are still more weak points than the average TV viewer or radio listener would care to put up with). Each does a better job at compressing certain types of sound, depending on the volume, pitch, and tone of the sound, and the algorithm used to represent that sound digitally.

Note
TSPlayer is unique among audio helper apps in the previous listing. It does not play any audio files on its own. Rather, it passes audio data (in .wav format) through itself to the actual audio helper app (which may be any application that can play .wav files). This produces an effect similar to streaming, in that the audio file begins playing before the download is completed.

Summary

This chapter explained the basics of creating your own digital audio. Chapter 24, "Working with Video," initiates you into the art of motion-video production for the Internet.