Digital audio involves the creation, reproduction and transmission of sound stored in a digital format. Digital audio includes CDs as well as any sound files stored on a computer or transmitted over digital networks.
Physically, sound consists of vibrations of different frequencies and amplitudes. We hear differences in vibration frequency as the pitch of the sound, while changes in amplitude are recognized as its loudness (or volume). In a typical visual display of a sound wave, time is plotted on the X-axis and amplitude on the Y-axis, so frequency appears as the number of cycles per unit of time.
For a good in-depth technical tutorial on digital audio, see the one by Chris Dobrian (University of California, Irvine).
Digital audio represents these sound waves as a stream of binary data that contains both frequency and amplitude information. Just as the quality of an image depends on the amount of data describing it (number of pixels, color depth, etc.), so too does the quality of digital sound. The two key data elements determining the quality of a digital audio recording are the sampling rate and the bit-depth.
The sampling rate is the frequency at which 'snapshots' of the original waveform are taken (how many times per unit of time the amplitude is measured). Digital audio bit-depth is analogous to digital image resolution, previously discussed. Bit-depth is the precision with which each amplitude measurement is recorded (how much data represents each sample).
Higher sampling rates and higher bit-depths give greater fidelity. For example, CD-quality digital audio samples at the rate of 44,100 times per second (44.1 kHz), and uses 16 bits (2^16 or 65,536 divisions of amplitude) to store each sample point.
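The two parameters can be illustrated with a short Python sketch. It is only an illustration under simple assumptions: the test tone is a pure 440 Hz sine wave, samples are stored as signed integers, and the function name `sample_and_quantize` is hypothetical rather than part of any audio library.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, duration_s):
    """Sample a sine wave and round each sample to a signed integer level."""
    max_level = 2 ** (bit_depth - 1) - 1          # 32767 for 16 bits, 127 for 8 bits
    n_samples = round(sample_rate_hz * duration_s)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # time of this 'snapshot'
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # between -1.0 and 1.0
        samples.append(round(amplitude * max_level))      # quantize to an integer
    return samples

levels_16 = 2 ** 16                               # divisions of amplitude at 16 bits
cd = sample_and_quantize(440, 44_100, 16, 0.01)      # CD-quality settings
coarse = sample_and_quantize(440, 11_025, 8, 0.01)   # telephone-quality settings
print(levels_16, len(cd), len(coarse))
```

The same hundredth of a second of sound is described by four times as many samples at 44.1 kHz as at 11.025 kHz, and each 16-bit sample can take far more distinct values than an 8-bit one.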
However, higher sampling rates and bit-depths also require proportionately more bandwidth (for transmission) and more disk space (for storage). The table below provides three common examples of digital audio quality and their needed transfer rates and storage space. For example, one hour of CD-quality audio requires about 600 MB of storage space (60 min/hr x 10 MB/min). Since a typical CD has a capacity of about 650 MB, it can store just over an hour of high-quality audio. For CD-quality audio recording and processing on a computer hard disk, several gigabytes (1 GB is roughly 1,000 MB) of storage space are needed.
| Relative Sound Quality | Sampling Rate | Bit Depth | Transfer Rate | Storage per Minute |
| --- | --- | --- | --- | --- |
| Audio CD (stereo) | 44.1 kHz | 16 bit | 172 KB/sec | 10.1 MB |
| AM Radio (mono) | 22.05 kHz | 8 bit | 22 KB/sec | 1.26 MB |
| Telephone (mono) | 11.025 kHz | 8 bit | 11 KB/sec | 630 KB |
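The transfer rates and storage figures in the table follow from multiplying sampling rate, bytes per sample, and number of channels. The sketch below works through that arithmetic; it assumes uncompressed audio and binary units (1 KB = 1,024 bytes, 1 MB = 1,024 KB), and the function name `audio_data_rate` is a hypothetical helper.

```python
def audio_data_rate(sample_rate_hz, bit_depth, channels):
    """Return (bytes per second, bytes per minute) for uncompressed audio."""
    bytes_per_second = sample_rate_hz * (bit_depth // 8) * channels
    return bytes_per_second, bytes_per_second * 60

KB = 1024          # assumption: binary kilobytes
MB = 1024 * 1024   # assumption: binary megabytes

formats = {
    "Audio CD (stereo)": (44_100, 16, 2),
    "AM Radio (mono)": (22_050, 8, 1),
    "Telephone (mono)": (11_025, 8, 1),
}

for name, (rate, bits, channels) in formats.items():
    per_sec, per_min = audio_data_rate(rate, bits, channels)
    print(f"{name}: {per_sec / KB:.0f} KB/sec, {per_min / MB:.2f} MB/min")
```

For the audio CD row, 44,100 samples x 2 bytes x 2 channels gives 176,400 bytes per second, which is about 172 KB/sec and roughly 10 MB per minute.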
Because of their large size, uncompressed CD-quality digital audio files are often impractical to store or transmit. For efficient storage and transmission, these files are compressed. There are two ways to compress audio files: standard file compression and audio file compression. Standard file compression (such as ZIP technology) simply tries to minimize the space occupied by redundant data in the file, and the original data can be restored exactly. Audio compression (also called psychoacoustic compression) is a much more sophisticated approach that decreases file size by carefully analyzing the frequency spectrum and (1) removing sounds that fall outside the range of human hearing, and (2) removing sounds that are covered up (masked) by louder sounds. Using these techniques, audio compression can reduce file size by over 90% without any major decrease in the perceived quality of sound.
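The first approach, standard file compression, can be tried with Python's built-in `zlib` module, which uses the same family of algorithm as ZIP. This is only a sketch of redundancy-based compression, not of psychoacoustic compression; the two test signals (pure silence and random noise) are assumptions chosen to show the extremes.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller means better compression)."""
    return len(zlib.compress(data, 9)) / len(data)

n = 44_100                                   # one second of 8-bit mono samples
silence = bytes([128]) * n                   # highly redundant: one value repeated
rng = random.Random(0)                       # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(n))   # almost no redundancy

print(f"silence: {compression_ratio(silence):.3f}")
print(f"noise:   {compression_ratio(noise):.3f}")

# Standard compression is lossless: decompressing restores every byte.
restored = zlib.decompress(zlib.compress(noise, 9))
```

Silence shrinks to a tiny fraction of its size, while the noise barely shrinks at all. Real recordings contain little exact repetition, which is why standard file compression alone tends to save far less space on audio than psychoacoustic compression does.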
There are two basic types of audio files: the traditional discrete audio file, which you can save to a hard drive or other digital storage medium, and the streaming audio file, which you listen to as it downloads in real time from a network/Internet server to your computer.
Common discrete audio file formats include *.WAV, *.AIF, *.AU and *.MP3. A fifth format, called MIDI, is actually not a file format for storing digital audio, but a system of instructions for creating electronic music.
WAV
The *.WAV (pronounced "wave") format is the standard audio file format for Microsoft Windows applications, and is the default file type produced when conducting digital recording within Windows. It supports a variety of bit resolutions, sample rates, and channels of audio. This format is very popular on IBM PC (clone) platforms, and is widely used as a basic format for saving and modifying digital audio data.
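As an illustration of how the bit resolution, sample rate, and channel count are stored with a WAV file, the sketch below uses Python's standard `wave` module to build a short 16-bit mono tone in memory. The tone frequency, duration, and the function name `make_wav_bytes` are assumptions made for the example.

```python
import io
import math
import struct
import wave

def make_wav_bytes(freq_hz=440, sample_rate_hz=44_100, duration_s=0.1):
    """Build a 16-bit mono WAV file in memory containing a sine tone."""
    n_frames = round(sample_rate_hz * duration_s)
    frames = bytearray()
    for n in range(n_frames):
        value = round(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        frames += struct.pack("<h", value)   # little-endian signed 16-bit sample
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)                  # mono
        wav.setsampwidth(2)                  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()

data = make_wav_bytes()
with wave.open(io.BytesIO(data), "rb") as wav:
    params = (wav.getnchannels(), wav.getsampwidth(),
              wav.getframerate(), wav.getnframes())
print(params)
```

Reading the file back returns the same channel count, sample width, and sample rate that were written, because the WAV header records them alongside the audio data.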
AIF/AIFF
The Audio Interchange File Format (AIFF) is the standard audio format employed by computers using the Apple Macintosh operating system. Like the *.WAV format, it supports a variety of bit resolutions, sample rates, and channels of audio and is widely used in software programs used to create and modify digital audio.
AU
The *.AU file format (au for audio) is an audio file format developed by Sun Microsystems and popular in the Unix world. It is also the standard audio file format for the Java programming language. In its most common form it stores compressed 8-bit samples (µ-law encoding), which cannot provide CD-quality sound, although the format also defines higher bit-depth encodings.
MP3
MP3 stands for "MPEG Audio Layer 3," where MPEG is the Moving Picture Experts Group. MP3 files (*.mp3) provide near-CD-quality sound but are only about 1/10th as large as the same audio stored in uncompressed CD format. Because MP3 files are small, they can easily be transferred across the Internet and played on any multimedia computer with MP3 player software.
MIDI/MID
MIDI (Musical Instrument Digital Interface), pronounced "mid-dy," is not a file format for storing or transmitting recorded sounds, but rather a set of instructions used to play electronic music on devices such as synthesizers (somewhat like a musical score). The MIDI specification includes hardware (the electronic devices and their interfaces) and software (the rules for encoding sound information). With MIDI, a musician can use a keyboard to simulate the sounds of many different instruments, and even add special effects. MIDI files are very small compared to recorded audio file formats. However, the quality and range of MIDI tones are limited.
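To see why MIDI data is so compact, consider the messages a keyboard sends when a key is pressed and released. The sketch below builds the standard 3-byte Note On and Note Off channel messages; it leaves out the timing information and file headers that a complete MIDI file also carries, and the helper names are hypothetical.

```python
def note_on(note, velocity=64, channel=0):
    """Return the 3-byte Note On message: status byte, note number, velocity."""
    return bytes([0x90 | channel, note, velocity])

def note_off(note, channel=0):
    """Return the 3-byte Note Off message for the same note."""
    return bytes([0x80 | channel, note, 0])

def note_frequency_hz(note):
    """Equal-tempered pitch of a MIDI note number, with A4 (note 69) at 440 Hz."""
    return 440.0 * 2 ** ((note - 69) / 12)

melody = [60, 62, 64, 65, 67]                # C4, D4, E4, F4, G4
messages = b"".join(note_on(n) + note_off(n) for n in melody)
print(len(messages), "bytes describe", len(melody), "notes")
```

Each note costs a handful of bytes no matter how long it sounds, whereas recorded CD-quality audio needs about 172 KB for every second. The trade-off is that the actual sound depends entirely on the device that plays the instructions.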
Streaming is a network technique for transferring data from one computer (a server) to another (the receiver or client) in a format that can be continuously read and processed by the client computer. Using this method, the client computer can start displaying the initial elements of large time-based audio or video files before the entire file has been downloaded. As the Internet grows, streaming technologies are becoming an increasingly important way to deliver time-based audio and video data.
For streaming to work, the client side has to receive the data and continuously feed it to the player application. If the client receives the data more quickly than required, it has to temporarily store or buffer the excess for later play. On the other hand, if the data doesn't arrive quickly enough, the audio or video presentation will be interrupted.
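The buffering behavior described above can be sketched as a small simulation. The model is deliberately simplified and rests on assumptions: data arrives and is played in whole one-second steps, the units are arbitrary, and `simulate_playback` is a hypothetical function rather than part of any player software.

```python
def simulate_playback(arrivals, playback_rate):
    """Simulate a client buffer, one entry of `arrivals` per second.

    arrivals: units of data received in each second.
    playback_rate: units of data the player consumes per second.
    Returns (buffer level after each second, number of interrupted seconds).
    """
    buffered = 0
    levels = []
    interruptions = 0
    for received in arrivals:
        buffered += received                 # store whatever arrived
        if buffered >= playback_rate:
            buffered -= playback_rate        # enough data: play one second
        else:
            interruptions += 1               # not enough data: playback stalls
        levels.append(buffered)
    return levels, interruptions

# Fast network at first, then a slow stretch, then fast again.
levels, stalls = simulate_playback([30, 30, 10, 10, 10, 30], playback_rate=20)
print(levels, stalls)
```

During the first two seconds the client receives more than it needs and the surplus accumulates in the buffer. That surplus carries playback through part of the slow stretch, but once it runs out the presentation is interrupted until more data arrives.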
There are three primary streaming formats that support audio files: RealNetworks' RealAudio (*.RA and *.RM files); Microsoft's Advanced Streaming Format (*.ASF files) and its audio subset, Windows Media Audio 7 (*.WMA files); and Apple's QuickTime 4.0+ (*.MOV files).
RA/RM
For audio data on the Internet, the de facto standard is RealNetworks' RealAudio (*.RA) compressed streaming audio format. These files require a RealPlayer program or browser plug-in (available as a free download for both PCs and the Mac). The latest versions of RealNetworks' server and player software can handle multiple encodings of a single file, allowing the quality of transmission to vary with the available bandwidth. Webcast radio broadcasts of both talk and music frequently use RealAudio. Streaming audio can also be provided in conjunction with video as a combined RealMedia (*.RM) file.
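The idea of offering multiple encodings of one file can be sketched in a few lines. This is a generic illustration, not RealNetworks' actual selection logic: the bitrates listed and the function name `pick_encoding` are assumptions made for the example.

```python
def pick_encoding(available_kbps, encodings_kbps):
    """Return the highest-bitrate encoding that fits the available bandwidth.

    Falls back to the lowest-bitrate encoding when none fits.
    """
    fitting = [rate for rate in encodings_kbps if rate <= available_kbps]
    return max(fitting) if fitting else min(encodings_kbps)

encodings = [20, 44, 96]                     # hypothetical encodings, in kbit/sec
for bandwidth in (10, 56, 300):
    print(bandwidth, "->", pick_encoding(bandwidth, encodings))
```

A listener on a slow connection receives a lower-quality stream that can still play without interruption, while a listener with more bandwidth receives the higher-quality encoding of the same file.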
ASF
Microsoft's Advanced Streaming Format (*.ASF) is similar in design to RealNetworks' RealMedia format, in that it provides a common definition for Internet streaming media and can accommodate not only synchronized audio, but also video and other multimedia elements, all while supporting multiple bandwidths within a single media file. Also like RealNetworks' RealMedia format, Microsoft's ASF requires a program or browser plug-in (Windows Media Player, available as a free download for PCs using the Windows OS).
The pure audio file format used in Windows Media Technologies is Windows Media Audio 7 (*.WMA files). Like MP3 files, WMA audio files use sophisticated audio compression to reduce file size. Unlike MP3 files, however, WMA files can function as either discrete or streaming data and can provide a security mechanism to prevent unauthorized use.
MOV
Apple QuickTime movies (*.MOV files) can be created without a video channel and used as a sound-only format. Since version 4.0, QuickTime has provided true streaming capability. QuickTime also accepts different audio sample rates and bit depths, and offers full functionality in both Windows and the Mac OS.
© Craig L. Scanlan, 2001. Version 2.0 - January 2002. Original version January 2001.