Since the introduction of the compact disc in the early 1980s,
digital technology has become the standard for the recording
and storage of high-fidelity audio. It is not difficult to see
why. Digital signals are robust. Digital signals can be
transmitted and copied without distortion. Digital signals can
be played back without degrading the carrier. Who would want to
go back to scraping a needle along a vinyl groove now?
Another advantage of digital audio signals is the ease with
which they can be manipulated. Digital Signal Processing (DSP)
technology has advanced to such an extent that almost any audio
product, from a mobile phone to a professional mixing console,
contains a DSP chip. Once again the reasons for the success of
DSP are simple: stability, reliability, enhanced performance,
and programmability. Signal processing functions can be
implemented for a fraction of the cost, and in a fraction of
the space required by analog circuitry, as well as providing
functionality that simply couldn't be done in analog. In fact,
so ubiquitous has it now become that, for many people, the word
"digital" has become synonymous with "high quality".
The ever-increasing performance and falling cost of DSP
hardware have generated new applications and new markets for
digital audio in both the consumer and professional audio
sectors. Digital Versatile Disk (DVD) and digital surround
sound in the home, and digital radio and hands-free cellular
phones in the car, are just a few of the DSP-based technologies
which have appeared in the last few years. The demands on the
quality, speed, and flexibility of DSPs have also increased as
more functionality is added to DSP products: a DSP might now be
required for mixing, equalization, dynamic range compression,
and data decompression, all in one product, implemented on one
chip.
16-bit, 44.1 kHz PCM digital audio continues to be the
standard for high-quality audio in most current applications
such as CD, DAT, and high-quality PC audio. Recent
technological developments and improved knowledge of human
hearing, however, have created a demand for greater data word
lengths. Analog-to-digital converters (ADCs) now available
support 18, 20, and 24 bits and are capable of exceeding the
96 dB dynamic range available using 16-bit data words. Many
recording studios now routinely master their recordings using
20- or 24-bit recorders. These technological developments are
beginning to make their way into consumer and "prosumer"
audio applications. The most obvious consumer audio impact is
DVD, which is capable of carrying audio with up to 24-bit
resolution at sample rates well above 48 kHz. Another example
is a 16-channel digital home studio recorder, capable of
sampling at 96 kHz with 24-bit resolution. In
fact, three trends can be identified which have influenced the
current generation of digital audio formats that are set to
replace CD digital audio. These can be summarized as
follows:
 Higher resolution: either 20 or 24 bits per data word
 Higher sampling frequency: typically 96 kHz and 192
kHz
 More audio channels for a more realistic "3D" sound
experience.
Low-cost, higher-performance DSPs are now appearing on the
market to satisfy the high dynamic range requirements for
processing or synthesizing audio signals. How many bits are
required for processing audio signals? Is it 16, 20, 24, or 32
bits? Does the audio application require fixed-point or
floating-point arithmetic? What undesirable side effects of
quantization should the audio designer look out for?
The first section in this article briefly reviews desirable
characteristics of a DSP for use in audio applications, and
then discusses the differences in data formats for fixed- and
floating-point processors. Next, the relationship of dynamic
range to data word size in processing audio signals is
examined. This will aid in determining how many bits are
required for your application, whether it is a lower-cost,
low-fidelity consumer device or high-performance,
high-fidelity professional audio gear. Finally, for the design
of a system with either CD-quality or professional-quality
audio, it is suggested that for a digital filter routine to
operate transparently, the resolution of the processing system
must be considerably greater than that of the input signal. For
the highest-quality professional audio systems, a 32-bit DSP is
offered as a suggested solution.
What Are the Benefits of Using a DSP to Process Audio
Signals?
A digital signal processor has one purpose: to operate on
quantized signal data as quickly and efficiently as possible.
Compared to a typical CPU or microcontroller, a
well-architected DSP usually offers the following desirable
characteristics for performing real-time DSP computations on
audio signals:
 Fast and Flexible Arithmetic
Single-cycle computation for multiplication with
accumulation, arbitrary amounts of shifting, and standard
arithmetic and logical operations.
 Extended Dynamic Range for Extended Sum-of-Products
Calculations
Extended sums-of-products, common in DSP algorithms, are
supported in multiply-accumulate (MAC) units. Extended
precision in the multiplier's accumulator provides extra bits
for protection against overflow in successive additions to
ensure that no loss of data or range occurs.
 Single-Cycle Fetch of Two Operands for Sum-of-Products
Calculations
In extended sums-of-products calculations, two operands are
needed on each cycle to feed the calculation. The DSP should
be able to sustain two-operand data throughput, whether the
data is stored on-chip or off.
 Hardware Circular Buffer Support for Efficient Storage
and Retrieval of Samples
A large class of DSP algorithms, including digital filters,
requires circular data buffers. A circular buffer is a finite
segment of the DSP's memory, defined by the programmer, that
is used to store samples for processing. Hardware circular
buffering wraps the address pointer around to the beginning
of the buffer automatically, which simplifies circular buffer
implementations, reduces overhead, and improves performance.
When circular buffering is implemented in hardware, the DSP
programmer does not have to be concerned with the additional
overhead of testing and resetting the address pointer so that
it does not go beyond the boundary of the buffer.
 Efficient Looping and Branching for Repetitive DSP
Operations
DSP algorithms are repetitive and are most logically
expressed as loops. For digital filter routines, a running
sum of MAC operations is typically executed in fast and
efficient loop structures. A DSP's program sequencer, or
control unit, should allow looping of code with minimal or
zero overhead. Any loop branching, loop decrementing, and
termination test operations should be built into the DSP
control unit hardware. Also, no overhead penalties should
result for conditional branching instructions which branch
based on a computation unit's status bits.
All of the above architectural features are used for
implementation of DSP-type operations. For example, convolution
is a common signal processing operation involving the
multiplication of two sets of discrete data, an input
multiplied with a shifted version of the impulse response of a
system, while keeping a running sum of the products. This is
seen in the following convolution equation, written here in its
standard form for an N-tap filter with input x, impulse
response h, and output y:
y(n) = \sum_{k=0}^{N-1} h(k) x(n-k)
DSP architectural features are designed to perform these
types of discrete mathematical operations as quickly as
possible, usually within a single instruction cycle. Examining
this equation closely shows the elements required for
implementation. The filter coefficients and input samples
required to implement the above equation can be stored in two
memory arrays defined as circular buffers. Corresponding
entries of the two buffers need to be multiplied together and
added to the results of previous iterations. To perform the
operation shown above, the DSP architecture should allow one
multiplication to be executed, along with an addition to a
previous result, in a single instruction cycle. Within the same
cycle, the architecture should also contain enough parallelism
in the compute units to enable memory reads of the next sample
and filter coefficient for the next loop iteration. Hardware
looping circuitry included in the architecture would allow
efficient looping through the number of iterations with
zero overhead. When placed in a zero-overhead loop, digital
filter implementations become highly efficient, since no
explicit software decrement, test, and jump instructions are
required. Thus, for actual implementation of the convolution
operation, two circular buffers, multipliers, adders, and a
zero-overhead loop construct are required. A digital signal
processor contains the necessary building blocks to accomplish
implementation of discrete-time filter operations.
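As an illustration only, the multiply-accumulate loop and the circular sample buffer described above can be sketched in software. The class name `CircularFir` and the example coefficients are hypothetical; on a DSP the pointer wraparound and the loop control would be handled by hardware rather than by the explicit modulo and `for` loop shown here.

```python
class CircularFir:
    """FIR filter whose delay line is a software circular buffer (sketch)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0.0] * len(self.coeffs)  # delay line (circular buffer)
        self.head = 0                          # index of the newest sample

    def process(self, sample):
        n = len(self.coeffs)
        # Overwrite the oldest sample. The modulo is the pointer wraparound
        # that hardware circular buffering would perform automatically.
        self.head = (self.head - 1) % n
        self.delay[self.head] = sample
        acc = 0.0
        for k in range(n):                     # one multiply-accumulate per tap
            acc += self.coeffs[k] * self.delay[(self.head + k) % n]
        return acc


# Feeding a unit impulse returns the coefficients, i.e. the impulse response.
fir = CircularFir([0.5, 0.25, 0.125])
impulse_response = [fir.process(x) for x in [1.0, 0.0, 0.0, 0.0]]
```

Each call to `process` performs one pass of the sum of products for a single output sample, which is the work a DSP would place in a zero-overhead loop.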
In performing these types of repetitive DSP calculations,
quantization errors from truncation and rounding can accumulate
over time, degrading the quality of the DSP algorithmic result.
The number of bits of resolution used in the arithmetic
computations, along with a given filter structure realization,
will determine the robustness of a filter algorithm's signal
manipulation. The rest of this article will discuss how many
bits would potentially be required for a particular audio
application, as this is determined by the complexity of the
processing and the desired target signal quality.
DSP Numeric Data Formats: Do I Require Fixed- or Floating-
Point Arithmetic For My Audio Application?
Depending on the complexity of the application, the audio
system designer must decide on how much computational accuracy
and dynamic range will be needed. The most common native data
types are explained briefly in this section. 16- and 24-bit
fixed-point DSPs are designed to compute integer or fractional
arithmetic. 32-bit DSPs, such as the Analog Devices ADSP-2106x
SHARC family, were traditionally offered as floating-point
devices; however, this popular family of DSPs can equally
perform both floating-point arithmetic and integer or
fractional fixed-point arithmetic.
16-, 24-, and 32-Bit Fixed-Point Arithmetic
DSPs that can perform fixed-point operations typically use a
twos-complement binary notation for representing signals. The
fixed-point format can be signed (twos-complement) or
unsigned, in either integer or fractional notation.
Most DSP operations are optimized for signed fractional
notation.
Signed fractional notation makes sense for DSP computations
because a fractional value corresponds directly to a ratio of
the full-scale range of samples produced by an ADC (a 5 V ADC
is shown in Figure 1). It is harder to overflow a fractional
result, because multiplying a fraction by a fraction results in
a smaller number, which is then either truncated or rounded.
The highest full-scale positive fractional number would be
0.99999, while the full-scale negative number is -1.0. Any
value in between is a fraction of the "loudest" signal that the
converter can represent.
For example, the midway positive amplitude for a converter
would be 1/2, and this would be interpreted as a fractional
value of 0x4000 by a 16-bit DSP.
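A minimal sketch of the 16-bit signed fractional format described above, assuming the common 1.15 layout (one sign bit and 15 fraction bits). The helper names `float_to_q15` and `q15_to_float` are hypothetical and only serve to illustrate the mapping between fractions and bit patterns.

```python
def float_to_q15(x):
    """Convert a value in [-1.0, 1.0) to a 16-bit signed fractional word."""
    scaled = int(round(x * 32768))            # 2**15 steps per unit
    scaled = max(-32768, min(32767, scaled))  # saturate to the 16-bit range
    return scaled & 0xFFFF                    # twos-complement bit pattern


def q15_to_float(word):
    """Interpret a 16-bit twos-complement word as a signed fraction."""
    if word & 0x8000:                         # sign bit set: negative value
        word -= 0x10000
    return word / 32768


half = float_to_q15(0.5)            # 0x4000, the midway positive amplitude
most_negative = float_to_q15(-1.0)  # 0x8000, full-scale negative
largest = q15_to_float(0x7FFF)      # 32767/32768, just under +1.0
```

Note that +1.0 itself is not representable: the sketch saturates it to 0x7FFF, the largest positive code.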
Figure 1: Signed twos-complement representation of
sampled signals
Figure 2: Fractional and integer formats for an N-bit
number
In the fractional format, the binary point is assumed to be
to the right of the MSB (sign bit). In the integer
format, the binary point is to the right of the LSB (Figure
2).
Fractional math is more intuitive for signal manipulation,
and it is the least significant bits in a fractional result
that we will examine in this article, since it is these
lower order bits that can suffer from quantization errors due
to finite word length effects. The more bits that are used to
represent a given audio signal, the more accurate the
arithmetic result.
32/40-Bit Floating-Point Arithmetic
Floating-point math offers flexibility in programming because
it is much harder to overflow a result, and the programmer is
less concerned about scaling inputs to prevent overflow. IEEE
754/854 floating-point data is stored in a format that is 32
bits wide, where 24 bits represent the mantissa and 8 bits
represent the exponent. The 24-bit mantissa is used for
precision while the exponent is for extending the dynamic
range. For 40-bit extended precision, 32 bits are used for the
mantissa while 8 bits are used to represent the exponent
(Figures 3 and 4).
Figure 3: IEEE 754/854 32-bit single-precision
floating-point format
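A normalized 32-bit floating-point number is represented in
decimal as:
x = (-1)^{s} \times 1.f \times 2^{e-127}
where s is the sign bit, f is the 23-bit fraction, and e is the
8-bit biased exponent.
Its binary numeric IEEE format representation is stored on
the 32-bit floating-point DSP as:
sign s (bit 31) | exponent e (bits 30-23) | fraction f (bits 22-0)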
It is important to know that the IEEE standard always refers
to the mantissa in signed-magnitude format, and not in
twos-complement format. The extra hidden bit effectively
improves the precision to 24 bits and also ensures that the
mantissa of any normalized number lies in the range from 1
(1.0000....00) to just under 2 (1.1111....11), since the
hidden bit is always assumed to be a 1.
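The field layout and the hidden bit can be checked with a short sketch that uses Python's standard `struct` module to obtain the 32-bit single-precision pattern of a number. The function names are hypothetical, and the rebuild step assumes a normalized number (biased exponent between 1 and 254).

```python
import struct


def decode_float32(x):
    """Split a number into IEEE 754 single-precision sign, exponent, fraction."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF     # 8-bit biased exponent
    fraction = bits & 0x7FFFFF         # 23 stored fraction bits
    return sign, exponent, fraction


def rebuild_normalized(sign, exponent, fraction):
    """Recombine the fields of a normalized number (exponent 1 to 254)."""
    mantissa = 1 + fraction / 2**23    # restore the hidden leading 1
    return (-1) ** sign * mantissa * 2.0 ** (exponent - 127)


# -6.5 = -1.625 x 2^2, so the biased exponent is 129 and the fraction is 0.625.
fields = decode_float32(-6.5)          # (1, 129, 0x500000)
value = rebuild_normalized(*fields)    # -6.5
```

The mantissa restored in `rebuild_normalized` always falls between 1 and just under 2, which is the effect of the hidden bit described above.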
Figure 4: 40-bit extended-precision floating-point
format
Figure 4 shows the 40-bit extended-precision format
that is also supported on the ADSP-2106x family of
DSPs. With extended precision, the mantissa is extended to 32
bits. In all other respects, it is the same format as the IEEE
standard format. The 40-bit extended-precision binary numeric
format representation is stored as:
sign s (1 bit) | exponent e (8 bits) | fraction f (31 bits)
For audio processing, the dynamic range of floating point
may be unnecessary for some algorithms, but the flexibility of
programming in floating point is desirable, especially for
high-level programming languages
like C. Keep in mind that many of the fixed-point precision
issues discussed in later sections would still apply for a DSP
that supports floating-point arithmetic, at least in terms of
truncation and coefficient quantization. The programmer still
has to convert the fixed-point data coming from an ADC to its
floating-point representation, while the floating-point result
has to be converted back to its fixed-point equivalent when
the data is sent to a DAC.
Floating-point arithmetic was traditionally used for
applications that have very high dynamic range requirements,
such as image processing, graphics, and military/space
applications. The dynamic range offered by 32-bit IEEE
floating-point arithmetic is 1530 dB. In the past, price was
traded off against performance when deciding on the use of
floating-point processors. Until
recently, the higher cost made 32-bit floating-point DSPs
unreasonable for use in audio. Today, with the introduction of
lower-cost 32-bit processors, designers can achieve
high-quality audio using either 32-bit fixed- or floating-point
processing, at a cost comparable to 16-bit
and 24-bit DSPs.
The Relationship of Dynamic Range to Data Word Size in
Digital Audio
One of the top considerations when designing an audio system is
determining acceptable signal quality for the application.
Table 1 shows some comparisons of signal quality for some
audio applications, devices and equipment.
Audio Device/Application      Dynamic Range
AM Radio                      48 dB
Analog Broadcast TV           60 dB
FM Radio                      70 dB
Analog Cassette Player        73 dB
Video Camcorder               75 dB
ADI SoundPort Codecs          80 dB
16-bit Audio Converters       90 to 95 dB
Digital Broadcast TV          85 dB
MiniDisk Player               90 dB
CD Player                     92 to 96 dB
18-bit Audio Converters       104 dB
Digital Audio Tape (DAT)      110 dB
20-bit Audio Converters       110 dB
24-bit Audio Converters       110 to 120 dB
Analog Microphone             120 dB
Table 1: Some dynamic range comparisons


"Recent advancements within the past decade in human
hearing indicate the sensitivity of the human ear is
such that the dynamic range between the quietest
sound detectable and the maximum sound which can be
experienced without pain is approximately 120dB.
Further studies suggest there is critically important
audio information at frequencies up to 40 kHz and
possibly 80 kHz"



Audio equipment retailers and consumers often use the phrase
'CD-quality sound' when referring to high-dynamic-range audio.
Compare the sound quality of a CD player to that of an AM radio
broadcast. With higher-quality CD audio, noise is not audible,
especially during quiet passages in music, and lower-level
signals are heard clearly. The AM radio listener, however, can
easily hear the low-level noise, at levels high enough to be a
distraction. As an audio signal's dynamic range increases,
low-level audio signals can be distinguished more clearly,
while the noise floor is lowered
and becomes undetectable to the listener ("noise floor" is a
term used to describe the point where the audio signal cannot
be distinguished from low-level white noise).
To achieve CD-type signal quality, the trend in recent years
has been to design a system that processes audio signals
digitally, using 16-bit ADCs and DACs with signal-to-noise ratio
(SNR) and dynamic range around 90-93 dB. When processing these
signals, the programmer should normally design the algorithm
with a computation precision greater than the
16 bits of compact disc signals. CD-quality audio is just one
example. Whatever the application, the audio system
designer must first determine what is an acceptable SNR and
then decide how much precision is required to produce
acceptable results for the intended application.


The terms shown in Figure 5 are used as defined by Davis and
Jones; many of these terms are referred to frequently
throughout this article.



What Is The SNR and Dynamic Range for a DSP?
In analog and digital terms, SNR (S/N ratio) and dynamic range
are often used synonymously. In pure analog terms, SNR is defined
as the ratio of the largest known signal that exists to the
noise present when no signal exists. In digital terms, SNR and
dynamic range are used synonymously to describe the ratio
of the largest representable number to the quantization
error.
A well-designed digital filter should have an SNR that is
greater than the converter SNR. Thus, the DSP designer must be
sure that the noise floor of a filter is not higher than the
noise floor of the ADC or DAC.
Figure 5: Audio signal level (dBu) relationship
between dynamic range, SNR, and headroom



In "real-world" signal processing, quantization is the process
by which a number is approximated by a number of finite
precision. For example, during analog-to-digital conversion, an
infinitely variable signal voltage is represented by a binary
number with a fixed number of bits. The difference between two
consecutive binary values is called the quantization step, or
quantization level. The size of the quantization step defines
the effective noise floor of the quantized signal. The word
length for a given processor determines the number of
quantization levels that are available. For example, an n-bit
data word would yield 2^{n} quantization levels (some examples
for common data word widths are shown in Table 2).
Quantization levels for n-bit data words (N = 2^{n} levels)
2^{8} = 256
2^{16} = 65,536
2^{20} = 1,048,576
2^{24} = 16,777,216
2^{32} = 4,294,967,296
2^{64} = 18,446,744,073,709,551,616
Table 2: An n-bit data word yields 2^{n} quantization levels
A higher number of bits used to represent a sample will
result in a better approximation of the audio signal and a
reduction in quantization error (noise) that produces an
increase in the SNR. In theoretical terms, there is an increase
in the signal-to-quantization noise or dynamic range by
approximately 6 dB for each bit added to the word length of an
ADC, DAC, or DSP.
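The 6 dB per bit trend can be observed numerically. The following is a rough sketch, not a measurement procedure: it rounds a full-scale sine wave to a given word length and compares the signal power with the power of the rounding error. The function name, the number of samples, and the test frequency are arbitrary choices made for this illustration.

```python
import math


def measured_snr_db(bits, num_samples=20000, freq=0.0123456789):
    """Quantize a full-scale sine to `bits` bits and measure its SNR in dB."""
    full_scale = 2 ** (bits - 1) - 1            # largest positive code
    signal_power = 0.0
    noise_power = 0.0
    for k in range(num_samples):
        x = math.sin(2 * math.pi * freq * k)
        q = round(x * full_scale) / full_scale  # round to the nearest level
        signal_power += x * x
        noise_power += (q - x) ** 2
    return 10 * math.log10(signal_power / noise_power)


snr_8 = measured_snr_db(8)
snr_16 = measured_snr_db(16)
gain_per_bit = (snr_16 - snr_8) / 8   # expected to be close to 6 dB per bit
```

Under these assumptions the eight extra bits between the two cases should raise the measured SNR by roughly 48 dB.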
Figure 6: DSP/converter SNR and dynamic range
Note that the "6-dB-per-bit rule" is an approximation for
calculating the actual dynamic range for a given word width.
The ratio of the maximum representable signal amplitude to the
quantization error of an ideal ADC or DSP-based digital
system is actually calculated as:
SNR (dB) = 6.02n + 1.76
The 1.76 dB term is based on sinusoidal waveform statistics and
would vary for other waveforms, while n represents the data word
length of the converter or the digital signal processor.
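As a worked example of the relationship above, the sketch below evaluates both the 6 dB per bit approximation and the sinusoidal formula for a given word length. The function names are hypothetical; the constants follow from 20 log10(2), about 6.02 dB, and 10 log10(1.5), about 1.76 dB.

```python
import math


def ideal_snr_db(bits):
    """Full-scale sine to quantization noise ratio of an ideal n-bit system."""
    # 20*log10(2) is about 6.02 dB per bit; 10*log10(1.5) is about 1.76 dB.
    return 20 * math.log10(2) * bits + 10 * math.log10(1.5)


def rule_of_thumb_db(bits):
    """The '6 dB per bit' approximation."""
    return 6 * bits


# Compare the two estimates for the common fixed-point word lengths.
comparison = {n: (rule_of_thumb_db(n), round(ideal_snr_db(n), 2))
              for n in (16, 24, 32)}
```

For 16 bits the rule of thumb gives 96 dB, while the sinusoidal formula gives about 98.1 dB; the gap is the 1.76 dB term plus the 0.02 dB per bit dropped by the approximation.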
In undithered DSP-based systems, the SNR definition above is
not directly applicable, since there is no noise present when
there is no signal. In digital terms, dynamic range and SNR
(Figure 6) are often used synonymously to describe
the ratio of the largest representable signal to the
quantization error or noise floor. Therefore, when
referring to SNR or dynamic range in terms of DSP data word
size and quantization errors, both terms mean the same
thing.
Now the question arises, how many bits are required to
design a high-quality audio system? In terms of dynamic range
and SNR, what is the best precision one can choose without
sacrificing low cost in a given design? Let's first see
the dynamic range comparisons between DSPs with different
native data-word sizes. Figure 7 shows the dynamic-range
relationship between the three most common DSP fixed-point
processor data-word widths: 16, 24, and 32 bits. The
quantization level comparisons are also given. As stated
earlier, the number of data-word bits used to represent a
signal directly affects the SNR and quantization noise
introduced during the sample conversions and arithmetic
computations.
Figure 7: Fixed-point DSP dynamic range
comparisons
Precision                             Dynamic Range
(Fixed-Point Binary Representation)   (# of bits per data word x 6 dB/bit of resolution)
16-bit                                96 dB
24-bit                                144 dB
32-bit                                192 dB
Table 3: Dynamic range vs. resolution
Each additional bit of resolution used by the DSP
for calculations will reduce the quantization noise power by
6 dB. 16-bit fixed-point numeric precision yields 96 dB [16 x 6
dB per bit], 24-bit fixed-point precision yields 144 dB [24 x 6
dB per bit], while 32-bit fixed-point precision will yield 192
dB [32 x 6 dB per bit]. Note that for native single-precision
math, a 16-bit DSP is not adequate for accurately representing
the full dynamic range required for 'higher-fidelity' audio
signals around 120 dB.
In terms of quantization levels, Figure 8
demonstrates how 32-bit and 24-bit processing can more
accurately represent a processed audio signal as compared to
16-bit processing. 24-bit processing provides 256 times as
many quantization levels as 16-bit processing, while 32-bit
processing provides 65,536 times as many levels as 16-bit
processing and 256 times as many as 24-bit processing.
Figure 8: Fixed-point DSP quantization level
comparisons
Using the "6-dB-per-bit rule," 32-bit IEEE floating-point
dynamic range is determined to be 1530 dB. For floating point
this is calculated from the size of the exponent: 6 dB x 255
exponent levels = 1530 dB. (The 255 levels come from the fact
that there is an 8-bit exponent.) For floating-point audio
processing, we can see there is much more dynamic range
available than the 120 dB required for covering the full audio
dynamic range capabilities of the human ear.
Additional Fixed-Point MAC Unit Dynamic Range for DSP
Overflow Prevention
Computation overflow/underflow is a hardware limitation that
occurs when the numerical result of the fixed-point computation
exceeds the largest or smallest number that can be represented
by the DSP. Many DSPs include additional bits in the MAC unit
to prevent overflow in intermediate calculations. Extended
sums-of-products, which are common in DSP algorithms, are
achieved in the MAC unit with single-cycle multiply-accumulates
placed in an efficient loop structure. The extra bits of
precision in the accumulator result register provide extended
dynamic range for protection against overflow in successive
multiplies and additions. Thus, no loss of data or range
occurs. Table 4 shows a comparison of the extended
dynamic ranges of 16-bit, 24-bit, and 32-bit DSPs.
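The benefit of the extra accumulator bits can be illustrated with a small simulation. This is only a sketch: it assumes a 16-bit multiplier whose 32-bit products are summed either in a 32-bit register or in a 40-bit register with 8 guard bits. These widths are chosen for illustration and are not taken from Table 4, and the helper names are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a `bits`-wide twos-complement register."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(samples, coeffs, acc_bits):
    """Sum of products accumulated in a register `acc_bits` wide."""
    acc = 0
    for x, h in zip(samples, coeffs):
        acc = wrap_signed(acc + x * h, acc_bits)
    return acc


samples = [32767] * 4      # four full-scale 16-bit samples
coeffs = [32767] * 4       # four near-unity fractional coefficients
exact = sum(x * h for x, h in zip(samples, coeffs))

narrow = mac(samples, coeffs, 32)   # wraps around: the true sum is lost
wide = mac(samples, coeffs, 40)     # guard bits hold the full sum
```

With these inputs the 32-bit accumulator overflows on the third product, while the 40-bit accumulator returns the exact sum.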
Table 4: Comparison of the extended dynamic ranges of fixed-point DSP
multiplier units
Considering Data Word Length Issues When Developing Audio
Algorithms Free From Noise Artifacts
Digital Signal Processing is often discussed as if the signals
to be processed and the filter arithmetic used to process them
are both of infinite precision. However, all implementations of
DSP necessarily use words of finite length to represent each
and every value, be it a digital audio input sample, a filter
coefficient or the result of a multiplication. This finite
precision of representation means that any digital signal
processing performed to generate a desired result introduces
inaccuracy into the result. If a signal goes through several
stages of DSP, then each stage will add more inaccuracy.
The effects of a finite word length can severely affect
signal quality (in other words, lower the system S/N ratio) and
produce unacceptable error when performing DSP calculations.
Undesirable effects of finite precision can result from any of
the following:
 A/D Conversion Noise
The finite precision of an input data word sample will
introduce some inaccuracy into the DSP computation as a result
of the nonlinearities inherent in the A/D conversion process.
Therefore, the accuracy of the result of an arithmetic
computation cannot be greater than the resolution of the
quantized sample. In other words, the A/D conversion process
will establish the noise floor for the DSP (unless the DAC
has a lower noise floor). The DSP programmer must ensure that
the noise floor of the processing algorithm does not exceed
the noise floor of the ADC.
 Quantization Error of Arithmetic Computations From
Truncation and Rounding
DSP algorithms such as digital filters will generate results
that must be truncated or rounded (in other words,
requantized). When a processing result needs to be stored, it
must be quantized to the native data-word length of the
processor, introducing an error. For recursive DSP algorithms
these requantized values are part of a feedback loop,
causing arithmetic errors that can build up and reduce
the dynamic range of the filter. The smaller the data word of
the DSP, the more likely these types of errors will show up
in the D/A converted output analog signal.
In an n-bit fixed-point system, quantization of results may
be considered as the addition of noise to the result.
Consider a multiplication operation in a digital filter,
including requantization of the result. This can be modeled
as an infinite-precision multiplication followed by an
addition stage where quantization noise is added to the
product so that the result is equal to an n-bit number.
In a digital-signal-processing system, multiplication,
addition, and shift operations are performed on a sequence of
n-bit input values. These operations generate results which
would require more than n bits to be represented accurately.
The solution to this problem is generally to eliminate the
low-order bits resulting from an arithmetic operation in
order to produce an n-bit value which can be stored by the
system.
The two most common methods for eliminating the low-order
bits are truncation and rounding. Truncation is accomplished
by simply discarding all bits less significant than the least
significant bit that is retained. Rounding is performed by
choosing the n-bit number which is closest to the original
unrounded quantity.
 Computational Overflow
Whenever the result of an arithmetic computation is larger
than the highest positive or negative fullscale value, an
overflow will occur and the true result will be lost.
 Coefficient Quantization
The finite word length (n-bit data word size) of a filter
coefficient can affect pole/zero placement and a digital
filter's frequency response. This imprecision can cause
distortion in the frequency response of the filter and, in
the worst case, instability.
Errors in the values of a filter's coefficients cause
alterations in the positions of the transfer-function poles
and zeros and therefore are manifested as changes to the
frequency- and phase-response characteristics of the filter.
In a DSP system of finite precision, such deviations cannot
be avoided. They can, however, be reduced by using greater
precision for the representation of coefficients. This issue
is particularly important for poles close to the unit circle
in the z-plane, where an inaccuracy could make the difference
between stability and instability.
 Limit Cycles
These occur in IIR filters from truncation and rounding of
multiplication results or addition overflow. These often
cause periodic oscillations in the output result, even when
the input is zero.
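A zero-input limit cycle of the kind listed above can be reproduced with a very small recursive filter. The sketch below is an illustration under assumed values: a first-order filter y(n) = a y(n-1) with a = -0.9, integer state, and the product rounded to the nearest integer after every multiplication. The function names and the starting value are hypothetical.

```python
def round_div(num, den):
    """Integer division rounded to the nearest integer (ties toward +infinity)."""
    return (2 * num + den) // (2 * den)


def zero_input_response(y0, steps, num=-9, den=10):
    """Run y[n] = round(a * y[n-1]) with a = num/den and no input signal."""
    y = y0
    outputs = []
    for _ in range(steps):
        y = round_div(num * y, den)   # requantize the product to an integer
        outputs.append(y)
    return outputs


# With infinite precision the output would decay toward zero. With rounding
# it settles into a sustained oscillation between -4 and +4.
tail = zero_input_response(10, 20)[-4:]
```

The oscillation persists indefinitely even though the input is zero, because each rounded product is large enough to regenerate a state of the same magnitude.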
Other than A/D conversion noise, all other effects of having
a finite data-word size are mainly dependent on the precision
of the requantization of data and the type of arithmetic
operations used in the DSP algorithm. A given filter
structure can offer a significantly lower noise floor than
another structure which accomplishes the same task.
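The two requantization methods named in the list above, truncation and rounding, can be sketched on integer words as follows. The example assumes a 30-fraction-bit product of two 16-bit fractional operands that is reduced by 15 bits; the function names are hypothetical.

```python
def truncate(value, drop_bits):
    """Discard the low-order bits (arithmetic shift right)."""
    return value >> drop_bits


def round_nearest(value, drop_bits):
    """Add half an LSB of the retained word, then discard the low-order bits."""
    return (value + (1 << (drop_bits - 1))) >> drop_bits


# 3 x 16384 = 49152, which is 1.5 LSBs of the retained 16-bit word.
positive = 3 * 16384
negative = -3 * 16384

results = (truncate(positive, 15), round_nearest(positive, 15),
           truncate(negative, 15), round_nearest(negative, 15))
```

In twos-complement arithmetic, truncation always moves the result toward negative infinity (1 and -2 here), so its error has a consistent sign, whereas rounding (2 and -1 here) keeps the error within half an LSB in either direction.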


"The overall DSPbased audio system dynamic range is
only as good as its weakest link"



In a DSP-based audio system, this means that any one of the
following sources or devices in the audio signal chain will
determine the dynamic range of the overall audio system:
 The "real world" analog input signal, typically from a
microphone or line-level source
 The ADC word size and conversion errors
 DSP finite word length effects such as quantization
errors resulting from truncation and rounding, and filter
coefficient quantization
 The DAC word size
 The analog output circuitry connecting to a speaker
 Another device in the signal path that will further
process the audio signal.



So, the choice of components and the digital filter
implementation will also determine the overall quality of the
processed signal. For example, if we have a 75 dB DAC and a DSP
which can maintain 144 dB dynamic range, the overall 'system'
dynamic range will still only be 75 dB. So the DAC is the
limiting factor. Even though the DSP would compute a given
algorithm and maintain a result that had 122 dB of precision
and dynamic range, the result would have to be truncated in
order for the DAC to properly convert it back to an analog
signal. Now, if the choice is made to use high-quality analog,
ADC, and DAC components, wouldn't one want to be careful to
ensure the signal quality is maintained by the DSP algorithm?
Care must then be taken in a digital system to ensure the DSP
is not the weakest link in the 'signal chain'.
If a digital-signal-processing algorithm produces
quantization noise artifacts which are above the noise floor of
the input signal, then these artifacts will be audible under
certain circumstances, especially when an input signal is of
low intensity or limited frequency. Therefore, whatever the
dynamic range of a high-quality audio input, be it 16-, 20-, or
24-bit input samples, the digital processing performed
on it should be designed to prevent processing noise from
reaching levels at which it may appear above the noise floor of
the input, and thus become audible content. For a digital
filter routine to operate transparently,
the resolution of the processing system must be considerably
greater than that of the input signal so that any errors
introduced by the arithmetic computations are smaller than the
precision of the ADC or DAC. In order for the DSP to maintain
the SNR established by the ADC, all intermediate DSP
calculations require processing at a precision higher than the
input sample word size.
What are the dynamic ranges that must be maintained for
CD-quality and professional-quality audio designs? Fielder
demonstrated that consumer CD audio requires 16-bit
conversion/processing, while the minimum
requirement for professional audio is 20 bits (based on
perceptual tests performed on human auditory capabilities).
Traditional dynamic range application requirements for
high-fidelity audio processing can be categorized into two
groups:
 'Consumer CD-Quality' audio systems use 16-bit
conversion with typical dynamic ranges between 85 and 93 dB
 'Professional-Quality' audio systems use 20- to 24-bit
conversion with dynamic ranges between 110 and 122 dB.
Maintaining 16-Bit 'CD-Quality' Accuracy During DSP
Processing
As we saw in the last section, when using a DSP to process
audio signals, the DSP designer must ensure that any
quantization errors introduced by the arithmetic calculations
executed on the processor are lower than the converter noise
floor. Consider a 'CD-quality' audio system. If the DSP is to
process audio data from a 16-bit ADC (ideal case), a 96 dB SNR
must be maintained through the algorithmic process in order to
maintain a CD-quality audio signal (6 dB x 16 bits = 96 dB).
Therefore, it is important that all intermediate calculations be
performed with higher precision than the 16-bit ADC or DAC
resolution. Errors introduced by the arithmetic calculations can
be minimized when using larger data-word width sizes for
processing audio signals. For fractional fixed-point math, we
can visualize extra 'footroom' bits added to
the right of the least significant bit of the input sample. The
larger word sizes used in the arithmetic operations will ensure
that truncation or round-off errors will be lower than the
noise floor of the DAC, as long as 'optimal' algorithms (better
filter structures) are utilized in conjunction with the larger
word width.
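The 6 dB-per-bit arithmetic used above is easy to tabulate. The following Python sketch is illustrative only: the helper names are hypothetical, and the 6.0 dB constant is the article's rule of thumb (the exact figure for an ideal quantizer is slightly higher).

```python
DB_PER_BIT = 6.0  # the article's rule of thumb, an approximation

def ideal_dynamic_range_db(bits):
    """Ideal dynamic range of a word that is `bits` wide."""
    return DB_PER_BIT * bits

def effective_bits(measured_db):
    """Number of bits a measured dynamic range actually corresponds to."""
    return measured_db / DB_PER_BIT

for width in (16, 20, 24, 32):
    print(f"{width}-bit word: {ideal_dynamic_range_db(width):.0f} dB ideal")

# A 16-bit converter that measures 92 dB delivers a little over 15 bits.
print(f"92 dB converter: {effective_bits(92.0):.1f} effective bits")
```

Read this way, a real 16-bit converter already sits slightly below its ideal figure, which is one reason the processing that follows it should not give up any further precision.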
When considering selection of a processor for
implementation, a choice therefore has to be made. Should one
use a DSP with a smaller data word and double-precision math, or
should one use a DSP with a larger data word that supports
single-precision math, which is more efficient? It is estimated
that double-precision math operations can take up to 4 to 5 times
the overhead of single-precision math. Double precision not
only adds computation overhead to a digital filter, it also
doubles the memory storage requirements for the filter
coefficient buffer and the input delay line buffer. Every
application is different, and although a processor with a smaller
native data-word width may suffice for some applications, the
use of double-precision computations, coefficients, and
intermediate storage comes at the expense of a drastic reduction
in processing throughput.
To see the benefits of a larger DSP word size,
let's take a look at the processing of audio signals from a
16-bit ADC that has a dynamic range close to its theoretical
maximum, in this case with a 92 dB signal-to-noise ratio
(Figure 9). Figure 10 below shows a conceptual
view of a 16-bit data word that is transferred from an ADC to
the DSP's internal memory. Typically, the data transfer would
occur through a serial port interface from the serial ADC, and
the DSP may be configured to automatically perform a direct
memory access (DMA) transfer of the sample from the serial port
circuitry to internal memory for processing. Notice that for
the 24-bit and 32-bit processors, there are adequate
'footroom' bits below the noise floor (to the right) to protect
against quantization errors.
Figure 9: Fixed-point DSP noise floor with a typical
16-bit ADC/DAC at 92 dB
Figure 10: 16-bit A/D samples at 96 dB SNR
The 16-bit DSP has 4 dB higher SNR than the ADC's 92 dB, so
not much room for error would be allowed in arithmetic
computations. We can easily see that for moderate-to-complex
audio processing using single-precision arithmetic, the 16-bit
DSP data path will not be adequate for precise processing of
16-bit samples as a result of truncation and round-off errors
that can accumulate during the execution of the algorithm. As
shown in Figure 11, errors resulting from the arithmetic
computations can easily be seen by the output DAC and thus
become audible noise. For example, complex recursive
computations can easily result in the introduction of 18 dB of
quantization noise, and with the 16-bit DSP word width, the
errors are seen by the DAC and hence will be easily heard by
the listener.
Figure 11: 16-bit D/A output samples with finite-length
effects
Double-precision math can obviously still be used for the
16-bit DSP if software overhead is available, but the real
performance of the processor will be compromised. A 16-bit DSP
using single-precision processing would only suffice for
low-cost audio applications where processing is not too complex
and SNR requirements are around 75 dB (audio-cassette
quality).
The same algorithm implemented on a 24-bit or 32-bit DSP
would ensure these errors are not seen by the DAC. As can be
seen in Figure 11, even though 18 dB of quantization
noise was introduced by the computations in the 24-bit and
32-bit DSPs, the errors remain well below the noise floor of the
16-bit DAC when these two processors run the exact same
algorithm.
The 24-bit DSP has 8 bits below the converter noise floor to
allow for errors. In other words, we have eight binary digits to
the right of the least significant bit in the 16-bit input
sample. It takes 256 multiplicative processing operations
before the noise floor of the algorithm goes above
the resolution of the input sample.
A 32-bit DSP has 16 bits below
the noise floor when executing 32-bit fractional math, allowing
for the greatest computation flexibility in developing stable,
noise-free audio algorithms. There are 16 binary digits to the
right of the least significant bit in the 16-bit input sample. It
would take 65,536 multiplicative processing operations before
the noise floor of the algorithm would go above the resolution
of the 16-bit input. With more room for quantization errors,
filter implementation restrictions seen with 16- or 24-bit DSPs
are now removed.
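The 256 and 65,536 figures follow directly from the number of footroom bits. As a rough sketch, and assuming the simplified worst case in which every operation contributes an error of one processor LSB, the count is 2 raised to the number of footroom bits. The function names below are hypothetical.

```python
DB_PER_BIT = 6.0  # the article's rule of thumb

def footroom_bits(processor_bits, sample_bits):
    """Bits available to the right of the input sample's LSB."""
    if processor_bits < sample_bits:
        raise ValueError("processor word is narrower than the sample")
    return processor_bits - sample_bits

def worst_case_operations(processor_bits, sample_bits):
    """Operations needed for 1-LSB errors to add up to one input LSB
    (simplified worst-case model, an assumption of this sketch)."""
    return 2 ** footroom_bits(processor_bits, sample_bits)

for width in (16, 24, 32):
    bits = footroom_bits(width, 16)
    print(f"{width}-bit DSP: {bits} footroom bits, "
          f"{DB_PER_BIT * bits:.0f} dB margin, "
          f"{worst_case_operations(width, 16)} operations")
```

Real round-off errors are usually smaller than one LSB and partly cancel, while recursive filters can amplify them, so these counts are best read as an order-of-magnitude comparison between word widths.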
So, using a higher number of bits to process an audio
signal results in a reduction in quantization error
(noise). If these errors remain below the noise floor, the
overall 'digital system SNR' established by the converters is
maintained. The DSP should not be the limiting factor in
signal quality! When using a 16-bit converter for 'CD-quality'
audio, the widely accepted general recommendation is to use a
higher-resolution processor (24- or 32-bit), since the additional
bits of precision give the DSP the ability to maintain the 96
dB SNR of the audio converters.
Is 24-Bit Processing Always Enough for Maintaining 16-Bit
Sample Accuracy?
Now it would appear that in some cases, 32-bit processing would
be unnecessary for minimal processing of 16-bit data. In order to
maintain a 96 dB dynamic range, 24 bits would appear to be
sufficient to process a 16-bit signal without any
double-precision math requirement. But the question is then
asked: Is a 24-bit DSP sufficient in all cases to guarantee
that noise introduced in a DSP computation will never go above
a 16-bit noise floor? For moderate and nonrecursive DSP
operations, 24 bits should normally be sufficient. However,
research conducted in recent years has clearly shown that for
precise processing of 16-bit signals in recursive audio
processing, a 24-bit DSP may not be sufficient. Recursive
filters are necessary for a wide variety of audio applications
such as graphic equalizers, parametric equalizers, and comb
filters.
In a 1993 AES Journal publication, R. Wilson
demonstrated that even for recursive second-order IIR filter
computations on a 24-bit DSP, the noise floor of the digital
filter can still go above that of the 16-bit sample and hence
become audible. To compensate for this, the use of error
feedback schemes (error spectrum shaping) or double-precision
arithmetic was recommended, especially for extremely critical
frequency response designs. The use of double-precision math
can add processor computational overhead by more than a factor
of five in the filter computations, while doubling memory
storage requirements.
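To get a feel for how a recursive filter amplifies round-off noise, the Python sketch below simulates a toy two-pole resonator and truncates its stored output to different word lengths. This is not Wilson's experiment: the filter, the pole radius, the center frequency, and the test signal are all hypothetical choices, the coefficients are kept in floating point, and overflow is not modeled. Only the truncation of the fed-back output is simulated.

```python
import math

def truncate(value, bits):
    """Truncate to a signed fractional grid with bits-1 fraction bits."""
    step = 2.0 ** -(bits - 1)
    return math.floor(value / step) * step

def resonator_snr_db(bits, num_samples=20000, radius=0.9,
                     center_hz=1000.0, sample_rate_hz=44100.0):
    """SNR of a two-pole recursive filter whose output is stored at `bits`."""
    theta = 2.0 * math.pi * center_hz / sample_rate_hz
    a1 = 2.0 * radius * math.cos(theta)
    a2 = radius * radius

    ref1 = ref2 = 0.0  # floating-point reference state
    fix1 = fix2 = 0.0  # truncated (fixed-point style) state
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        # 16-bit input: a small sine at the resonant frequency
        x = truncate(0.01 * math.sin(theta * n), 16)
        ref = x + a1 * ref1 - a2 * ref2
        fix = truncate(x + a1 * fix1 - a2 * fix2, bits)
        ref2, ref1 = ref1, ref
        fix2, fix1 = fix1, fix
        signal_power += ref * ref
        noise_power += (fix - ref) ** 2
    return 10.0 * math.log10(signal_power / noise_power)

for width in (16, 24, 32):
    print(f"{width}-bit state: {resonator_snr_db(width):.1f} dB SNR")
```

Because the truncation error is fed back through the poles, the measured SNR falls well short of the ideal figure for each word length, and the shortfall grows as the poles move closer to the unit circle (try a larger radius).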
Another AES Journal publication, by W. Chen in March 1996,
came to the same conclusion. In order to maintain the 96 dB
signal-to-noise ratio for 24-bit processing of second-order IIR
filters, a double-precision filter structure was required to
ensure that the SNR at the digital equalizer's output was
greater than 96 dB. Chen researched various second-order
realizations to determine the best structure when performing
24-bit processing on 16-bit input. In one test case, he
implemented a single high-pass second-order filter using
direct-form-1 structures, finding these implementations to
yield an SNR between 85 and 88 dB, which is lower than the 96 dB
theoretical maximum of the ideal 16-bit ADC.
Chen's second example consisted of cascading second-order
structures to implement a sixteenth-order digital equalizer. He
then measured the noise floor of the equalizer using an Audio
Precision System One tester in order to find an adequate
second-order IIR filter structure to meet his target 96 dB
requirement. The results of using the 24-bit DSP on a 16-bit
sample are shown in Table 5.
Second-Order Filter Structure        S/N Ratio (dB), 16th-Order Equalizer
Cascaded Form 1                      75
Cascaded Form 2                      63
Cascaded Transposed Form 1           70
Double Precision Cascaded Form 1     100
Parallel Form 1                      85
Parallel Transposed Form 1           79
Table 5: Chen's Results of 24-bit 2nd-Order IIR Processing on 16-bit
Data (March 1996 Journal of AES)
Chen's conclusion: in order to maintain a
signal-to-noise ratio greater than 96 dB when cascading
multiple second-order stages, double-precision arithmetic was
required. In his optimal implementation of the double-precision
direct-form-1 filter, there was an increase in the number of
instruction cycles (3x increase) and greater memory space (2x
increase) for storing internal filter states.


"When processing 16-bit samples with a 32-bit
processor versus a 24-bit processor, the 8 additional
bits available below the noise floor and the use of
32-bit filter coefficients will ensure that
double-precision overhead is not necessary when using
any standard second-order IIR filter realization."



Recall that with a 32-bit DSP, there are 8 extra bits of
precision compared to a 24-bit processor. When a given
second-order filter structure is moved from a 24-bit processor
to a 32-bit fixed-point processor, the noise floor of the
arithmetic result should be reduced by 48 dB (8 bits x 6 dB per
bit). Direct-form 1 filter structures are generally
the best filter structure for use in audio, because of the better
noise performance they provide.
For example, we can see that in Chen's results (Table 5), the
Parallel Form 1
structure used to construct the equalizer provided the best
result for single-precision 24-bit computation. However, this
is still less than the ideal 96 dB case. The 24-bit processor's
144 dB ideal noise floor is significantly raised by 70 to 80 dB
and, as a result, it is greater than the 16-bit converter's
noise floor. If this same algorithm is implemented on a 32-bit
fixed-point processor, the noise floor of the filter output is
lowered by 48 dB (with the 8 extra 'footroom' bits) to 133 dB.
This is not only sufficient for remaining lower than a 16-bit
converter's noise floor, but a 32-bit implementation of the
single-precision direct-form 1 structure would be adequate for
even a 24-bit converter's noise floor as well.
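The same 48 dB shift can be applied to each single-precision row of Table 5. The Python sketch below does only that arithmetic: the projected values are rule-of-thumb estimates (6 dB per extra bit), not measurements reported by Chen, and the names used are hypothetical.

```python
DB_PER_BIT = 6.0  # the article's rule of thumb

# Chen's measured single-precision results on a 24-bit DSP (Table 5), in dB.
chen_24bit_snr_db = {
    "Cascaded Form 1": 75.0,
    "Cascaded Form 2": 63.0,
    "Cascaded Transposed Form 1": 70.0,
    "Parallel Form 1": 85.0,
    "Parallel Transposed Form 1": 79.0,
}

def project_snr_db(measured_db, old_bits=24, new_bits=32):
    """Estimate the SNR after widening the data path (rule of thumb only)."""
    return measured_db + DB_PER_BIT * (new_bits - old_bits)

for name, measured in chen_24bit_snr_db.items():
    projected = project_snr_db(measured)
    verdict = "meets" if projected >= 96.0 else "misses"
    print(f"{name}: {measured:.0f} dB -> about {projected:.0f} dB "
          f"({verdict} the 96 dB target)")
```

On this estimate every structure in the table clears the 96 dB target at 32 bits, which is the sense in which the wider word removes the need to pick a particular filter structure for 16-bit material.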
Processing 110-120 dB, 20-/24-bit Professional-Quality Audio
When the compact disc was launched in the early 1980s, the
digital format of 16-bit words sampled at 44.1 kHz was chosen
for a mixture of technical and commercial reasons. The choice
was limited by the quality of available analog-to-digital
converters, by the quality and cost of other digital
components, and by the density at which digital data could be
stored on the medium itself. It was thought that the format
would be sufficient to record audio signals with all the
fidelity required for the full range of human hearing. However,
research since the introduction of CD technology has shown that
this format is imperfect in some respects.
New research conducted within the last decade indicates that
the sensitivity of the human ear is such that the dynamic range
between the quietest sound detectable and the maximum sound
which can be experienced without pain is approximately 120 dB.
Therefore, 16-bit CD-quality audio is no longer thought to be
the highest-quality audio that can be stored and played back.
Also, many audiophiles claimed that CD-quality audio lacked a
certain warmth that a vinyl groove offered. This may have been
due to a combination of the dynamic range limitation of 16 bits
and the chosen sample rate of 44.1 kHz. The 16-bit words
used for CD allow a maximum dynamic range of 96 dB, although
with the use of dither this is reduced to about 93 dB. Digital
conversion technology has now advanced to the stage where
recordings with a dynamic range of 120 dB or greater may be
made, but compact disc is unable to accurately carry them.
Recent technological developments and improved knowledge of
human hearing have created a demand for greater word lengths
and faster sampling rates in the professional and consumer
audio sectors. It has long been assumed that the human ear was
capable of hearing sounds up to a frequency of about 20 kHz and
was completely insensitive to frequencies above this value.
This assumption was a major factor in the selection of a 44.1
kHz sampling rate. New research has suggested that many people
can distinguish the quality of audio at frequencies of up to 25
kHz, and that humans are also sensitive to a degree to
frequencies above even this value. This research is mainly
empirical, but would mean that a substantially higher sampling
frequency is necessary. D. E. Blackmer has suggested that
in order to fully meet the requirements of human auditory
perception, a sound system must be designed to cover the
frequency range up to 40 kHz (and possibly up to 80 kHz)
with over 120 dB dynamic range to handle transient peaks. This
is beyond the requirements of many of today's digital audio
systems. As a result, 18-, 20-, and even 24-bit ADCs are now
widely available which are capable of exceeding the 96 dB
dynamic range available using 16 bits.
The Race Toward The Use of 24-bit A/D and D/A
Conversion
Multibit Sigma-Delta converters capable of 24-bit conversion
are now in production by various manufacturers, including Analog Devices,
Crystal Semiconductor, and AKM Semiconductor.
The popularity of 24-bit DACs is increasing for both
professional and high-end consumer applications. The reason for
using these higher-precision ADCs and DACs for audio processing
is clear: the distortion performance (linearity) of these
higher-resolution converters is much better than that of 16-bit
converters. The other obvious reason is the increase in SNR and
dynamic range that they provide over 16- to 20-bit
technology.


"24-bit ADC and DAC technology is capable of 120-122
dB dynamic range, fully supporting the dynamic range
capability of the human ear up to the threshold of
pain of 120 dB, at sample rates of 96 kHz and 192
kHz"



Many 24-bit converters on the market range from 110 to 120 dB,
which is professional quality and close to the range the human
ear is capable of perceiving. The higher-end converters range
from 117 dB to 122 dB (conversion errors such as intermodulation
distortion introduced by the 24-bit converters limit the final
SNR from the theoretical 148 dB maximum). These newer 24-bit
converters have up to 120-122 dB dynamic range, easily
accommodating input sources such as a 120 dB low-noise condenser
microphone.
At many AES conventions in recent years, professional
equipment manufacturers have showcased equipment with 24-bit
conversion and 96 kHz sample rates. New DVD standards are
extending the digital formats to 24 bits at sample rates of 96
kHz and 192 kHz. Professional-quality audio is emerging
in the consumer audio market sector, traditionally a market with
less stringent audio specifications. The race is on for audio
equipment manufacturers to include 24-bit, 96 kHz converters to
maintain signal quality up to 120 dB.
Comparing 24-Bit and 32-Bit Processing of Audio Signals
with 24-Bit Resolution
For years it has been widely accepted that in most cases 24-bit
DSP processing offers adequate precision for 16-bit samples.
With higher-precision 24-bit converters emerging to support
newer professional and consumer audio standards, what will
become the recommended processor word width required to
maintain 24-bit precision? For 24-bit conversion, a 24-bit DSP
may no longer be able to adequately process 24-bit samples
without resorting to double-precision math, especially for
recursive second-order IIR algorithms. Newer 24-bit converter
technology is making a strong case for 32-bit processing. A
32-bit DSP has already become the logical
processor of choice for many audio equipment manufacturers when
using 24-bit signal conversion. Let's examine why this is the
case.
Figure 12 demonstrates a typical situation
that can result from moderately complex or recursive processing
of 24-bit samples. Note that the 24-bit sample in this case is
assumed to be a 1.23 fractional number as delivered by the 24-bit
converters. The extra bits of precision provided by 32-bit
fixed-point processing are to the right of the 24-bit input's
LSB. For example, the parallel combination of second-order IIR
filters can result in significant quantization artifacts in the
lower-order bits of the data word. If both the 24-bit and
32-bit DSPs end up producing errors that introduce
24 dB of noise (4 bits x 6 dB/bit), the error will show up
on the 24-bit DAC, since the 24-bit DSP has the result above the
noise floor. Single-precision computations with 24-bit
processing can limit the result of a processed input to about
15-bit accuracy. Should one use double-precision routines on
the 24-bit processor, or should one opt for a 32-bit processor
when using a 24-bit converter? Using a 32-bit processor, the
errors produced during the computations will never be seen by a
120 dB, 24-bit DAC.
Figure 12: 24-bit D/A output samples with finite-length
effects
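The placement of a 1.23 sample inside a 32-bit word can be sketched with plain integers. In the Python example below, the helper names q23_to_q31 and q31_to_q23 are hypothetical, and the 4-bit error is simply added by hand to stand in for accumulated round-off. It is a minimal illustration of where the footroom bits sit, not a model of any particular DSP.

```python
def q23_to_q31(sample_q23):
    """Left-justify a signed 24-bit 1.23 sample in a 32-bit 1.31 word."""
    if not -(1 << 23) <= sample_q23 < (1 << 23):
        raise ValueError("sample does not fit in 24 bits")
    return sample_q23 << 8  # 8 footroom bits appear to the right of the LSB

def q31_to_q23(value_q31):
    """Round a 1.31 result back to 1.23 for a 24-bit DAC, with saturation."""
    rounded = (value_q31 + (1 << 7)) >> 8
    return max(-(1 << 23), min((1 << 23) - 1, rounded))

sample = 0x123456
error = 15  # a 4-bit error, counted in LSBs of whichever word carries it

# On the 24-bit path the error lands directly in the sample's low bits.
on_24bit_path = sample + error

# On the 32-bit path the same-sized error lives in the footroom bits
# and is rounded away when the result is sent to the DAC.
on_32bit_path = q31_to_q23(q23_to_q31(sample) + error)

print(hex(on_24bit_path), hex(on_32bit_path))
```

With 8 footroom bits, any accumulated error smaller than half of one 24-bit LSB (128 units in the 1.31 word) disappears in the final rounding step, whereas the 24-bit path has nowhere to hide it.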
Recall from earlier in the article that the analysis of Wilson's
and Chen's research demonstrated that even for second-order IIR
filter designs using a 24-bit processor, one may require the
use of additional error feedback computations or
double-precision math to ensure the noise floor remains lower
than that of a 16-bit converter. If 24-bit computations can
introduce noise artifacts that can go above a 16-bit noise floor
for complex second-order filters, what does that mean? We can
conclude that when a 24-bit DSP processes 24-bit samples, the
noise floor of the digital filter will always be
greater than the 24-bit converter's noise floor, unless methods
are implemented to reduce the digital filter's noise floor.
These costly methods of implementing error-feedback schemes and
double-precision arithmetic are then unavoidable and can add
significant overhead in processing of 24-bit audio data.
With many converter manufacturers introducing 24-bit ADCs
and DACs to meet emerging consumer and professional audio
standards, the audio systems using these higher-resolution
converters will require at least 32-bit processing in order to
offer sufficient precision to ensure that a filter algorithm's
quantization noise artifacts will not exceed the noise floor of
the 24-bit input signal. If optimal filter routines are used for
complex processing, any quantization noise introduced in the
32-bit computations will never be seen by the 24-bit output DAC.
In many cases, the audio designer can choose from a number of
second-order structures because the result will still be
greater than 120 dB. 32-bit processing will guarantee that the
noise artifacts remain below the 120 dB noise floor, and hence
provide a dynamic range of the audio signal up to the human ear's
threshold of pain. Therefore, the goal of developing robust
audio algorithms is accomplished, and the only limiting factor
when examining the signal quality (SNR) of the digital audio
system is the precision of the 24-bit ADCs and DACs.
Summary of Data Word Size Requirements for Processing Audio
Signals
To maintain high audio signal quality well above the noise
floor, all intermediate DSP calculations should be done using
higher precision than the bit length of the quantized input
data. High-precision storage should also be used between the
DSP's memory and computation units. The use of "optimal" filter
algorithms, higher-precision filter coefficients, and
higher-precision storage of intermediate samples (available with
extended precision in the MAC unit) will ensure that errors
introduced by the arithmetic computations are much smaller than
the error introduced by the conversion of the results by a DAC.
Therefore, the noise floor of the digital filter algorithm will
be lower than the resolution of the ADCs and DACs.
A 16-bit DSP may suffice for low-cost audio applications
where processing is not complex and SNR requirements are around
75 dB. However, 16-bit DSPs using single-precision computations
will not be adequate for precise processing of 16-bit signals.
When using 16-bit ADCs and DACs in an audio system that will
process 'CD-quality' signals having a dynamic range of 90 to 96
dB, a 16-bit data path may not be adequate as a result of
truncation and rounding errors accumulating during execution of
the DSP algorithm. Double-precision routines can be utilized to
lower the digital filter's noise floor as long as the software
overhead is available.
As the complexity of new DSP algorithms increases along with
audio standards and requirements, designers are
looking to 18-bit, 20-bit, and 24-bit converters to increase
the signal quality. A 16-bit DSP will not be adequate because
the dynamic range capabilities of these higher-resolution
converters exceed those of a 16-bit DSP. A 16-bit DSP may
still be able to interface to these higher-precision
converters, but this would then require the use of
double-precision arithmetic. Double-precision operations slow
down the true performance of the processor while increasing
programming complexity. Memory requirements for
double-precision math are doubled. Even if double-precision
math can be used, the interfaces to these higher-precision
converters in many cases would require glue logic to move the
data to and from the DSP.
At least 24 bits are required in processing if the quality
of 16 bits is to be preserved. However, even with 24-bit
processing, it has been demonstrated that care needs to be
taken to ensure the noise floor of the digital filter algorithm
is not greater than the established noise floor of the 16-bit
signal, especially for recursive IIR audio filters. Recursive
IIR filters can introduce quantization noise above the noise
floor of a 16-bit converter when using a 24-bit DSP, and
therefore 24-bit processing requires software overhead to lower
the digital filter's noise floor. Again, double-precision math
is an option, but this can add overhead by as much as a factor
of five.
Using a 32-bit, fixed-point DSP offers the additional benefit
of ensuring 16-bit signal quality is not impaired during
arithmetic computations. Thus, the higher resolution of the
32-bit DSP will keep quantization noise from showing up in
the DAC output, providing an improved signal-to-noise ratio (SNR)
over 16- and 24-bit DSPs.
When processing 16-bit audio data, the use of 32-bit
processing is especially useful for complex recursive
processing using IIR filters. For example, parametric and
graphic equalizer implementations using cascaded 2nd-order IIR
filters, and comb/allpass filters for audio, are more robust
using 32-bit math. A 32-bit processor operating on 16- or
20-bit data removes the filter structure implementation
restrictions that are present for 24-bit processors. Any filter
structure of choice can then be used without worrying about the
level of the noise floor. Double-precision and error-feedback
schemes are therefore unnecessary. With 16 bits below the noise
floor on a 32-bit DSP, quantization errors would have to
accumulate up to 96 dB from the LSB before these errors can be
seen by the 16-bit DAC.
At least 32 bits are required if 24-bit signals are to be
preserved with complex, math-intensive, or recursive
processing. Using 24-bit ADCs and DACs will require a 32-bit
DSP in order to offer sufficient precision to ensure that the
noise floor of the algorithm will not exceed that of the 24-bit
input signal.
For more information about DSPs, visit the
Analog Devices Web site.