MQA: Aliasing, B-Splines, Centers of Gravity Page 2

Austin: What about upward imaging during reconstruction?

Stuart: If properly managed, upward imaging need have no negative impact on the sound, especially if the images are beyond the frequency range of associated electronics or transducers. Nevertheless, MQA applies quite specific constraints, not just to replicate what was heard in the studio but to maintain envelope and slew rate.

Austin: MQA's main claim is that it improves temporal response—hence, sound quality—by removing digital-conversion–induced "timing artifacts." There's less "ringing," and no "pre-echo." Impulse response is shorter. Critics, though, have pointed out that aliasing, which MQA seems to accept by design (while attempting to minimize), manifests itself not just in the frequency domain but also in the time domain—as acknowledged in an MQA-related article written by you and Peter Craven (footnote 4). Is this a real issue? Is it significant?

Stuart: In MQA, the first moment (center of gravity) of the reproduced impulse is always at exactly the right place.

For a number of reasons based on the auditory science of object detection, it seems very plausible that the first moment is of prime importance to the ear and that higher moments are less important and (importantly) can be shown not to contribute errors such as jitter. The possible timing error caused by the variation of the leading-edge shape of MQA impulse response pales into insignificance compared with the error that results from triggering on the wrong peak; we are considering differences of more than an order of magnitude.

Austin: How can a system with finite aliasing have the center of gravity always in exactly the right place? How is this possible if, as suggested in the previous question, aliasing can induce timing errors?

Stuart: We need to answer your question in three ways: in general theory, theoretically relating to MQA, and in actual practice.

In fact, generalizations of sampling theory help us solve the practical situation we face.

A minimum-phase filter's impulse response has certain attributes: it has a rise time from zero to the first peak (which is not necessarily coincident with the center of gravity); it has a decaying portion; it has a total "area" (the 0th moment) that expresses the "energy" in the response; it has a total response duration from start to finish (infinite in an analog or IIR filter but finite in some digital systems; this duration we call the "support" of the filter); it has a 1st moment and a center of gravity (the 1st moment divided by the 0th moment), the latter occurring after an impulsive input by an amount equal to the group delay at 0Hz; etc.
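The relationship between the center of gravity and the group delay at 0Hz can be checked numerically. The following is a minimal sketch, not anything taken from MQA: the four-tap decaying response `h` is an arbitrary illustrative choice, and the function names are hypothetical. It computes the center of gravity as the 1st moment divided by the 0th moment, then estimates the group delay at DC from the slope of the phase response.

```python
import cmath

def centroid(h):
    """Center of gravity in samples: 1st moment divided by 0th moment."""
    area = sum(h)                                   # 0th moment
    first = sum(n * v for n, v in enumerate(h))     # 1st moment
    return first / area

def freq_response(h, w):
    """Frequency response at angular frequency w (radians per sample)."""
    return sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(h))

def group_delay_at_dc(h, dw=1e-4):
    """Group delay at 0Hz, estimated as minus the central-difference phase slope."""
    ph_plus = cmath.phase(freq_response(h, dw))
    ph_minus = cmath.phase(freq_response(h, -dw))
    return -(ph_plus - ph_minus) / (2 * dw)

# Illustrative decaying FIR response (an assumption, not an MQA filter).
h = [1.0, 0.6, 0.3, 0.1]
cog = centroid(h)
gd0 = group_delay_at_dc(h)
print(f"center of gravity = {cog:.6f} samples, group delay at DC = {gd0:.6f} samples")
```

For this example both quantities come out at about 0.75 samples, which is later than the first peak at sample 0.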

How is it possible that the center of gravity is always in exactly the right place? The simple answer is that this is a property of B-splines. The more complete answer is that the B-spline (and sinc) sampling kernels satisfy the so-called "Strang-Fix conditions."[footnote 5] MQA is designed to ensure that the center of gravity of a reproduced pulse is at exactly the correct place. Although the kernels in MQA are not simple B-splines—they comprise the convolution of a B-spline with another filter—this property of the B-spline remains after the convolution.
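As a rough illustration of the B-spline property described here, the sketch below samples an impulse arriving at a fractional position `tau` with a cubic B-spline kernel, then checks that the center of gravity of the resulting samples equals `tau` wherever the impulse falls between sampling instants. The cubic B-spline, the follow-on filter `g`, and all function names are assumptions chosen for the demonstration; they are not MQA's actual kernels.

```python
def cubic_bspline(x):
    """Centered cubic B-spline, support (-2, 2)."""
    ax = abs(x)
    if ax < 1.0:
        return 2.0 / 3.0 - ax * ax + 0.5 * ax ** 3
    if ax < 2.0:
        return (2.0 - ax) ** 3 / 6.0
    return 0.0

def sample_impulse(tau, n_lo=-8, n_hi=16):
    """Samples produced at integer instants k by an impulse located at tau."""
    return {k: cubic_bspline(k - tau) for k in range(n_lo, n_hi)}

def centroid(samples):
    """Center of gravity: 1st moment divided by 0th moment."""
    area = sum(samples.values())
    return sum(k * v for k, v in samples.items()) / area

def convolve(samples, g):
    """Convolve the sampled response with a causal FIR filter g."""
    out = {}
    for k, v in samples.items():
        for m, gm in enumerate(g):
            out[k + m] = out.get(k + m, 0.0) + v * gm
    return out

# Hypothetical second filter; its own center of gravity is a fixed delay.
g = [0.5, 0.3, 0.2]
g_delay = sum(m * gm for m, gm in enumerate(g)) / sum(g)

taus = [0.0, 0.25, 0.5, 0.9]
plain_errors = [centroid(sample_impulse(t)) - t for t in taus]
conv_errors = [centroid(convolve(sample_impulse(t), g)) - (t + g_delay) for t in taus]
print(plain_errors, conv_errors)
```

The individual sample values change with `tau`, but the center of gravity tracks `tau` to within rounding error. After convolution with `g` it is offset only by the constant delay of `g`, which is the sense in which the property survives the convolution in this toy case.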

You may have realized by now that the comparatively recent theoretical advances in sampling theory attempt to deal with non-band-limited signals, or more exactly, to reconcile the fact that bandwidth and information content are not synonymous.

It is important to re-emphasize that the terms we commonly use, such as system end-to-end impulse response, characteristic response, and average kernel response, are convenient ways to express important ideas. However, in the real world we do not have impulses in air. We do not listen to impulses. In fact, the power spectra of all the signals to which we listen are radically different from those of these test signals. In music, speech, and environmental sounds, the spectral energy decays as frequency rises, and normally that energy spectrum has decayed below the system noise floor before the "Nyquist frequency" of our "Encapsulation" (which includes a resampler when the signal sample rate is higher than the kernel rate). Aliasing cannot be a problem if there are no signals to alias. [Stuart's emphasis]

So, MQA is designed to ensure that the center of gravity of a reproduced pulse is at exactly the correct place. Hence, to the extent that the ear determines the "timing" of a pulse by estimating the center of gravity, MQA has no timing error at all. It would be strange if the ear used a measure radically different from the center of gravity, but an alternative measure such as the start of the leading edge leads to a result that differs only slightly (for example, by 2.6µs). Compare that with an error of about 13µs if a 192kHz stream has been sinc-filtered to a Nyquist of 96kHz and the ear mistakenly latches on to the first positive pre-pulse 13µs away, or with an error of approximately 26µs if the stream is sinc-filtered to 48kHz in preparation for transmission at 96kHz.
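The 13µs and 26µs figures can be reproduced from the shape of an ideal sinc response. The sketch below assumes an ideal sinc low-pass with its cutoff at the stated Nyquist frequency and locates the peak of the first positive side lobe ahead of the main peak. The function names are hypothetical, and the search interval relies on the known fact that this lobe lies between the second and third zero crossings.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with zeros at nonzero integers."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def first_positive_lobe_x():
    """Peak of the first positive side lobe, found by ternary search on (2, 3)."""
    lo, hi = 2.0, 3.0
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sinc(m1) < sinc(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def pre_pulse_offset_seconds(nyquist_hz):
    """Time between the main peak and the first positive pre-pulse.

    For a cutoff at nyquist_hz the zero crossings are spaced 1 / (2 * nyquist_hz)
    apart, so x = 1 corresponds to that interval.
    """
    return first_positive_lobe_x() / (2.0 * nyquist_hz)

offset_96k = pre_pulse_offset_seconds(96_000.0)
offset_48k = pre_pulse_offset_seconds(48_000.0)
print(f"Nyquist 96kHz: {offset_96k * 1e6:.1f} us, Nyquist 48kHz: {offset_48k * 1e6:.1f} us")
```

Under these assumptions the offsets come out at roughly 12.8µs and 25.6µs, consistent with the rounded values quoted above.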

Even this small error is with a highly unrealistic test signal. With actual music, in the application for which MQA was designed, this effect is either simply not present, or exists at such a low level that it is considered, by us, to be immaterial to the human listener.

Austin: Above, you said, "Aliasing cannot be a problem if there are no signals to alias." Is it not similarly true that time smear itself does not occur if there are no signals to alias?

Stuart: Any deviations that aliasing brings to the "impulse response" (when analog is being uniformly sampled) are quite different from the impact of the filters controlling (and contributing to) end-to-end system response. The latter is there whether or not filtering is adequate to control or eliminate aliasing. Time smear relates to the fact that the "filter" spreads every sample out in time, irrespective of frequency—particularly in the "real world," where we take into account quantization (and sometimes aliasing) effects in A/D, workstations, and DACs.

This smear, we believe, can be material for the human listener who is extracting multiple cross correlations, as well as envelope and nonlinear measures of the audio.



Footnote 4: This is the paper referenced in footnote 2. The relevant text: "Aliasing in the frequency domain is equivalent to the time-domain phenomenon of an impulse response that depends on where, relative to the sampling instants, the original stimulus was presented: see footnote 8." Footnote 8 reads: "The complication is that because of the sampling, the total system is not time-translation invariant and so does not have a unique 'impulse response'—the response is slightly different according to the position of an original impulse relative to the sampling points."

Footnote 5: Note that parenthetical "(and sinc)," which implies that this property of MQA is shared by the usual Shannon approach to sampling—that is, by old-fashioned PCM. For more on the Strang-Fix conditions and their implications, see Stereophile's "MQA: Questions and Answers."—Jim Austin