Some Real-World Comparisons
Air attenuates high frequencies and disperses transients, but in a way that is completely familiar. Can we therefore mimic this behavior to give a 'more natural' system response whose only effect would be to effectively move the listener a short but familiar distance from the source? That this is possible can be seen in Figure 9.
It's useful to know the original sampling rate of the mastering process, because that tells us about the first part of the chain. The DAC is equally important; a chain is only as strong as its weakest link. More importantly, unless the encoder (A/D or mastering) and decoder (plus D/A) processes are complementary, it isn't possible to reach the final result, and certainly not at low data rates. That means that we can't solve this problem either in the studio or in the DAC alone; we have to get both ends right and working together. Meddling with the DAC alone never reaches this level of performance and cannot achieve reproducibility.
Our ideal DAC has zero modulation noise and a compact impulse response. Of the many converter chips out there today, the best give the MQA decoder direct access to the modulator. Failing that, we will generally want to minimize the on-chip processing; that means driving the DAC as fast as possible and with tailored filters. It also involves shaping the resulting impulse response so that it fits into the conceptual hierarchy described in [1]; that way, the sound most closely matches the studio preview.
Our brain stem, which is very responsive to fine time structure, extracts the envelope of many sounds. Knowing this, we can more clearly understand why the sinc kernel is less appropriate than other kernels for natural sounds and other sounds of human interest. [1][9]
The graph below is from a model of the neural significance of the temporal blur introduced by the filters in different systems. One thing we can see is that, as listening tests bear out, MQA can significantly improve low-rate digital sources.
Temporal blur refers to dispersion in time in a single channel and can occur in a purely analogue system. Digital systems tend to confound the picture by employing filters that exhibit pre- and post-echo, which smear the signal more than quite long columns of air do and are significant with respect to the limit of human perception.
Figure 7: Impulse response of a typical CD channel showing the dispersion in time over ±4ms.
Figure 8: End-to-end impulse magnitude response of MQA compared to typical 192kHz and 48kHz systems and Air at STP and 30% RH. Compared to typical 192kHz sampling, temporal blur of the example is lowered by an order of magnitude. Note the expanded timescale and that only part of the 48kHz response is shown (it extends ±4ms).
Figure 7 shows the end-to-end impulse response for a typical CD channel operating at 44.1kHz. Figure 8 compares several channels. Both plots show the magnitude of the impulse response because, as [2] and its references show, the envelope of the sound is critical to perception.
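As a rough illustration of why a sinc-like channel spreads an impulse over milliseconds, the sketch below evaluates the envelope of an ideal sinc kernel, which falls off only as 1/(π·fs·|t|). This is an assumption-laden simplification: it uses the textbook ideal kernel, not the measured CD channel of Figure 7, and the function name `sinc_envelope_db` is hypothetical.

```python
import math

def sinc_envelope_db(fs_hz: float, t_s: float) -> float:
    """Envelope of the ideal sinc kernel sin(pi*fs*t)/(pi*fs*t), in dB
    relative to its peak, at time offset t_s from the centre.

    Assumption: ideal (unwindowed) sinc; real filters differ.
    """
    x = abs(fs_hz * t_s)            # offset measured in sample periods
    if x <= 1.0 / math.pi:          # the 1/(pi*x) bound exceeds 1 here; clamp to the peak
        return 0.0
    return 20.0 * math.log10(1.0 / (math.pi * x))

if __name__ == "__main__":
    # Envelope level of a 44.1kHz ideal sinc at several offsets from the main impulse.
    for t_ms in (0.5, 1.0, 4.0):
        level = sinc_envelope_db(44_100.0, t_ms * 1e-3)
        print(f"{t_ms:4.1f} ms: {level:6.1f} dB")
```

Under these assumptions the envelope is still only about 55dB below the peak at 4ms, on both sides of the main impulse, which gives a sense of the timescale shown in Figure 7.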
Above all, the human listener is confused by responses that precede the main impulse (leading up to the centroid). The increasing evidence of acute sensitivity to time/frequency balance and practical experiments showing deterioration in sound quality from steep filters have led Stuart and Craven to conclude that the most appropriate benchmark against which to judge the blurring in a sound reproducing system is air itself. [2]
Figure 9: Here we are looking at two impulses spaced by 26µs and examining the response of three different columns of air and an MQA system.
Figure 10: Here we are comparing just the 48kHz sinc system with air. Graphically we can infer that the 48kHz system puts us somewhere closer than 25m away and the twin impulses cannot be distinguished. In this and the next two figures the response follows a power-law as an estimate of detectability.
Figure 11: Here we are showing that 192kHz sinc sampling doesn't provide the detail separation of 10m of air and probably also fails to separate the two events. Note also the significant 'false-alarm' in the pre-echo.
Figure 12: Here, on a finer time-scale, we are looking at the two impulses spaced by 26µs and showing that the MQA system resolves them similarly to 5m of air.
A central axiom of MQA is that the sound we hear is analogue; digital technology is most useful for storage, transformation or transmission.
Figure 13: Modeled temporal blur vs sample rate in sinc systems compared with MQA.
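The 48kHz case in Figure 10 can be sketched numerically. The code below is a minimal illustration, not the model behind the figures: it assumes each impulse is replaced by an ideal sinc kernel, omits the power-law detectability estimate, and uses a simple hypothetical criterion (`has_dip`) that asks whether the summed magnitude is lower midway between the two impulses than at an impulse itself.

```python
import math

def sinc(x: float) -> float:
    """Normalised sinc: sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def pair_response(t_s: float, fs_hz: float, spacing_s: float) -> float:
    """Magnitude at time t_s of two unit impulses (at 0 and spacing_s),
    each spread by an ideal sinc kernel for sample rate fs_hz."""
    return abs(sinc(fs_hz * t_s) + sinc(fs_hz * (t_s - spacing_s)))

def has_dip(fs_hz: float, spacing_s: float) -> bool:
    """Hypothetical resolution check: True if the response midway between
    the impulses is lower than the response at the first impulse."""
    at_impulse = pair_response(0.0, fs_hz, spacing_s)
    midpoint = pair_response(spacing_s / 2.0, fs_hz, spacing_s)
    return midpoint < at_impulse

if __name__ == "__main__":
    spacing = 26e-6                      # 26 microseconds, as in the figures
    print(spacing * 48_000.0)            # spacing in sample periods at 48kHz
    print(has_dip(48_000.0, spacing))    # whether the pair shows a dip at 48kHz
```

At 48kHz the spacing is only about 1.25 sample periods, and under these assumptions the summed response has no dip between the two impulses, so they merge into a single event. Because the check ignores real filter shapes and the detectability weighting, it should not be read as reproducing the figures at other sample rates.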
To Sum-up















