I spoke to the team internally for some clarification on the oversampling approach for the MLA-4 (and why it goes no lower than 4x).
Turns out, the question of lowering the oversampling rate actually touches on some technical ground that I’ll do my best to explain - and I’m more than happy to get clarification on any outstanding questions.
For the deep details, I’ll leave those to the powers that be well above my pay grade.
For the MLA-4, oversampling should probably be called “quality factor” or something along those lines - because that’s exactly what it’s controlling.
First off, the relationship between oversampling and audio quality isn't as straightforward as it might seem. The actual quality achieved depends on how many samples converge to a highly accurate solution and, for those that don't, how close they get to the target. This varies significantly based on your input material - some audio content requires more processing iterations than others to achieve accuracy.
Here’s a pretty top-level view of what it’s doing…
Instead of the usual "double sample rate = double CPU" relationship, the team has developed an adaptive iterative process to solve differential equations that match the hardware behavior.
For every sample, we need to determine the exact output the analog circuit would produce - but since many analog circuit equations are too complex to solve directly (or no closed-form solution exists), we use an iterative loop that numerically refines a solution to those equations each sample.
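To make that a bit more concrete, here's a rough sketch of the general technique. To be clear, this is not the MLA-4's actual code or circuit equations (those stay with the DSP team) - it's a textbook-style diode clipper with made-up component values, where each output sample is found by Newton iteration on an implicit equation:

```cpp
// Illustrative only - a generic diode-clipper stage, not the MLA-4's model.
// Backward Euler turns the circuit ODE into an implicit equation per sample,
// which we solve with Newton iteration. All component values are hypothetical.
#include <cmath>

struct DiodeClipperStep {
    double R  = 2200.0;   // series resistance (ohms, made up)
    double C  = 10e-9;    // capacitance (farads, made up)
    double Is = 1e-12;    // diode saturation current
    double Vt = 0.026;    // thermal voltage

    // Circuit ODE right-hand side: C*dv/dt = (vin - v)/R - 2*Is*sinh(v/Vt)
    double f(double vin, double v) const {
        return (vin - v) / (R * C) - (2.0 * Is / C) * std::sinh(v / Vt);
    }

    // Solve g(v) = v - vPrev - T*f(vin, v) = 0 for the new output v.
    double solve(double vin, double vPrev, double T,
                 double tol, int maxIters, int& itersUsed) const {
        double v = vPrev;  // warm start: last sample's output is the first guess
        for (itersUsed = 0; itersUsed < maxIters; ++itersUsed) {
            const double g = v - vPrev - T * f(vin, v);
            if (std::fabs(g) < tol) break;   // reached target accuracy
            // dg/dv = 1 - T * df/dv
            const double dfdv = -1.0 / (R * C)
                                - (2.0 * Is / (C * Vt)) * std::cosh(v / Vt);
            v -= g / (1.0 - T * dfdv);       // Newton-Raphson update
        }
        return v;
    }
};
```

The key detail is the warm start: each sample begins its search from the previous sample's answer.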
This means that at higher sample rates (and higher calculation precision), fewer iterations are required to reach our target accuracy - the signal changes less from one sample to the next, so the solver starts each sample much closer to the correct result, and both factors help us converge more quickly.
In practical terms, doubling the sample rate might only increase CPU by 20% instead of the traditional 100% because we reach the accuracy threshold faster.
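As a hypothetical illustration of why that happens (the real 20% figure comes from the team - the exact savings depend on the material and the solver), here's a usage example that reuses the DiodeClipperStep sketch above. It processes the same signal at a 1x and a 4x rate and counts total Newton iterations; because the warm start sits closer to the answer at the higher rate, fewer iterations per sample are needed, and total work typically grows by less than the 4x jump in sample count:

```cpp
// Illustrative only - reuses DiodeClipperStep from the sketch above.
#include <cmath>
#include <cstdio>

int main() {
    const double kPi = 3.14159265358979323846;
    const double rates[] = { 48000.0, 192000.0 };   // "1x" vs "4x"
    DiodeClipperStep clipper;

    for (double fs : rates) {
        const double T = 1.0 / fs;
        double v = 0.0;          // output state, carried sample to sample
        long totalIters = 0;
        // One second of a loud 1 kHz sine driving the clipper hard.
        for (long n = 0; n < static_cast<long>(fs); ++n) {
            int iters = 0;
            const double vin = 4.0 * std::sin(2.0 * kPi * 1000.0 * n * T);
            v = clipper.solve(vin, v, T, 1e-9, 50, iters);
            totalIters += iters;
        }
        std::printf("fs = %6.0f Hz: %ld total Newton iterations\n",
                    fs, totalIters);
    }
    return 0;
}
```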
We use 4x oversampling (192kHz) as our baseline because it provides sufficient time resolution to accurately model the analog circuit behavior while maintaining reasonable CPU efficiency.
Higher oversampling rates like 8x or 16x allow for more precise iteration steps and better convergence, which is particularly useful for complex source material or offline processing where computational cost is less critical.
The tradeoff is between processing time and solution accuracy - higher sample rates give the iterative solver finer steps and better starting points, making it easier to converge to an accurate solution, but they require more CPU resources.
However, there’s no noticeable difference between 4x and 16x in most cases.
So, while we could implement lower oversampling rates, it would work against the efficiency of our processing method and degrade the sound quality - a compromise we’re not willing to live with.
We're working on a more detailed technical article that dives deep into this approach. It's pretty cool to look under the hood and see how this kind of circuit modeling works. I hope to include insights directly from the DSP team on the development process and the challenges they solved along the way.
That being said, for now, if you're experiencing CPU hits when running multiple instances, I'd recommend printing some of the tracks as you go. As mentioned, dropping the oversampling any lower would stray way too far from our quality standards.