Quote:
Originally Posted by
kraku
Just to be sure we're talking about the same thing here:
Do you mean how well human hears a signal from under another signal, which is playing in much higher volume? I.e. you have signal A around frequency X and a second signal B at much lower volume around the same frequency X: how well signal B is perceived from under signal A by a human.
Yes.
Quote:
Originally Posted by
kraku
Won't that be taken into account at the step where the difference of the signals (original vs. the one that went through the circuitry, calculated in frequency space) is compensated for the human hearing system's sensitivity at different frequencies?
In other words, the human hearing system's sensitivity changes with frequency. But the masking itself depends on the relative volume levels of signals A and B around any given frequency. So if the difference signal (in frequency space) is adjusted to the human hearing sensitivity curve at all those frequencies, wouldn't that result in a graph showing how well a human perceives the difference at those frequencies? Now if you add all the energies of those frequencies together, you should (in theory?) get one number that tells you how audible the whole difference signal is from under the original signal, i.e. how well a human can hear the difference between the original and the processed signal.
Well, that's a sizable task, scaling a difference based on a dynamic model of human hearing. You're into the world of perceptual coding there, and that's a really complicated world. The precision and quality of lossy codecs are still evolving.
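Just to make the idea concrete, here's a minimal sketch of that "one number" approach, assuming an FFT of the difference signal weighted by a static sensitivity curve. A-weighting is used only as a crude stand-in for the ear's sensitivity, the function names are mine, and it ignores masking and SPL dependence entirely, which is exactly why it's only a starting point:

```python
# Sketch: spectral difference weighted by a static hearing-sensitivity curve,
# summed into a single energy figure. Assumes the two signals are already
# time-aligned and level-matched.
import numpy as np

def a_weighting_db(f):
    """IEC 61672 A-weighting in dB (0 dB at 1 kHz)."""
    f2 = np.asarray(f, dtype=float) ** 2
    ra = (12194.0**2 * f2**2) / (
        (f2 + 20.6**2)
        * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20.0 * np.log10(np.maximum(ra, 1e-30)) + 2.0

def audibility_number(original, processed, fs):
    """Single weighted-energy figure for the difference signal (a sketch only)."""
    n = min(len(original), len(processed))
    diff = processed[:n] - original[:n]               # time-domain difference
    spectrum = np.abs(np.fft.rfft(diff * np.hanning(n)))
    freqs = np.fft.rfftfreq(n, 1.0 / fs)
    weights = 10.0 ** (a_weighting_db(freqs) / 20.0)  # crude sensitivity proxy
    return float(np.sum((spectrum * weights) ** 2))
```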
Quote:
Originally Posted by
kraku
Hmm, I'm not 100% sure if that's accurate. The mechanism could be somehow different. What comes to mind is how dithering of digital signals works:
When you convert digital audio to a lower bit depth, you get small extra peaks in your signal. You can make these much less audible by dithering, which transforms those peaks into noise spread across a wide range of the frequency space. This noise is much less audible to human hearing.
The difference here is that we're talking about noise with dithering vs. boosting/lowering the original signal, i.e. the change in the signal type isn't radical in your example. This might affect how a human perceives those signal types.
But otherwise, the perceivability of peaks vs. dips could be an issue for this test, i.e. we seem to be moving into the realm of psychoacoustics. I know very little about the topic, but I know enough to say it could potentially get really complex really fast.
What I said is 100% correct, and it is not a new concept. It's why you can make a 1dB change in a filter tuned at 1kHz with a very low Q and it's easily heard, but you can put a 40dB deep notch at 1kHz with a Q of 10000 and it's not heard at all. It's always been easiest to think of response-change audibility as the area below or above the curve that changed. That's not quite right, because gain is easier to hear than loss, but as a rule of thumb it holds. My first real experimentation with this goes back to a UREI 565T filter set to notch out single-frequency tones in broadband audio, though I played with deep gyrator-based notch filters years before that.
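As a rough illustration of that area-under-the-curve way of thinking, here's a sketch comparing a broad 1 dB bell at 1 kHz against a 40 dB notch with a Q of 10000. It uses the common RBJ "cookbook" peaking-EQ biquad; the area metric (|dB| integrated over log frequency) and the specific parameter choices are just my assumptions, not a formal audibility measure:

```python
# Compare the "area of the response change" for a broad 1 dB bell
# vs. a deep, ultra-narrow notch, both centered at 1 kHz.
import numpy as np
from scipy.signal import freqz

def peaking_biquad(f0, gain_db, q, fs):
    # RBJ cookbook peaking EQ coefficients
    A = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2.0 * q)
    b = [1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A]
    a = [1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A]
    return np.array(b) / a[0], np.array(a) / a[0]

def response_area(b, a, fs, n=1 << 16):
    f = np.logspace(np.log10(20.0), np.log10(20e3), n)   # 20 Hz .. 20 kHz
    _, h = freqz(b, a, worN=2 * np.pi * f / fs)
    mag_db = 20 * np.log10(np.abs(h))
    return np.trapz(np.abs(mag_db), np.log2(f))          # dB * octaves

fs = 48000
broad = peaking_biquad(1000.0, 1.0, 0.7, fs)        # gentle 1 dB bell: easily heard
notch = peaking_biquad(1000.0, -40.0, 10000.0, fs)  # deep, ultra-narrow notch
print(response_area(*broad, fs), response_area(*notch, fs))
```

The printed numbers make the point: the gentle bell covers far more "area" than the deep notch, even though the notch looks far more dramatic on paper.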
Quote:
Originally Posted by
kraku
Hmm. The even vs. odd harmonics could be taken into account fairly easily in the test if the signal used in the test were a sine wave. Then it's easy to pick even/odd harmonics and give them different weights when calculating the total "audibility" of the difference signal.
But harmonic audibility is more complex than that. It's not only the distribution, which could be weighted; it's also about masking, because while harmonic distortion, even or odd, is audible with pure tones, it's not nearly as audible with complex waveforms. For example, a device with 3% THD, even order, will still sound pretty clean, but the same level of odd harmonics will not. But when you put music through that device, 3% even-order becomes pretty much inaudible, whereas 3% odd is starting to sound pretty bad. You also have to consider that the analog mechanisms that generate harmonic distortion are different, and can co-exist to varying degrees. I highly recommend reading up on analog systems and distortion mechanisms, non-linearities, etc. Just way too much for me to be writing up here.
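If you want to hear the pure-tone end of that comparison yourself, a quick way to set it up is something like this: a 1 kHz tone with 3% of purely 2nd-harmonic vs. purely 3rd-harmonic content added. The file names and levels are arbitrary choices of mine, and a static file like this deliberately can't show the masking behaviour with music:

```python
# Build two test tones for a listening comparison:
# 1 kHz + 3% 2nd harmonic (even) vs. 1 kHz + 3% 3rd harmonic (odd).
import numpy as np
from scipy.io import wavfile

fs = 48000
t = np.arange(int(fs * 3.0)) / fs        # 3 seconds
fund = np.sin(2 * np.pi * 1000.0 * t)

even = fund + 0.03 * np.sin(2 * np.pi * 2000.0 * t)   # 3% 2nd harmonic
odd  = fund + 0.03 * np.sin(2 * np.pi * 3000.0 * t)   # 3% 3rd harmonic

for name, sig in (("even_3pct.wav", even), ("odd_3pct.wav", odd)):
    sig = 0.5 * sig / np.max(np.abs(sig))              # leave headroom
    wavfile.write(name, fs, (sig * 32767).astype(np.int16))
```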
Quote:
Originally Posted by
kraku
The audibility differences when going up the harmonics should probably already be taken into account by my original idea: adjust the difference signal's frequencies according to the human hearing system's sensitivity.
Human hearing is not as simple as a 2D model like that. Hearing's spectral sensitivity changes with SPL and frequency, and is directly affected by masking effects and harmonic energy distribution. Not simple.
Quote:
Originally Posted by
kraku
I'm not sure what the "specific nonlinearity" vs dynamic audibility means in this context, though.
If you consider just two radically different transfer functions, I think you'll see my point. One might be a type of nonlinear response that changes slowly and evenly over a wide dynamic range. The other might be a function that is nearly perfectly linear up to a point, then goes radically nonlinear above that. Both can generate harmonic distortion, but they would sound radically different from each other. Now take that first transfer function into the other quadrant and make the curve in that quadrant different. You've changed the balance of the even/odd harmonic distribution, and changed the audibility again.
The only simple way to say it is that distortion audibility is affected by the order of the harmonics generated, the spectral energy distribution of all harmonics, and the presence of other masking signals, combined with the specific SPL. Remember, the ear is also nonlinear!
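Here's a small sketch of those two kinds of transfer function and what they do to a sine wave's harmonic structure. tanh() stands in for the slow, even nonlinearity, a hard knee for the linear-then-abrupt one, and an asymmetric variant shows the even/odd balance shifting; the specific curves and levels are just illustrative assumptions:

```python
# Three stand-in transfer functions applied to a 1 kHz sine,
# then the relative level of harmonics 2..6 read off an FFT.
import numpy as np

fs = 48000
t = np.arange(fs) / fs
x = 0.9 * np.sin(2 * np.pi * 1000.0 * t)

soft = np.tanh(2.0 * x) / np.tanh(2.0 * 0.9)                  # gradual, even saturation
hard = np.clip(x, -0.7, 0.7)                                  # linear, then abrupt clip
asym = np.where(x >= 0, np.tanh(2.0 * x), np.tanh(1.2 * x))   # different curve per quadrant

def harmonic_levels(y, f0=1000.0, n_harm=6):
    """Levels of harmonics 2..n_harm in dB relative to the fundamental."""
    spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))
    bins_per_hz = len(y) / fs
    fund = spec[int(round(f0 * bins_per_hz))]
    return [20 * np.log10(spec[int(round(k * f0 * bins_per_hz))] / fund + 1e-12)
            for k in range(2, n_harm + 1)]

for name, y in (("soft", soft), ("hard", hard), ("asym", asym)):
    print(name, [round(db, 1) for db in harmonic_levels(y)])
```

The symmetric curves produce essentially odd-only harmonics; breaking the symmetry brings the even orders up, which is the quadrant point above.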
Quote:
Originally Posted by
kraku
I haven't given this much thought (I'd have to do some research and testing on the subject), but with hard clipping the signal changes abruptly, thus creating large quantities of fairly large-amplitude (i.e. loud) higher frequencies. Soft clipping changes the signal gradually, thus introducing new content more toward the lower frequencies.
The harmonics generated by either one always have higher energy at lower frequencies, with each successive harmonic's energy falling off as it gets further removed from the fundamental.
But this is just simple harmonic distortion. Intermodulation distortion is in many cases more audible, more objectionable, and comes in many different styles. It is nearly impossible to have low THD and high IMD, but certain devices with dynamic gain control can actually measure that way.
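For the IMD side, a bare-bones two-tone check looks something like this: the classic SMPTE-style 60 Hz + 7 kHz pair mixed 4:1, run through a stand-in nonlinearity, with the sidebands around 7 kHz read off an FFT. The soft clipper and the exact levels are my own assumptions, not a calibrated measurement:

```python
# Two-tone (SMPTE-style) IMD sketch: 60 Hz + 7 kHz at 4:1 through a
# soft clipper, then the 7 kHz +/- k*60 Hz products relative to the 7 kHz tone.
import numpy as np

fs = 96000
t = np.arange(fs) / fs                    # 1 second -> 1 Hz per FFT bin
x = 0.8 * np.sin(2 * np.pi * 60.0 * t) + 0.2 * np.sin(2 * np.pi * 7000.0 * t)
y = np.tanh(1.5 * x)                      # stand-in nonlinear device

spec = np.abs(np.fft.rfft(y * np.hanning(len(y))))

def level_db(freq):                       # one bin, relative to the 7 kHz tone
    return 20 * np.log10(spec[int(round(freq))] / spec[7000] + 1e-15)

for k in (1, 2, 3):                       # IMD products at 7 kHz +/- k*60 Hz
    print(k, round(level_db(7000 - 60 * k), 1), round(level_db(7000 + 60 * k), 1))
```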
Quote:
Originally Posted by
kraku
If you play two signals, a low-frequency one and a high-frequency one, both at amplitude X, the high-frequency one is much more audible, or at least more jarring to the ear. Regular sounds/music have the approximate frequency curve of high amplitude at lower frequencies, gradually leveling off as you go up in frequency. This is what the human hearing system has developed to receive. This could explain why the unnaturally large high-frequency content from hard clipping can be so audible vs. soft clipping.
You have to be a little careful when considering the human hearing response characteristic. Yes, it's a very non-flat curve, but it is also a constant "mask" applied to all hearing to a greater or lesser degree. I would agree that a high level of the 3rd harmonic of 1kHz would be more audible than the same level of the 3rd harmonic of 6kHz, or of 20Hz, but again, the basic hearing sensitivity curve is just one part of the story.
Quote:
Originally Posted by
kraku
If there is a difference in the perception of IMD when there are more than a couple of sine waves in the test signal, we're getting deep into psychoacoustic territory and I've no idea (yet) how to take any of that into account. Sounds complicated to test definitively.
The test isn't all that complicated anymore, because we have computers, software and really good audio interfaces. The reference paper is "Spectral Contamination Measurement" by Deane Jensen and Gary Sokolich, Nov. 1988. He had to use a rather cumbersome test setup, but it revealed a lot of audibility information that was not available before. He did not go on to scale the data to audibility, though. Multi-tone generation and analysis is now built into REW: you can generate a multi-tone test signal similar to Jensen's setup, and now that audio interfaces and FFTs are of better resolution, you can get back the kind of data Jensen did with minimal effort. At least one commercial audio test product company has adopted some of this technology, but since their market is automated industrial testing, and the products are financially out of my world, I haven't followed them much.
BTW, if you're not a member of the AES, I recommend joining just for access to the papers. SO much information there. Not all of it is definitive, of course, but it's where most of the cutting-edge audio stuff is published.