A potential flaw with ABX/Double-blind testing - Page 3 - Gearspace.com
adrianww, 23rd April 2014
Originally Posted by Arthur Stone:
ABX is much touted as a way of measuring if listeners determine differences between audio files; my point is that if no controls are present then bias will change the significance of the results (as explained above).

I'm asking how the people operating ABX tests are accounting for the bias: that seems reasonable and worthwhile.
The point is, when you think it through, control groups (in the sense of other test subjects) cannot really apply to an ABX test and - more importantly - bias is not actually much of an issue in terms of the significance of the results.

A properly conducted ABX test is a test of individual perception. You can't use a control group to say that a particular test subject has any kind of bias since it may well be the case that the one person who said that they can't hear a difference is still telling the truth. The fact that two dozen (or two million) other people can hear a difference might suggest that the one person who didn't is playing silly buggers (either deliberately or unconsciously), but you can't say that for sure. That one person really might be unable to hear the differences that others can. So control groups, in the sense of other test subjects taking the same test, don't really work with an ABX test in the way that they do with, say, a medical test or other test where you're dealing with directly measurable physical effects.

Of course, you can do control experiments such as those mentioned earlier, where you carry out an "AAX" test without the test subject being aware of it. Or you can carry out a test where the two sources are so dramatically different that even the chronically hard of hearing should be able to tell them apart. But that still doesn't really tell you anything concrete about your test subject when it comes to the real test that you want to administer. A repeated (as in statistically significant) "non-random" result out of the AAX test might tell you that you've got your experimental setup a bit screwy. A completely random result on the widely differing test might suggest that your test subject is being bloody stupid (or just not paying attention). But neither of those allows you to control for any bias in the test subject when it comes to the real test.
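
For anyone who wants to put a number on "statistically significant" for the AAX check, here is a rough sketch in Python. It assumes each trial is an independent 50/50 guess when nothing is audible, and that the rig still assigns a nominal "A" or "B" label to X on each trial even though both play the same source. The function name two_sided_p_value is made up for illustration, not part of any ABX tool.

```python
from math import comb

def two_sided_p_value(matches: int, trials: int) -> float:
    """Chance, under pure guessing (p = 0.5), of a score at least as far
    from trials / 2 as the observed one, in either direction.

    `matches` is the number of trials where the subject's answer agreed
    with the nominal (hidden) label of X.
    """
    if not 0 <= matches <= trials:
        raise ValueError("matches must be between 0 and trials")
    distance = abs(matches - trials / 2)
    # Count every outcome that is at least as lopsided as the observed one.
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= distance
    )
    return extreme / 2 ** trials
```

A two-sided check is used because a setup that leaks information could push the score well above or well below half. For example, 14 matches out of 16 on an AAX run gives roughly 0.004, which would point at the setup rather than at the listener.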

However...and it's an important however...any bias that the test subject may have is actually largely irrelevant. The reason for this is that it is only really possible for such a bias to work in one direction - the one that leads to a false negative, rather than a false positive. If someone, consciously or unconsciously, doesn't want to hear a difference then carrying out an ABX test on that person can ONLY tell you that they either don't or won't hear what's there. If you give the same test to half a dozen other people and they can all hear the difference between A and B (to whatever degree of statistical significance you fancy) then you can have some confidence that some kind of audible difference is present. Yes, you may have one subject who can't (or won't) hear it, but your other test subjects can counter that.

The really important point though is that you CAN'T really have a false positive - unless it's by some highly improbable fluke (the statistical likelihood of which can itself be reduced as far as you want by repeated testing). It doesn't matter if your test subject is highly biased in favour of wanting to hear something. It doesn't matter if they swear blind on the holy book of their choice that there is a difference. If they can't then pick the difference out in an ABX test, then they're talking out of their hat. Moreover, if other test subjects give the same results (i.e. they can't hear the difference either) then, as with the previous case, you can have some degree of confidence that any claimed difference isn't really there. Or isn't anything like as glaring or obvious as some may have claimed it to be.
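
To show how small that fluke probability gets, here is a minimal sketch in Python. It assumes a subject who hears nothing is guessing each trial independently at 50/50, and the name chance_of_fluke is purely illustrative.

```python
from math import comb

def chance_of_fluke(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` answers right out of
    `trials` ABX trials by guessing alone (p = 0.5 per trial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Upper tail of the binomial distribution with p = 0.5.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials
```

On those assumptions, 9 right out of 10 happens by luck about 1.1% of the time, 12 out of 16 about 3.8%, and a perfect 16 out of 16 about once in 65,536 runs. Adding trials, or repeating the whole test, pushes the number down further.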

That is the real value and robustness of the ABX test method - there isn't any way to fake the positive result, regardless of how much the test subject may want to. And it's the positive result that is the interesting one. If someone claims to hear a clear difference between two sources then they should be able to ace an ABX test. Such a person is likely to be predisposed to hear any difference that may be there (or to try their damnedest to do so). So if it then turns out that they can't tell the two things apart, you know that their claim is demonstrably (I'd even be tempted to say provably) false.