
Limitations of blind testing procedures

OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,938
Location
Oslo, Norway
Perhaps, if you guys went out and bought a parametric equalizer, or used one on your computer, and tried changing how your favorite songs sounded, you might come to terms with how easy it is to change audio, how many permutations can all sound equally good, and how you can change the way your favorite song is equalized every day and still have a good experience. I say this not to discount blind tests, but to acknowledge that WE are the variable here, and the mega-buck unit that sounded great to you today could very well not sound as great a week from now. I also say this to point out that minute variations on your favorite song, made with the parametric equalizer, will not be audible to you, yet the song is not the same - kind of like splitting hairs over which good-quality gear sounds "better". To see results, you need to be listening to different amp topologies, different speakers, etc., and all the time with your head locked in a vice, as a movement of 6 inches can change the sound level by 6 dB depending on your room modes, etc.

Don't forget, if you're of advancing years, you just threw out everything above, say, 6 kHz or 8 kHz, or wherever you pick. The equalizer will tell you a lot about your hearing or lack thereof. I agree that it is important to purchase something whose looks you like and whose value reaches your threshold, and if you look at the specs and they are about the same as the mega-priced units, you are in pretty good shape anyway.
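A minimal sketch of the kind of parametric EQ tweak described above (Python with numpy/scipy; the filter parameters and the noise stand-in for programme material are illustrative choices, not anything from the thread):

```python
# One peaking-EQ band (RBJ Audio EQ Cookbook formulas) applied to a broadband
# signal, to show how small a 1 dB tweak is relative to the whole waveform.
import numpy as np
from scipy.signal import lfilter

def peaking_eq(fs, f0, gain_db, q):
    """Biquad coefficients for one peaking-EQ band (RBJ cookbook)."""
    a_lin = 10 ** (gain_db / 40)              # square root of the linear gain
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return b / a[0], a / a[0]

fs = 48_000
rng = np.random.default_rng(0)
x = 0.1 * rng.standard_normal(10 * fs)        # broadband stand-in for a song

b, a = peaking_eq(fs, f0=3000, gain_db=1.0, q=1.0)   # a gentle 1 dB bump at 3 kHz
y = lfilter(b, a, x)

rms = lambda s: np.sqrt(np.mean(s ** 2))
print(f"overall level change: {20 * np.log10(rms(y) / rms(x)):+.2f} dB")
# The waveform is no longer the same, yet the overall level barely moves,
# and a tweak this small is usually hard to pick out on real programme material.
```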

In my personal hifi choices, I tend to agree with you. I spend 95% of my hifi budget on speakers, and ideally I would have liked all speakers to be fully active solutions so that I didn't have to make any choices about electronics at all. Much more rational from every conceivable perspective, as I see it. But we're discussing the theoretical principles here - what weight we can/should assign to blind tests in our beliefs about hifi/audio.
 

tomelex

Addicted to Fun and Learning
Forum Donor
Joined
Feb 29, 2016
Messages
990
Likes
572
Location
So called Midwest, USA
In my personal hifi choices, I tend to agree with you. I spend 95% of my hifi budget on speakers, and ideally I would have liked all speakers to be fully active solutions so that I didn't have to make any choices about electronics at all. Much more rational from every conceivable perspective, as I see it. But we're discussing the theoretical principles here - what weight we can/should assign to blind tests in our beliefs about hifi/audio.


Pretty much my whole audio life, I have been confident that I can pretty quickly decide if I like a "sound" or not. Typically, at audio shows, I can walk in and quickly determine if I want to stay in the room or not. Now, I am aware of all the reasons sound can be what it is, but I don't have time at a show to analyze stuff that does not sound "good enough" and do a study on it. I know what I like, depending on what my goal is. For shows, 75% of it is hearing speakers, as there are no audio salons very near me with enough range of stuff to make an informed decision. Yes, audio shows are really only there, for me, to introduce me to the possibility of what a speaker may sound like.

So, what's this got to do with the weight assigned to blind listening tests? Well, walking into a hotel room at a show and listening to a speaker you never actually heard before is virtually blind, and I trust my ears to tell me if it's worth staying or moving on. That means yes, a big weight assigned. Is it fair? Well, when you are in the culling process to narrow down stuff, once you get to the few sounds that you like, then you can do further investigation, such as playing the songs you brought along, because you know what is on the recording and what you like to listen for.

I know what kinds of sound appeal to me; I know what ice cream whose first ingredient is cream is going to taste like. I am not new to this hobby, and I know the hows and whys, and can make your amp sound the way you want by changing electronic parts, so you might say I am a professional at knowing what I like, so I can judge COMPARISONS of switched-out gear, in a given system, and decide what I like the sound of, with full confidence.

I know what tight bass sounds like, what details in music sound like, what noise sounds like, and what coloration sounds like when I can play sources I am familiar with. In some ways it is easy for me: either I am a detail freak, or I am looking for the right kind of sing-song embellishment that only SET can provide. But in all cases, given the interdependence of all parts of a system, if you just swap one thing in and out at a time, then yes, a blind test tells me whether the system sounds more or less like what I want, based on whether I am expecting a detailed presentation or a happy SET presentation.
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,432
Resurrecting this thread, as I just stumbled across a scientific article which seems to me to be pertinent to the discussion here (and for the "Can you trust your ears" and "blind test design" threads as well). It's an article about different kinds of auditory/echoic memory:
"From Sensory to Long-Term Memory: Evidence from Auditory Memory Reactivation Studies"
https://pdfs.semanticscholar.org/2096/25309d5e01183db129fa3a7151945cdf3e8f.pdf

The present "rational consensus" seems to be that
a) the echoic memory is very short, and A/B or ABX-comparisons therefore have to use short excerpts and short time intervals
b) test tones/signals are better suited for finding differences than musical excerpts, since the brain doesn't get fooled by getting drawn into the music

While this is probably valid for many cases, this article nevertheless seems interesting. The claim in the article is that there also exists a long-term auditory memory. They base this on a very intuitive and self-evident fact: That we are able to recognize things like specific voices of people we know, even after a long time. The article is rather technical and a heavy read, but my take-away is that it is easier to remember tones and sounds for a longer time if we can put them into categories/systems/regularities. When we can put sounds in a specific context, we remember them more easily.

What's the relevance for blind testing? I think that "analytical" blind testing might take away some of our ability to put sounds within a larger context, and thus make it more difficult to identify differences. I assume that this can be overcome by training on the specific task at hand, but I still think that it can easily mask differences for untrained test subjects. It also seems to me that ABX tests are way too difficult for the brain to handle, given the limitations of our auditory memory, and that AB tests would be better.

I also wonder what kinds of differences we might expect to find in AB comparisons. If our short-term acoustic memory is so short, it might only be able to spot what I would call "static" differences - differences in frequency response, for example. But what about differences in dynamics? Or transients? Or the time domain in general? Everything that has to do with changes in the music over time seems to me rather difficult to capture in short-term listening. Might this go some way towards explaining why Toole and others found that frequency response trumped all other differences in their tests?
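For reference, the statistics behind such comparisons are simple; a minimal sketch (trial counts are illustrative) of the exact binomial test usually applied to ABX scores:

```python
# The chance of scoring at least k correct out of n ABX trials by guessing
# alone (p = 0.5 per trial), i.e. the usual exact binomial p-value.
from scipy.stats import binom

def abx_p_value(correct, trials):
    """P(at least `correct` right answers out of `trials` under pure guessing)."""
    return binom.sf(correct - 1, trials, 0.5)

for correct, trials in [(8, 10), (12, 16), (14, 16)]:
    print(f"{correct}/{trials} correct -> p = {abx_p_value(correct, trials):.3f}")
# Roughly: 8/10 -> 0.055, 12/16 -> 0.038, 14/16 -> 0.002; chance alone becomes
# an increasingly implausible explanation as the score climbs.
```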

With respect, I think you misrepresented the conclusions of the article you linked. It shows that some sensory memory has some effect beyond 10 or 20 seconds. Perhaps it trails off over a longer period rather than being a buffer that fills up and then empties over a set time. Perhaps there is pattern recognition that lets some features of the more finely resolved sensory memory be reactivated, which would imply that some of it gets encoded in the patterns. The rest is conjecture on their part about what might be happening. In fact, it was an odd article, in that two-thirds of it is about what might not be fully ruled out instead of their test results or those of others.

One thing about patterns is that they are like maps. Maps are not the territory. They work, in fact, by pruning away excess resolution to make the patterns clear. Now, obviously we have long-term auditory memory, yet nothing in this paper indicates it rivals or is much like immediate sensory memory.

Recognition of the voices of people we know, or of other sounds, over long periods of time is interesting. Recognition of music, too: it doesn't take much fidelity to recognize different copies of a song if you are familiar with it. That isn't surprising if long-term memory is mostly about patterns and not about maximum sensory fidelity. The more interesting question is how low fidelity can go before it interferes with that. It might help us set a limit on what level of fidelity, or which parts of fidelity, are still satisfying in case you cannot spend hundreds of thousands on a stereo at home.

Now, more in the vein of what you were referring to, I can relay an interesting experience. It involved an old girlfriend from my days at university. A decade afterwards, I was working in a lab in the evenings. The other guy in the lab listened to a call-in talk show over AM radio from a city several hundred miles away. It always annoyed me, because it made no sense why he wanted to listen to that. Then all of a sudden one evening a gal calls in, and boom, she sounds like my old girlfriend. I hadn't had her in my thoughts for years. I stop, listen more closely, and realize it isn't her. The voice is pitched a good deal higher, for one thing. Yet it kept eerily sounding like her. The pattern of speech, the pattern of thought, seemed exactly like hers. During the call, whatever her issue was (I don't remember it), she gives her age, which was a year older than the gal I knew. She says she lived on a street across from a local high school, which was the high school my old girlfriend attended. My old girlfriend lived four blocks from there. Apparently the local neighborhood and high school had a speech dialect, and perhaps from it even some patterns of framing thoughts, that was peculiar to that little area and group of people. For an extreme example, think of California Valley Girls.

Never having been around anyone else from that location, I had never realized it. Her dialect wasn't even noticeable in isolation (unlike Valley Girl speak). Yet it was strong enough that I recognized it from literally about five words, heard over an old analog phone line, broadcast 800 miles over AM radio, and played over a boom box. Whatever that pattern is, it needed very little audio fidelity to be recognized. I didn't even know before then how much of my old girlfriend's framing of speech was a local fashion rather than just hers. So much so that it eerily seemed like the same girl even after I knew it wasn't.

BTW, an excellent book IMO that helps lay people like myself understand pattern matching and some of the mechanisms behind it is, of all things, about reading: Reading in the Brain: The New Science of How We Read.

How this works applies to much more than just reading. It is very interesting on how our interaction with modern technology is influenced by the pattern-matching tendencies of our brain's structure: how some structures that evolved for other purposes get repurposed, or, more likely, why we direct some interactions in ways that make them easy on our brain.

https://www.amazon.com/Reading-Brain-New-Science-Read/dp/0143118056
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I keep seeing statements that more-or-less say "Method ... may have problems, but I can't think of anything better".

This is the sort of issue that causes disasters in economics. People come up with a measure e.g. GDP or inflation, and say things like "It has problems as a measure, but it's all we've got; and it is a commonly agreed standard, so if we all use it in the same way we're on a level playing field, anyway". Governments begin targeting the measure as something to be maximised or minimised, and people base their investment decisions on the values of those measures. Before they know where they are, they have completely screwed up the system and it crashes. For example, debt-GDP-ratio influences investment decisions, but GDP is one of those measures with "problems".

If the audio world agrees that ABX testing of music excerpts is the officially-sanctioned standard despite having problems that we can't even describe or quantify, and then develops its equipment to target the best ABX listening test scores, the resultant systems will have problems that we can't even describe or quantify baked into them.

In the economics example, the best way to approach it would be through a combination of measurements, rationality and 'common sense' (I think you said something similar oivavoi), and this might be what distinguishes a good politician or finance minister from one who robotically targets a particular measure (I think we had a prime example of that approach in the UK a few years ago - which screwed up very badly).
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
@Cosmik,

<snip>

If the audio world agrees that ABX testing of music excerpts is the officially-sanctioned standard despite having problems that we can't even describe or quantify, and then develops its equipment to target the best ABX listening test scores, the resultant systems will have problems that we can't even describe or quantify baked into them.

In the economics example, the best way to approach it would be through a combination of measurements, rationality and 'common sense' (I think you said something similar oivavoi), and this might be what distinguishes a good politician or finance minister from one who robotically targets a particular measure (I think we had a prime example of that approach in the UK a few years ago - which screwed up very badly).

ABX is not a standard and it is not agreed upon in the audio world. :)
It is a test exclusively for difference, and therefore it depends on the hypothesis under test whether the ABX protocol fits. If one is interested in whether listeners prefer something, one should run a preference test (which may also serve as confirmation of a difference, as an established preference can't exist otherwise); if one is interested in the degree of acceptance, one should run tests of the hedonic kind. But every test has its problems, and the experimenter should know about these but quite often does not, and that is the real problem.

Taking into account what experience in other fields (like medicine or cognitive psychology) has already shown, we in the audio field should combine quantitative and qualitative test methods to get better results.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
@Purité Audio,

I have experienced on many occasions a 'perceived' sighted difference which disappears when compared unsighted .
<snip>
Keith

That happens all the time ;), but have you checked what the reason was? Can't you perceive something in general, or can't you perceive it under the specific conditions? That is the question in test situations, and it is not that easy to find out.

@Blumlein88,

So can you fill in more detailed info about this? How many total listeners were involved?

In total six: me, choosing one device in advance, and five other listeners.

Something I am unclear about, did each participant get the units once, and make a choice? Or did they get them multiple times? Or did only people picking the unit you picked get additional choices?

Each listener got the units only once, for whatever time span he thought he needed for the evaluation.
 
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,938
Location
Oslo, Norway
With respect, I think you misrepresented the conclusions of the article you linked. It shows that some sensory memory has some effect beyond 10 or 20 seconds.
<snip>

Thanks for reading it so thoroughly! Impressive. It might very well be that you're right. I need to read it again to see whether I agree with your criticism or not.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Could I suggest that there is one thing that is danced around, with few people nailing their colours to the mast? It is this: an audio system (at least as far as the speakers) should reproduce the recorded signal as exactly as possible. Agree or disagree? (People who give any credence to vinyl or valves clearly don't agree).

If we could agree on that, there is a way out of this impasse. We may not be very good at judging differences in ABX tests etc., and we may not be very good at judging good versus bad i.e. the unquantifiable "problems" with listening tests. Similarly, we don't know how to interpret objective measurements meaningfully (is 1% harmonic distortion of one kind more benign than 0.1% distortion of a different kind? etc.). But there is one thing we are probably good at judging merely by listening: silence versus not silence.

If we agree that an audio system should simply be accurate, we therefore agree that its error can be measured. It can also be isolated. Its level can be calculated, and compared to the human threshold of hearing in the context of a typical listening room. The error difference can even be listened to in isolation (using an existing audio system that has errors, but in this case the errors are proportional to a very small signal). *If* the error is so low that it is perceived as silence, can we not agree that the system is transparent? (In fact better than is needed, because the error is silent even without the masking of the main signal).

I would say yes, but I get the feeling that not many of you scientists here would sign up to it. If not, why not?
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,432
Could I suggest that there is one thing that is danced around, with few people nailing their colours to the mast? It is this: an audio system (at least as far as the speakers) should reproduce the recorded signal as exactly as possible. Agree or disagree?
<snip>

I find this post curious. I would agree to what you are saying implicitly. I think few here would disagree.

There are difficulties with the silence listening tests for some purposes, but only that: difficulties. The idea is sound. Sorry, the pun was not intended.

It is essentially a null test. Certainly with electronics, one of the great frustrations is being able to do one of those, present the result with nothing left to hear, and yet have people be completely unconvinced by that result - as if, somehow, without any masking, a silent result could hide issues that become terribly obvious during normal listening while also being completely unmeasurable. One of the simpler ideas to understand should be masking; it is so easily demonstrated. If you understood that and the silent comparison, how could you insist there is an audible difference left to hear?

I once presented tests of interconnects where everything was the same except swapping the analog interconnect between ADC and DAC, then nulled those. The result was not a measured zero, but playing that difference file was fully silent on any system. Not only was that not good enough, it was not at all convincing to most audiophiles. It was then that I offered to let people simply listen (their gold standard) and see if they could reliably hear differences when they had no identifying labels - whether they could choose the good from the bad. They could not. Yet that was not convincing to the majority either. After that I lost some measure of patience.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I find this post curious. I would agree to what you are saying implicitly. I think few here would disagree.
<snip>
Agreed that a conventional null test is often not very meaningful. In my imagined version of a null test, the differencing is performed in software by calculation (e.g. is high res inherently better than CD?), or subtraction of two sampled signals, and can compensate for fixed or slowly changing 'benign' differences like slight DC or gain errors or delays.
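A minimal sketch of that software null test (filenames are hypothetical; it assumes two mono captures at the same sample rate, uses the soundfile package for I/O, and compensates only for the "benign" differences mentioned: DC offset, gain, and a whole-sample delay):

```python
import numpy as np
import soundfile as sf                     # assumed available; any WAV reader works
from scipy.signal import correlate, correlation_lags

a, fs = sf.read("capture_device_A.wav")    # hypothetical file names
b, _ = sf.read("capture_device_B.wav")

n = min(len(a), len(b))
a, b = a[:n] - np.mean(a[:n]), b[:n] - np.mean(b[:n])   # trim and remove DC

# Whole-sample alignment via cross-correlation (no subsample shifting here)
lag = correlation_lags(n, n, mode="full")[np.argmax(correlate(a, b, mode="full"))]
b = np.roll(b, lag)

# Least-squares gain match, then subtract to get the error signal
gain = np.dot(a, b) / np.dot(b, b)
residual = a - gain * b

rms = lambda s: np.sqrt(np.mean(s ** 2))
print(f"residual sits {20 * np.log10(rms(residual) / rms(a)):.1f} dB below the reference")
sf.write("residual.wav", residual, fs)     # audition the isolated error by itself
```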
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
<snip>
I once presented tests of interconnects where everything was the same except swapping analog interconnect between ADC and DAC. Then nulled those. The result was not a measured zero, but playing that difference file was fully silent on any system. Not only was that not good enough, but it was not at all convincing to most audiophiles. <snip>

Could you give some more details about the setup and the nulling procedure?
 
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,938
Location
Oslo, Norway
Could I suggest that there is one thing that is danced around, with few people nailing their colours to the mast? It is this: an audio system (at least as far as the speakers) should reproduce the recorded signal as exactly as possible. Agree or disagree?
<snip>
*If* the error is so low that it is perceived as silence, can we not agree that the system is transparent?
<snip>

Cool idea. But is it in principle possible to do this for things like THD+N? Or jitter, or whatever?
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
Cool idea. But is it in principle possible to do this for things like THD+N? Or jitter, or whatever?
It tries to cover everything at once, as the basic idea relies on capturing the analog output of both DUTs and subtracting one from the other; if both are the same, the result must be a null.
It goes back to the 1960s/1970s (AFAIR) - see for example our short discussion about the famous Carver challenge - and back then people realized that a really deep null was hard to achieve, as even slight phase differences between the DUTs spoiled the party.

A modern implementation is/was libinst's Audio DiffMaker, which tried to do it in the digital domain (it used subsample shifting and other tricks to get a really deep null when nearly identical signals were examined), and it needed surprising amounts of RAM during the calculation, especially for longer samples.

The idea is appealing in principle because it enables comparisons in real-life settings, but in practice it presents a lot of problems in avoiding the introduction of new independent variables.
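A minimal sketch of why the subsample shifting matters (synthetic signals only; this is not DiffMaker's actual algorithm, just the general idea of compensating a fractional-sample delay before subtracting):

```python
import numpy as np

fs, n = 48_000, 1 << 15
rng = np.random.default_rng(1)
x = rng.standard_normal(n)
X = np.fft.rfft(x)
X[-n // 8:] = 0.0                               # band-limit the test signal
x = np.fft.irfft(X, n=n)                        # stand-in programme material

def frac_delay(sig, samples):
    """Delay by a (possibly fractional) number of samples via an FFT phase shift."""
    f = np.fft.rfftfreq(len(sig))               # frequency in cycles/sample
    return np.fft.irfft(np.fft.rfft(sig) * np.exp(-2j * np.pi * f * samples), n=len(sig))

def null_depth(ref, test):
    resid = ref - test
    return 20 * np.log10(np.sqrt(np.mean(resid ** 2)) / np.sqrt(np.mean(ref ** 2)))

y = frac_delay(x, 0.37)                         # "DUT" output: a 0.37-sample delay only

print("raw subtraction:       ", round(null_depth(x, y), 1), "dB")
best = min(np.arange(0.0, 1.0, 0.01),           # scan candidate fractional delays
           key=lambda t: null_depth(x, frac_delay(y, -t)))
print("after subsample shift: ", round(null_depth(x, frac_delay(y, -best)), 1), "dB")
# Raw subtraction leaves a residual only a few dB down; compensating the
# fractional delay drops it to (here) the limits of floating-point precision.
```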
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Cool idea. But is it in principle possible to do this for things like THD+N? Or jitter, or whatever?
Those measurements already come 'packaged up' for us, and enable us to simply calculate the level at which they would be coming out of the speakers. If the level is 80 or 100 dB down from the main signal, and we know how the measurement/calculation has been made, we can simply predict whether it would register as silence at normal listening levels, and even synthesise it and listen to it.
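To make the arithmetic concrete, a minimal sketch of that calculation (the playback level and room-noise figures are assumptions for illustration, not values from the thread):

```python
# Absolute level of an artifact specified X dB below the programme peaks,
# compared against a typical room noise floor.
def artifact_spl(playback_peak_spl_db, artifact_rel_db):
    """SPL of an artifact given in dB relative to the peaks (artifact_rel_db is negative)."""
    return playback_peak_spl_db + artifact_rel_db

playback_peak = 105.0        # dB SPL, loud peaks at the listening seat (assumption)
room_noise = 35.0            # dB SPL, a quiet domestic room (assumption)

for rel in (-60.0, -80.0, -100.0):
    level = artifact_spl(playback_peak, rel)
    verdict = "above" if level > room_noise else "below"
    print(f"{rel:6.0f} dB artifact -> {level:5.1f} dB SPL ({verdict} the room's noise floor)")
# A -100 dB artifact lands around 5 dB SPL here: below the room noise even
# before any masking by the programme itself is considered.
```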

This is the absurdity of most audiophile scientific tests as far as I can tell: people know that those levels represent 'silence' in isolation, but they still want to test them using ABX against music, because they're not quite prepared to believe that they don't suddenly become audible against the main signal - or something like that. But this is never discussed, because the ABX listening test is regarded as the ultimate mop-it-all-up-in-one-go test. It is a separate discussion that says "Of course ABX tests may have unquantifiable problems". And another discussion that talks about masking - the same people talking sagely about masking while demonstrating, by their ABX tests, that they don't really believe a -100 dB sound is inaudible against a signal.

The null test would call their bluff: it would simply isolate the actual error using the music of their choice and demonstrate that it sounded like silence - assuming the system really was that accurate. They could continue to argue that it was a problem, but most of us could move on to more interesting aspects of audio...:)
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
There seem to be some wrong premises:
-) people don't "know that those levels represent" "silence in isolation"; rather, there are people who assert that this must be so, without delivering proof
-) most people don't want to run tests in "ABX style"; looking into published research papers confirms that other test protocols are widely used

ABX became popular where samples were already available in digital format, as ABX-like software was the first freely available tool for such tests.
The "100 dB" assertion looks reasonable at first, but as the dB is a relative measure, we have to give a reference point. Which should that be? Digital full scale (FS), or a voltage or power level?
Btw, we already all know a system in which "something" down at ~ -92 dBFS _does_ matter... :)
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
There seem to be some wrong premises:
-) people don't "know that those levels represent" "silence in isolation"; rather, there are people who assert that this must be so, without delivering proof
-) most people don't want to run tests in "ABX style"; looking into published research papers confirms that other test protocols are widely used

I am using ABX as shorthand for "Comparing the sounds of different audio systems by listening, using music as the test signal, imitating* the style and methodology of scientific experiments". ABX, AB, ABCDE, XYZ....

The "100 dB" assertion looks reasonable at first, but as the "dB" is a relative we have to give a reference point. Which should that be? Digital Fullscale (FS) or a voltage or power level?
As I said, the isolated error (difference) is played at the same absolute volume as it would be when listening to it plus the main signal.

Btw, we already all know a system in which "something" down at ~ -92 dBFS _does_ matter...
Good point, but I allowed for that when I said "...and we know how the measurement/calculation has been made". In other words, we would not simply assume it was uniform noise down at that level, but would have to synthesise (in this case) the actual error signal. You illustrate a point for me: in the case of undithered digital quantisation error, this is a theoretical, mathematical error that we don't even have to measure; we just have to calculate and synthesise it - and reproduce it with a system that has much better than 16 bits resolution of course. Which we already have developed through theory, design and engineering - not through the use of listening tests!
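A minimal sketch of that calculate-and-synthesise step (numpy only; the signal level and dither details are illustrative choices, not a prescription from the thread):

```python
# The exact quantisation error of a low-level 1.1 kHz sine at 16 bits, with and
# without TPDF dither. The error arrays are what one would replay through a
# higher-resolution system to audition the error in isolation.
import numpy as np

fs = 48_000
t = np.arange(fs) / fs
q = 1.0 / 32768.0                                  # one 16-bit LSB (full scale = 1.0)
x = 1.5 * q * np.sin(2 * np.pi * 1100 * t)         # a tone only a few LSBs tall

def quantise(sig, dither=False):
    rng = np.random.default_rng(0)
    d = (rng.random(len(sig)) - rng.random(len(sig))) * q if dither else 0.0  # TPDF
    return np.round((sig + d) / q) * q

for label, y in (("undithered", quantise(x)), ("TPDF dithered", quantise(x, True))):
    err = y - x
    spec = np.abs(np.fft.rfft(err * np.hanning(len(err))))
    peak_hz = np.argmax(spec) * fs / len(err)
    print(f"{label:14s} error RMS {20*np.log10(np.sqrt(np.mean(err**2))):6.1f} dBFS,"
          f" strongest error component near {peak_hz:6.0f} Hz")
# Undithered, the strongest error components typically land on the tone and its
# harmonics (audible as distortion); dithered, the error spreads into a
# low-level noise floor.
```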


*Not me being unpleasant and facetious: I sincerely don't believe it is any better than pseudoscience.:)
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,696
Likes
37,432
I don't think we need any calculations. I think conventional measures are sufficient.

I propose the following as signifying an audibly transparent set of conditions.

Frequency response +0/-0.1 dB to 20 kHz.
THD of -80 dBFS or better at all audible frequencies.
SNR of -100 dBFS or better.
IMD of both the 19+20 kHz and 60 Hz + 7 kHz variety of -100 dBFS or better.
IMD of ultrasonic signals to -100 dBFS or better in the 20 kHz band.
No digital aliasing above -100 dBFS below 20 kHz.
These at the input terminals of the loudspeaker.
The system must be properly gain-staged for the gear in use.

From academic research into hearing, which includes blind listening tests of hearing abilities, and from an understanding of the structure and functioning of the hearing mechanism, in my opinion it would be safe to raise the above limits by a factor of 10 (20 dB) and still have audible transparency. Even if that were very close to audible, the level of audibility at that point would be trivial at worst.

A calculation that would plug in the specs for each individual piece of gear and tell you if the chain of components met these conditions might be handy. Usually it isn't too hard to figure that out for most conditions.
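A minimal sketch of such a calculation (the component figures are made-up placeholders; combining stages as uncorrelated powers is a simplifying assumption, not a guaranteed worst case):

```python
import math

def combine_db(levels_db):
    """Power-sum several levels given in dB below reference (negative numbers)."""
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_db))

# Hypothetical chain: DAC -> preamp -> power amp (figures at the speaker terminals)
chain = {
    "dac":       {"thd_db": -104, "noise_db": -115},
    "preamp":    {"thd_db": -100, "noise_db": -110},
    "power_amp": {"thd_db":  -92, "noise_db": -105},
}

thresholds = {"thd_db": -80, "noise_db": -100}   # taken from the list above

for metric, limit in thresholds.items():
    total = combine_db([specs[metric] for specs in chain.values()])
    status = "OK" if total <= limit else "FAILS"
    print(f"{metric:9s}: chain total {total:6.1f} dB vs limit {limit} dB -> {status}")
```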

Now, after all this, or well-done null tests, or even blind listening tests beyond reproach, you will find yourself regarded like a climate scientist. The majority of audiophiles will ignore you or even hold you in disdain. Being correct will not affect things among the current group of audiophiles. JGH opened Pandora's audiophile box, and regretted it in the latter part of his life. I don't know how we get a lid back on that thing at this point. The only bright spot is that the entire crazy audiophile world is such a small niche that it probably has the same effect on the music scene in general as the Druids have in religious circles of society.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,160
Location
Riverview FL
Out of curiosity:

Microphone pickup of a 16/48, 1-bit, 1.1 kHz "sine wave" (undithered) with third harmonic (I suppose because that "sine" looks pretty square to me) - JBL LSR 308 - preamp wide open (volume display value 151; for watching TV the value right now is 041).

Clearly, though softly, audible.

[Attached image: upload_2017-6-12_17-57-24.png]
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I think conventional measures are sufficient.
I think I see one of the circles that audiophile discussions go round in forming again. :)

In previous comments, people have said that they think there are strange interactions between amplifiers and real speaker loads. Conventional measurements do not guarantee to stimulate those, hence the need - I think - to prove the point with a software-implemented (i.e. practical) null test using real music.

Also, could a deliberately terrible-sounding system still look good in your measurements? I think so. Suppose we inserted a compressor in the system that reduced dynamic range. It wouldn't have much effect on any measurements based on a steady-state waveform (i.e. all of them?), but it would sound terrible with real music.

Ditto for some radical phase shifting that destroys the coherence of transient edges.
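A minimal sketch of that phase point (a synthetic click train; an illustration of the general idea, not a measurement procedure from the thread):

```python
# Scrambling the phase of a click train leaves the magnitude spectrum (what
# steady-state style measurements look at) untouched, yet flattens the transients.
import numpy as np

fs, n = 48_000, 1 << 14
x = np.zeros(n)
x[::4800] = 1.0                                   # a sparse train of clicks

spec = np.fft.rfft(x)
rng = np.random.default_rng(0)
scrambled = np.abs(spec) * np.exp(1j * rng.uniform(0, 2 * np.pi, len(spec)))
scrambled[0] = np.abs(spec[0])                    # keep DC and Nyquist bins real
scrambled[-1] = np.abs(spec[-1])
y = np.fft.irfft(scrambled, n=n)

def crest_db(sig):
    return 20 * np.log10(np.max(np.abs(sig)) / np.sqrt(np.mean(sig ** 2)))

print("magnitude spectra identical:", np.allclose(np.abs(np.fft.rfft(y)), np.abs(spec)))
print(f"crest factor: original {crest_db(x):.1f} dB, phase-scrambled {crest_db(y):.1f} dB")
```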

As long as it is possible to imagine a way around the tests, audiophiles will not accept their results.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
I am using ABX as shorthand for "Comparing the sounds of different audio systems by listening, using music as the test signal, imitating* the style and methodology of scientific experiments". ABX, AB, ABCDE, XYZ....

Ah, I see, but wouldn't it be better to use "listening test" or "LT" instead, to avoid any misunderstanding?

<snip>You illustrate a point for me: in the case of undithered digital quantisation error, this is a theoretical, mathematical error that we don't even have to measure; we just have to calculate and synthesise it - and reproduce it with a system that has much better than 16 bits resolution of course. Which we already have developed through theory, design and engineering - not through the use of listening tests!

It runs in circles on the development side too, as something like dither was already known before the introduction of the CD, but somehow it got lost.
I have to slightly disagree: the case of subtractive dither was solved purely in theory, but the nonsubtractive dither approach relied on psychoacoustics (meaning it involved listening tests) as well, because the error signal can't be rendered independent of the input signal.
(see Wannamaker ~1992, I'll cite it later)


*Not me being unpleasant and facetious: I sincerely don't believe it is any better than pseudoscience.:)

Never mind. But you could IMO emphasize a bit more that you're describing a hypothesis, although you present it (at least partly) as fact. Up to now, no solution exists to capture the differences and to confirm that the null is ~100 dB down, whatever the reference.
 