
Limitations of blind testing procedures

Status
Not open for further replies.

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,766
Likes
37,625
I think I see one of the circles that audiophile discussions go round in forming again. :)

In previous comments, people have said that they think there are strange interactions between amplifiers and real speaker loads. Conventional measurements do not guarantee to stimulate those, hence the need - I think - to prove the point with a software-implemented (i.e. practical) null test using real music.

Also, could a deliberately terrible-sounding system still look good in your measurements? I think so. Supposing we inserted a compressor in the system that reduced dynamic range. It wouldn't have much effect on any measurements that are based on a steady state waveform (i.e. all of them?) but it would sound terrible with real music.

Ditto for some radical phase shifting that destroys the coherence of transient edges.

As long as it is possible to imagine a way around the tests, audiophiles will not accept their results.

So you missed the part about taking measurements at the speaker terminals? Yes, a good simulation would help, but I offered a real, practical way to test. Then we see whether there really are strange interactions. And yes, often there are.

The compression will raise the IMD levels of either test, especially the second one.

The phase shifting could be detected with square waves or sawtooth waves; you would have to decide how much shift matters. What information there is indicates humans are relatively insensitive to phase above 2 kHz.
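To make that concrete, here is a minimal Python sketch (NumPy/SciPy assumed; the 500 Hz square wave and the 2 kHz allpass corner are arbitrary illustration values, not figures from any real device): a first-order allpass leaves the magnitude spectrum essentially untouched while visibly reshaping the square wave's transient edges, which is exactly what a square- or sawtooth-wave test would expose.

Code:
# Sketch: an allpass network reshapes a square wave without changing its
# magnitude spectrum. All values are illustrative placeholders.
import numpy as np
from scipy import signal

fs = 48_000                                   # sample rate, Hz
t = np.arange(fs) / fs                        # one second of samples
sq = signal.square(2 * np.pi * 500 * t)       # 500 Hz square wave

# First-order allpass with its corner near 2 kHz: H(z) = (c + z^-1)/(1 + c z^-1)
fc = 2_000
c = (np.tan(np.pi * fc / fs) - 1) / (np.tan(np.pi * fc / fs) + 1)
shifted = signal.lfilter([c, 1.0], [1.0, c], sq)

# Magnitude spectra stay essentially identical...
orig_mag = np.abs(np.fft.rfft(sq))
new_mag = np.abs(np.fft.rfft(shifted))
dev = np.max(np.abs(new_mag - orig_mag)) / np.max(orig_mag)
print("worst-case magnitude deviation:", 20 * np.log10(dev), "dB")

# ...but the time-domain edges do not: look at a few samples around a transition.
edge = np.argmax(np.abs(np.diff(sq)))
print("original :", np.round(sq[edge - 2:edge + 4], 3))
print("allpassed:", np.round(shifted[edge - 2:edge + 4], 3))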

If you think audiophiles are going to accept your modelling or the test results I think you are being naive.
 
Last edited:

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,766
Likes
37,625
snip.....


It runs in circles on the development side too: something like dither was already known before the introduction of the CD, but somehow it got lost.
snip.....

Dither was in the very first Sony PCM recorders. Now, whether users of the gear understood its importance is another matter. I don't think you could say it got lost. It was some time before studio people new to the digital technology understood its need, its benefit, and when it should be applied.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
So you missed the part about taking measures at the speaker terminal?
No. But the question is whether your stimulus (some steady state waveforms) will trip the instability in the amp, or whatever. Unless we use real music, we will never know.
The compression will raise the IMD levels of either test, especially the second one.
It depends on the characteristics of the compressor e.g. the time constant, and what it does at various volume levels and frequency content.
The phase could be detected with square waves or sawtooth waves.
Indeed it could, but that wasn't mentioned in your conventional tests.
If you think audiophiles are going to accept your modelling or the test results I think you are being naive.

I would only model those audiophile obsessions that are purely mathematical e.g. high res versus CD. Everything else I would be measuring using a high res ADC, and doing the subtraction (nulling) in software - no modelling, except the removal of 'benign' errors such as DC or delays and, if we decided it, phase errors (e.g. at the top and bottom of amplifier FR). These would be a nightmare with a hardware null test, but trivial in software.

I do not expect audiophiles to accept the results, but they might confirm a few things for more rational hi-fi enthusiasts.:)

Edit: Those people who listen to vinyl and valves *know* that they are not clinical in their measured precision, but they like to believe they are exuding some sort of magic that better re-constitutes the original event from the freeze-dried recording. They would obviously not be interested in this in the slightest.
 
Last edited:

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Ah, I see, but wouldn't it be better to use "listening test" or "LT" instead, to avoid any misunderstanding?
OK.
...you could IMO emphasize a bit more that you're describing a hypothesis, although you present it (at least partly) as fact. Up to now no solution exists to capture the differences and to confirm that the null is around -100 dB-whatever.
There are two bits of established science from which we could predict whether a result is audible or not, when the error is reproduced in isolation, or against the signal.
https://en.wikipedia.org/wiki/Auditory_masking
https://en.wikipedia.org/wiki/Absolute_threshold_of_hearing

I did specify "a typical listening room", but we could specify its background noise floor characteristics further.
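To show what that prediction could look like in practice, here is a small Python sketch using Terhardt's widely quoted approximation of the threshold in quiet (the residual levels in the list are invented example numbers, not measurements): if an isolated error falls below this curve, and well below the masking raised by the main signal, it is a safe prediction that it stays inaudible in a typical listening room.

Code:
# Sketch: compare hypothetical error/residual levels against the absolute
# threshold of hearing (Terhardt's approximation). Example levels are invented.
import numpy as np

def threshold_in_quiet_db_spl(f_hz):
    """Terhardt's approximation of the hearing threshold in quiet, in dB SPL."""
    f = np.asarray(f_hz, dtype=float) / 1000.0          # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * np.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Hypothetical residual spectrum: (frequency in Hz, level in dB SPL at the seat)
residual = [(100, -5.0), (1000, -20.0), (3500, -12.0), (10000, 0.0)]

for f, level in residual:
    tq = float(threshold_in_quiet_db_spl(f))
    verdict = "below threshold in quiet" if level < tq else "potentially audible in silence"
    print(f"{f:6d} Hz: residual {level:6.1f} dB SPL vs threshold {tq:6.1f} dB SPL -> {verdict}")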
 

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,766
Likes
37,625
No. But the question is whether your stimulus (some steady state waveforms) will trip the instability in the amp, or whatever. Unless we use real music, we will never know.

It depends on the characteristics of the compressor e.g. the time constant, and what it does at various volume levels and frequency content.

Indeed it could, but that wasn't mentioned in your conventional tests.


I would only model those audiophile obsessions that are purely mathematical e.g. high res versus CD. Everything else I would be measuring using a high res ADC, and doing the subtraction (nulling) in software - no modelling, except the removal of 'benign' errors such as DC or delays and, if we decided it, phase errors (e.g. at the top and bottom of amplifier FR). These would be a nightmare with a hardware null test, but trivial in software.

I do not expect audiophiles to accept the results, but they might confirm a few things for more rational hi-fi enthusiasts.:)

Edit: Those people who listen to vinyl and valves *know* that they are not clinical in their measured precision, but they like to believe they are exuding some sort of magic that better re-constitutes the original event from the freeze-dried recording. They would obviously not be interested in this in the slightest.

I have no problem adding tests to my list if needed. My list was a starting point. Better than music for uncovering stability issues are twin high-level sweeps running concurrently in opposite directions.
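To illustrate what such a stimulus could look like, here is a minimal NumPy/SciPy sketch (duration, level and frequency span are arbitrary placeholders): two simultaneous log sweeps, one rising and one falling, summed and written to a WAV file. Anything the amplifier adds beyond those two moving tones when driving a real load points to distortion or instability.

Code:
# Sketch of a 'twin opposing sweeps' test stimulus. Parameters are placeholders.
import numpy as np
from scipy.signal import chirp
from scipy.io import wavfile

fs = 96_000                         # sample rate, Hz
dur = 10.0                          # sweep length, seconds
t = np.arange(int(fs * dur)) / fs

up = chirp(t, f0=20, f1=20_000, t1=dur, method="logarithmic")    # rising sweep
down = chirp(t, f0=20_000, f1=20, t1=dur, method="logarithmic")  # falling sweep

stimulus = 0.45 * (up + down)       # sum, with a little headroom below full scale
wavfile.write("twin_sweeps.wav", fs, stimulus.astype(np.float32))
print("wrote twin_sweeps.wav:", dur, "s at", fs, "Hz, peak =",
      round(float(np.max(np.abs(stimulus))), 3))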

Now I like null tests. Have done plenty. I would support what you have in mind. I also know it is much more touchy and constraining than developing a testing suite that isn't nulling. For instance, the time difference between one and two meters of cabling, only about 3 nanoseconds, can corrupt results.
 
Last edited:
OP
oivavoi

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,939
Location
Oslo, Norway
Thanks for reading it so thoroughly! Impressive. It might very well be that you're right. I need to read it again to see whether I agree with your criticism or not.

Ok, I read through the article again now. I'm again impressed by your ability to read articles like that critically. You're absolutely right on one point: much of the article is what you call "conjecture" - but I would also call it "formulation of a possible theory". As I see it, their claim is that much of the experimental evidence on echoic memory doesn't really align with many of our day-to-day experiences, which suggest that there exists a rather fine-tuned long-term echoic memory. They then try to develop a theory which may account for those everyday experiences, and provide some initial, limited evidence which goes some way toward supporting it. But I agree that they can't fully support their theory with enough evidence - at least not in that article.

I don't think we need any calculations. I think conventional measures are sufficient.

I propose the following as signifying an audibly transparent set of conditions.

Frequency response +0/-0.1 dB to 20 kHz.
THD of -80 dBFS or better at all audible frequencies.
SNR of -100 dBFS or better.
IMD of both the 19+20 kHz and 60 Hz + 7 kHz varieties of -100 dBFS or better.
IMD from ultrasonic signals of -100 dBFS or better in the 20 kHz band.
No digital aliasing above -100 dBFS below 20 kHz.
These at the input terminals of the loudspeaker.
The system must be properly gain staged for the gear in use.

From academic research into hearing, which includes blind listening tests of hearing abilities, and from understanding of the structure and functioning of the hearing mechanism, in my opinion it would be safe to relax the above limits by a factor of 10 (20 dB) and still have audible transparency. Even if that brought things very close to audibility, any audible effect at that point would be trivial at worst.

A calculation that would plug in the specs for each individual piece of gear and tell you if the chain of components met these conditions might be handy. Usually it isn't too hard to figure that out for most conditions.
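Something like that calculation could be as simple as the Python sketch below; the thresholds follow the proposed list, while the device figures for the example chain are invented, and the ultrasonic-IMD and aliasing items would be checked the same way.

Code:
# Sketch of the 'plug in the specs' check suggested above.
# Limits follow the proposed list; device figures are invented examples.
LIMITS = {
    "thd_db": -80.0,        # THD at all audible frequencies
    "snr_db": 100.0,        # signal-to-noise ratio
    "imd_db": -100.0,       # 19+20 kHz and 60 Hz + 7 kHz IMD
    "fr_dev_db": 0.1,       # max frequency-response deviation to 20 kHz
}

def failures(name, specs):
    """Return human-readable limit violations for one device."""
    out = []
    if specs["thd_db"] > LIMITS["thd_db"]:
        out.append(f"{name}: THD {specs['thd_db']} dB misses {LIMITS['thd_db']} dB")
    if specs["snr_db"] < LIMITS["snr_db"]:
        out.append(f"{name}: SNR {specs['snr_db']} dB misses {LIMITS['snr_db']} dB")
    if specs["imd_db"] > LIMITS["imd_db"]:
        out.append(f"{name}: IMD {specs['imd_db']} dB misses {LIMITS['imd_db']} dB")
    if specs["fr_dev_db"] > LIMITS["fr_dev_db"]:
        out.append(f"{name}: FR deviation {specs['fr_dev_db']} dB exceeds {LIMITS['fr_dev_db']} dB")
    return out

# Invented example chain: DAC -> preamp -> power amp
chain = {
    "DAC":   {"thd_db": -105, "snr_db": 118, "imd_db": -110, "fr_dev_db": 0.05},
    "Pre":   {"thd_db": -100, "snr_db": 110, "imd_db": -104, "fr_dev_db": 0.02},
    "Power": {"thd_db": -85,  "snr_db": 98,  "imd_db": -96,  "fr_dev_db": 0.08},
}

problems = [msg for name, specs in chain.items() for msg in failures(name, specs)]
print("\n".join(problems) if problems else "Chain meets the proposed transparency limits.")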

For me personally, any electronic devices with these specs would be good enough. I'm 100% sure that devices with these specs would not color the sound in any disturbing way, and would enable me to enjoy music, and that any changes from there on would be marginal at most. Right now, I'm debating with myself whether an amp that costs 250 USD is good enough, or whether there's any point in spending up to 1000 USD for used high-end amps. That gives you a clue about my own approach.

But my point here is more philosophical. I must admit that I'm increasingly starting to doubt that all the differences heard by the gear-swapping subjectivists are purely in their heads. Can they all be so fundamentally wrong? It just seems a bit... strange to me. I'm therefore searching for possible explanations as to why these differences seem to disappear under blind-test conditions. I guess it's a question of challenging my own belief system - there's no fun in agreeing with myself all the time!
 

Thomas savage

Grand Contributor
The Watchman
Forum Donor
Joined
Feb 24, 2016
Messages
10,260
Likes
16,306
Location
uk, taunton
On a different note, I had my hearing tested yesterday (only up to 8 kHz) and I'm actually better off than most 18-year-olds.

It did show how, over time, many people (not me :p) lose sensitivity at various points in the crucial everyday hearing frequency range.

Could this affect results? Maybe the selection of participants might not be ideal, further affecting the usefulness of any listening tests, blind or otherwise.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
@Thomas savage,

of course it will affect listening results. Without knowing which sort of impairment is present, it's hard to say in which direction it would push the results.
AFAIR, in Germany the estimated proportion of teenagers already facing mild to severe hearing damage was as high as ~30%.

@Cosmik,
OK.

There are two bits of established science from which we could predict whether a result is audible or not, when the error is reproduced in isolation, or against the signal.
https://en.wikipedia.org/wiki/Auditory_masking
https://en.wikipedia.org/wiki/Absolute_threshold_of_hearing

I did specify "a typical listening room", but we could specify its background noise floor characteristics further.

My remark was related to your assertion:
"This is the absurdity of most audiophile scientific tests as far as I can tell: people know that those levels represent 'silence' in isolation but they still want to test them using ABX against music because they're not quite prepared to believe that they don't suddenly become audible against the main signal - or something like that"

Especially the part about "people know that those levels represent silence", because it is AFAIK just a proposition (in most cases), as no measurement solution exists to capture the analog signal at the loudspeaker terminals of two DUTs and to confirm that the difference is indeed around -100 dB-whatever using real music.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I also know it is much more touchy and constraining than development of a testing suite that isn't nulling. For instance the time difference between one and two meters of cabling can corrupt results. A difference of only 3 nanoseconds.
Absolutely. The idea would be to decide on the allowable variables (e.g. DC, phase and delay) and to exhaustively sweep them, or, if this was too slow, more intelligently home in on the values that minimised the null output. The reconstruction filtering would also have to be mimicked in software.
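A minimal Python/NumPy sketch of what that search could look like (the capture file names, the assumption of mono files at the same sample rate, and the search ranges are all placeholders): sweep gain, DC offset and sub-sample delay, keep the combination that gives the deepest null, and report the residual in dB relative to the reference.

Code:
# Sketch of the software null: sweep the 'allowable' variables (gain, DC,
# fractional delay) and keep whichever combination minimises the residual.
import numpy as np
from scipy.io import wavfile

fs_a, ref = wavfile.read("reference.wav")              # placeholder file names,
fs_b, dut = wavfile.read("at_speaker_terminals.wav")   # assumed mono captures
assert fs_a == fs_b
n = min(len(ref), len(dut))
ref = ref[:n].astype(np.float64)
dut = dut[:n].astype(np.float64)

def residual_db(gain, dc, delay_samples):
    """Null depth in dB (relative to reference RMS) after removing 'benign' errors."""
    spectrum = np.fft.rfft(dut)
    freqs = np.fft.rfftfreq(n)                          # cycles per sample
    aligned = np.fft.irfft(spectrum * np.exp(-2j * np.pi * freqs * delay_samples), n)
    diff = ref - (gain * aligned + dc)
    return 20 * np.log10(np.sqrt(np.mean(diff ** 2)) / np.sqrt(np.mean(ref ** 2)))

# Coarse exhaustive sweep; a real tool would home in on the minimum more cleverly.
best_db, best_params = np.inf, None
for gain in np.linspace(0.95, 1.05, 21):
    for dc in np.linspace(-1e-3, 1e-3, 5):
        for delay in np.linspace(-2.0, 2.0, 41):        # +/- 2 samples, sub-sample steps
            db = residual_db(gain, dc, delay)
            if db < best_db:
                best_db, best_params = db, (gain, dc, delay)

print(f"deepest null: {best_db:.1f} dB at gain/DC/delay = {best_params}")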
 

Fitzcaraldo215

Major Contributor
Joined
Mar 4, 2016
Messages
1,440
Likes
634
But my point here is more philosophical. I must admit that I'm increasingly starting to doubt that all the differences heard by the gear-swapping subjectivists are purely in their heads. Can they all be so fundamentally wrong? It just seems a bit... strange to me. I'm therefore searching for possible explanations as to why these differences seem to disappear under blind-test conditions. I guess it's a question of challenging my own belief system - there's no fun in agreeing with myself all the time!

I am an agnostic. Nothing is perfect, even many "objective" procedures. Yes, of course, pure sighted listening is the most problematical and least trustworthy, but it is not totally unreliable. Except, who is doing the sighted listening comparisons? I accept the risk of trusting my own potentially faulty judgement on those only because I have no better procedure available to me. As for trusting others' opinions based on sighted listening, forget that. Relying completely on others' ears is much too risky for me. Friends and I often listen to each other's systems, but I do not think my opinions have ever changed their minds, nor theirs mine.

And, it also depends on what is being compared. If stereo vs. mono versions of the same music were compared on the same system, for example, just about everybody would get that right via simple sighted listening. It is as differences become smaller that sighted listening has serious problems and may be overwhelmed by listener biases or faulty technique.

On the amp question, I have done sighted listening comparisons of solid state amps in my own system on a fair number of occasions for purchase or upgrade purposes. The amps I selected for this all had decent pedigrees as far as available objective measures, which I had diligently researched prior to the listening tests.

The tests were normally done in AB fashion with music, after being level-matched by RatShack meter with pink noise. Switchover time was two to three minutes, which is far from ideal. But, I have no way of doing better with heavy speaker cables, and some amp terminals are a pain.

In different sessions with different sets of amps compared, I have come to a variety of different conclusions.
I do not conclude they all sound exactly the same. But, I also conclude that the ones most preferred in my own listening tended to sound very much alike, often essentially indistinguishable. A very few others sounded more odd, with a slight, but unusual signature I did not prefer.

The point is this was done for me and me only. I was not writing a research paper or even a published review, nor was I trying to tell the world of the rightness of my decision using only my speakers in my room with my music. My question is how else is a stupid but pragmatic guy like me supposed to proceed in these purchase situations?

If I am self-delusional in these beliefs, it is only me who is affected by it. And, if there were a better way that were actually available via dealerships or online information, I would assuredly make use of it. But, even then, I still would not do without listening myself. I cannot completely trust my own ears, but I do not trust not using them for evaluation, either.
 
Last edited:

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,766
Likes
37,625
Often someone will ask, "How many of you really buy your gear based upon specs and test results?" Well, my last several purchases have been that way, and it has worked out very nicely. I looked for the best specs in my budget, along with features I wanted or needed, and if there were multiple possible choices I made the final call on looks or neat extra features. I did also weigh the reputation of some brands versus others; if nothing else, it matters if you later wish to sell an item.

In times past I did what is common among audiophiles: listened, compared, did informal shoot-outs, and developed in my mind a map of general differences and character in the sound of each piece of gear. Eventually I would become very comfortable and solid in my choices. Over the years it became obvious that much, if not nearly all, of that was in my head and not in the sound performance of the gear. I remain susceptible to gut feelings, emotional tie-ins (you know, when I listen over at Ralph's it just seems so satisfying, so it must be Ralph's gear being special) and a myriad of other ways we let ourselves be biased without any controls. Over time, having intellectually understood what is usually going on and that the sound I hear is not the sound in the gear, it became easier to let go of that.

So choosing on spec, features and budget has become easy and relatively hassle-free, and has produced some very musically satisfying results. It took time, but I don't second-guess that approach, though at one time I thought it would be impossible not to. It is still easy, if you get one good result with Brand X, to become a fanboy and think Brand X is better than most. At least I no longer worry whether a box of electronics has tubes or discrete transistors or op-amp chips in it; if the input-versus-output behaviour is what I need, it doesn't matter. It is also clarifying and liberating in terms of what I can consider for my needs.

The one big hassle is getting third party test results. This is where one company's rep vs another comes in. Some are careful to have their gear meet spec and then some. Others are always skirting about it in some way or another. Even good companies making good gear are guilty of this at times.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
With respect I think you misrepresented conclusions of the article you linked. It shows that some sensory memory has some effect longer than 10 or 20 seconds. Perhaps it trails off over a longer period rather than being some buffer that fills up and then empties over a set time. Perhaps there is pattern recognition that lets some features of the more finely resolved sensory memory be reactivated which would imply some of it gets encoded in the patterns. The rest is conjecture on their part of what might be happening. In fact it was an odd article in that two thirds of it is about what might not be fully ruled out instead of their test results or those of others.

AFAIK there are still differences today between the various memory _models_, and none is able to cover all effects precisely. There is still a lot of research going on, and therefore this sort of article (Cowan wrote several others explaining the different memory conditions and the limitations of the models) recapitulates what has been found out and where there are still contradictory observations.

The models differ in how strictly they separate memory stages; the time span of the real sensory (i.e. echoic) memory varies from a couple of hundred milliseconds to ~5 s. The next step would be short-term storage or (in the other model) working memory covering short-term storage; again, according to the literature, the time span observed in experiments ranges from ~5 s to 15 s (or even 20 s), while it is known that rehearsal and processing can lengthen this to minutes.
And there is long-term storage. The common understanding seems to be that echoic memory is capable of holding a very detailed "image" of the sound heard (how is, IMO, not so well understood), while long-term storage, as oivavoi already pointed out, is more about categories, which I think incorporates the patterns you've mentioned. There are individual differences; some people are known to have a nearly perfect recording memory enabling astonishingly detailed recollections (known to exist for visual and auditory stimuli).

One thing about patterns is they are like maps. Maps are not the territory. They work in fact by pruning away excess resolution to make clear the patterns. Now obviously we have long term auditory memory yet nothing in this paper indicates it rivals or is much like immediate sensory memory.

Why should it rival, or have to be like, "immediate sensory memory"?

Recognition of voices of people we know or other sounds over long periods of time is interesting. Recognition of music too, it doesn't require much fidelity to hear different copies of a song if you are familiar with it. That isn't surprising if long term memory is mostly about patterns and not about maximum sensory fidelity. The more interesting part would be how low can fidelity go before it interferes with those. It might help us set a limit on how low fidelity or what parts of fidelity are still satisfying in case you cannot spend hundreds of thousands on a stereo at home.

I'd say it is not so much about maximum sensory fidelity as about the capabilities of long-term storage. I have long promoted the hypothesis that the probability of recollecting an auditory event after a longer time span is much higher if more brain areas are involved during the categorization process and the final transfer to long-term storage (how that works isn't, IMO, well understood either). As nearly everything has a pattern, patterns are obviously important, but the "no-pattern property" is something we can store as well; we are able to remember what white noise sounds like.
Long-term storage enables us to remember not only melodies but the sound of instruments as well. And I agree that patterns play an important role in the sound of instruments, but I fail to understand why that would prevent us from remembering the "fidelity of a reproduction".

<snip> Whatever that pattern is it needed little audio fidelity to be recognized. I didn't even know before then how much of my old girlfriend's framing of speech was a local fashion instead of just her. So much so it eerily seemed like the same girl even after I knew it wasn't.

Exaggerating a bit, the conclusion from your anecdote seems to be that people from your girlfriend's area would have real trouble differentiating the voices of their close ones, as the patterns are so terribly similar... :)
 
Last edited:

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
240,991
Location
Seattle Area
The present "rational consensus" seems to be that
a) the echoic memory is very short, and A/B or ABX-comparisons therefore have to use short excerpts and short time intervals
b) test tones/signals are better suited for finding differences than musical excerpts, since the brain doesn't get fooled by getting drawn into the music
I will read the article later, but I am not aware of (b) being even part of the equation, let alone consensus. ABX tests use music, and I have not heard of any that have been put forward that use test signals.

Pink noise, clicks and such are used in acoustic research, but those are not ABX-style tests. They are used to find impairment thresholds and the like; at any rate, they are usually kept in context and compared against music. Here is an example from Dr. Toole's book:

[attached image from Dr. Toole's book]


What I think you may mean, and this is actually a good thing, is to use material that is revealing of the differences we are searching for. In that sense such selection is a good thing, as it increases the chances of a positive outcome. Castanets above, for example, are used in lossy audio compression testing because the common "pre-echo" distortion is readily heard prior to each sharp transition. Such tracks are aptly called "codec killers."

Also (a) is not a condition of such testing either. As a listener, you can take as long as you want to change inputs/selections. And you can listen as much as you want. It is just that research shows making the transitions shorter helps due to the reasons you mention. This highly correlates with my experience too. But again, there is no such restriction in the test protocol itself. Most often entire songs are presented in ABX tests, not short snippets (although for copyright reasons that is often done).
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
240,991
Location
Seattle Area
While this is probably valid for many cases, this article nevertheless seems interesting. The claim in the article is that there also exists a long-term auditory memory. They base this on a very intuitive and self-evident fact: That we are able to recognize things like specific voices of people we know, even after a long time. The article is rather technical and a heavy read, but my take-away is that it is easier to remember tones and sounds for a longer time if we can put them into categories/systems/regularities. When we can put sounds in a specific context, we remember them more easily.
OK, I skimmed through the conclusion/summary section. Nothing in it indicates that long-term memory is more accurate than short-term. The paper simply says that we may be able to retain some of the short-term acoustic events/memory in our long-term memory. In no way does it say that this is all that short-term memory captured, or that it is a better basis for remembering things.

In the example you cite of a friend's voice, in a quick AB comparison I can detect how much noise may be there, the amount of sibilance, warmth/richness, detail, roughness, etc. None of that is maintained in long-term memory. Indeed, voices change objectively depending on the environment we hear them in; our brain readily and willingly strips out the environment, making them sound seemingly the same no matter where we hear them. In that sense, long-term memory absolutely filters the acoustic information captured by the ear. It has to, due to capacity limits alone, but also because we evolved to remember what is important and discard what is not.

The paper simply challenges the notion that long-term memory is all about cognitive filtering and that short-term sensory information (i.e. what would be recorded bit for bit) is lost. It says some of that information makes it through. That it does. If I listen to a system that is boomy, I will remember that in the long term. In this sense the paper challenges existing models of long-term memory, but it lends no support to the audiophile belief that long-term listening is more revealing. It simply is not.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
240,991
Location
Seattle Area
But if you want to find confirmation for any hypothesis you have to do experiments, and you have to realize that these tests (using human listeners) are social/behavioral experiments at the same time. So after a couple of years, having observed the difficulties a lot of listeners usually had/have with "unusual conditions", I tried another approach with two preamplifiers, in which the participants did not know they were part of a test:
You performed a preference test. It is easy to find preferences between identical samples, and just as easy to get a few people in a few instances to agree with each other. That data is ultimately wrong if, in a test of differences between the two samples, they cannot be distinguished.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
I will read the article later but I am not aware of (b) being even part of the equation let alone consensus. ABX tests use music and I have not heard of any that that have been put forward that use test signals.
...
Also (a) is not a condition of such testing either. As a listener, you can take as long as you want to change inputs/selections.
Just because certain things are customarily done does not mean that they are 'right' - if it is even possible to show that there is a 'right' answer. And this is the crux of the difficulty: if conscious acoustic memory is short but subconscious differences are only perceived over longer periods, or differences can only be perceived against a 'comprehensible' signal such as music yet the music creates such emotional 'noise' in the brain that it swamps the perception of tiny acoustic differences, then it may simply not be possible to use science to establish whether, say, absolute phase is important or not - and in the Rumsfeld sense, we may never know whether we know that is the case or not. Simply saying "ABX testing is the industry standard and is always carried out using music, therefore music is a valid stimulus and ABX testing works..." does not make the results meaningful.

I can actually live with this, but I get the feeling that not many people agree - science has simply *got* to be applicable to everything, even such an irrational or transcendent experience as listening to music. This leads to an entire industry of science and testing that is completely self-referencing and has no concrete meaning or value outside of itself.
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,663
Likes
240,991
Location
Seattle Area
None of this has to be theoretical or arguments about the "science." You can trivially prove the critical component of it to yourself as I have done.

I have lost track of how many times I was presented with two samples, listened to them and found them to be hopeless to distinguish when playing one and then playing the other. Then I pull them into an ABX program, select short segments and after some trial, reliably tell them apart. So for me, there is no debate or ambiguity whatsoever: short term memory is far better and the shorter the better.

You can try the same. Compress some music to 256 kbps, put it in a playlist with the original and AB between them with your eyes closed. Then pull them into an ABX program and do what I stated. I think you will find that your odds of hearing them being different markedly improves.
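To put numbers on "reliably tell them apart", here is a small, self-contained Python sketch (standard library only; the trial counts are just illustrations) that computes the probability of scoring at least that many ABX trials correct by guessing alone.

Code:
# One-sided binomial check for an ABX run: chance of doing at least this well
# by guessing (p = 0.5 per trial). Example trial counts are illustrative.
from math import comb

def abx_p_value(correct, trials):
    """P(at least `correct` right out of `trials` under pure guessing)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

for correct, trials in [(6, 10), (9, 10), (14, 16)]:
    print(f"{correct}/{trials} correct: p = {abx_p_value(correct, trials):.4f} under guessing")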

As an example of something that I have passed, you can take Ethan's generational loss test: http://ethanwiner.com/loop-back.htm

Listen to the tracks individually at first and you will find them very challenging to tell apart.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
None of this has to be theoretical or arguments about the "science." You can trivially prove the critical component of it to yourself as I have done.

I have lost track of how many times I was presented with two samples, listened to them and found them to be hopeless to distinguish when playing one and then playing the other. Then I pull them into an ABX program, select short segments and after some trial, reliably tell them apart. So for me, there is no debate or ambiguity whatsoever: short term memory is far better and the shorter the better.

You can try the same. Compress some music to 256 kbps, put it in a playlist with the original and AB between them with your eyes closed. Then pull them into an ABX program and do what I stated. I think you will find that your odds of hearing them being different markedly improves.

As an example of something that I have passed, you can take Ethan's generational loss test: http://ethanwiner.com/loop-back.htm

Listen to the tracks individually at first and you will find them very challenging to tell apart.
But again, is this not an example of a 'self-referencing' paradigm? Proving that ABX testing works on certain differences does not prove that it works on others.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
You performed a preference test. It is easy to find preferences between identical samples, and just as easy to get a few people in a few instances to agree with each other.
Sherlock, you're right on the first two. :)
It was a preference test, and of course people can (and will, as data from a lot of tests has shown) find preferences between identical samples.

On the third, it gets a bit more complicated, because "a few people in a few instances agreeing with each other" didn't take place. Each (unwitting) participant got the samples alone, and the marks were randomly assigned to the samples. And I didn't know either, when handing out the samples, which sample each participant would choose.

That data is ultimately wrong if, in a test of differences between the two samples, they cannot be distinguished.
These assertions seem to be based on incorrect premises, but even if the premises were correct, the conclusion wouldn't follow. ;)
 
Last edited: