
Master Thread: Are measurements Everything or Nothing?

For the moment I drive my Confidence C2 Signature with my Devialet. I do not know how difficult a load they are, but they need a lot of power/current to open up. My old Primare I32 did not manage that.
As far as I know, I never stated anything about different DAC sounds. I have limited experience there, as I mainly play analog records. I only said that headphone amps can, in my experience, sound and work differently.

I am no expert!! My record of building and modifying amps and TV sets goes back a long way, about 50 years or so, to my studies in electronics at Chalmers University of Technology. The first amp I built was a tube amplifier, and it was a real disaster. At that time amplifier circuits were quite simple, and it was easy to get hold of a circuit diagram, make calculations and change components. The amplifiers I studied at that time were NAD, Tandberg, Braun, Kenwood (?), Philips and Audio Pro. I must admit I have forgotten the details of the feedback changes I made to a Japanese amplifier. That was in 1975. Today everything is so much more complicated, and modifications and repairs are impossible for me. It's the same with electronics in modern cars :(
Which Devialet Expert do you have? There's a range of them. Also, because they are quite heavily software controlled, it's just possible that a firmware update may have helped with the problem that Amir and John Atkinson both noted.

Needing more power to "open up" is another of those tropes that goes the rounds, and it is too vague a description to understand what is going on. It's not unusual to hear someone say "it needed more power to open up" when the replacement amp is actually far less powerful... but there are reasons why an apparently less powerful amp delivers more power in practice, like gain.
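To put rough numbers on the gain point, here is a minimal sketch. Every figure in it (the 1 V source, the two gains, the rated powers, a purely resistive 8 ohm load) is a made-up assumption for illustration, not a measurement of any real amp.

```python
def output_power_watts(source_vrms, voltage_gain, rated_watts, load_ohms=8.0):
    """Power into a resistive load for a fixed source voltage, capped at the amp's rating."""
    v_out = source_vrms * voltage_gain       # output voltage before any clipping
    unclipped = v_out ** 2 / load_ohms       # P = V^2 / R
    return min(unclipped, rated_watts)       # the amp cannot exceed its rated power


# Two hypothetical amps fed by the same 1.0 V RMS source at the same volume setting.
big_low_gain = output_power_watts(1.0, voltage_gain=10.0, rated_watts=200.0)
small_high_gain = output_power_watts(1.0, voltage_gain=20.0, rated_watts=50.0)

print(big_low_gain)     # 12.5
print(small_high_gain)  # 50.0
```

With these assumed numbers, the "200 W" amp never gets past 12.5 W from that source, while the "50 W" amp reaches its full rating, so the smaller amp plays louder at the same volume setting.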

Just like amps and speakers, headphone amps and headphones have to match up to work. I think we've had cases where people have claimed that a pair of speakers or headphones is "more revealing" because they are harder to drive... so of course a less powerful amp is considered "inferior", when with different "less revealing" speakers or headphones that are easier to drive, the "inferior" amp sounds the same. Think about it...

I wouldn't remember much that I did in 1975 either. No worries there!
 
OK, but your argument seems completely circular without actually making a point one way or the other. Is that by design?

It’s not circular in the least.

If you followed the conversation, you will see that Axo was responding to Newman.

Newman was looking to undermine the validity of ASR members performing blind tests at home (mine of course in particular). To that end he was raising the possibility of experimenter error. And that the risk was “TOO HIGH” that the individual wouldn’t conduct a proper experiment and analysis to warrant confidence in the results.

The point both Axo and I have made is that this is overreaching: you can’t just rule out blind testing because somebody did it at home. Ultimately the validity of any blind test comes down to how it survives scrutiny. Not on who did it or where.

So: it’s quite possible to conduct a valid blind test at home. And you don’t have to rely only on your own analysis: you can take those results to somebody competent to analyze the results as well if you want.

In the case of ASR, you can present the method of the test and the results here for analysis by competent ASR members.
If it passes scrutiny, then you can be justified in raising your confidence level.

So you don’t actually need to be stuck with the level of uncertainty that Newman suggests.

And it is possible in principle for an ASR member to produce a blind test that survives scrutiny, and so some may justifiably afford it some merit and validity, regardless of the fact it was done in a private home or wherever.

There’s nothing remotely circular about that point, as against the level of uncertainty Newman was promoting.
 
Maybe I luckily lost track of what this discussion is about.

Never mind me.
 
You know that our august and generous host (and others like Erin) conduct their tests at home, yes?

Your proposed heuristic "at home" = "hubris" is ragged and self-serving. It isn't a serious proposition though, and we can certainly discount such rhetoric. Testing may be done sufficiently well, or not, regardless of whether the setting is residential. What discussions usually do here when these things come up is to go into the details of the testing and validity of consequent claims. Which may stand up to scrutiny, or not. Which is as it should be. Better that than constantly falling into a credentials fallacy.
There is a specific proposition at play here about the value of such a test. The terms of the test so often asked for are difficult for the average person to meet, worked around ("I couldn't set levels electrically so I used my phone") or just plain ignored, and of course all we have is a description - we don't get to actually see what is being done or how well it holds up.
And we can only find out about a sample of one. So it's great as a diagnostic test, but it isn't necessarily going to tell us anything that moves science forward. That requires larger scale statistically valid testing. If I test something at home, in an audiology lab, or am even tested by Newman directly with everything in place... I'm still a sample of one.

So there are limitations on what individual testing can tell us about the world. It may tell us a lot about our individual hearing and our individual setups.

The same applies also to measuring a device. Only one device has been measured. We know nothing about sample variance from testing one product, and the one you have in your living room may be different to the one tested here. It probably doesn't vary by enough to matter, but you never know. And remember that different mains standards in your country may mean different mains noise, which is why maybe a HiFi News measurement can look different to one of Amir's.

We need to remember all of these things when discussing measurements, hearing and performance. There's a lot that can make a difference and context has to be taken into account.

I suspect Newman meant something a bit different to how you are reading his comment. And I guess I may be misreading yours as well, reading through what I just wrote. Take it as a contribution rather than a "reply", please.
 
All good stuff: we should scale our confidence to the nature of any particular claim or test.

Also, if anyone feels they are being misread, I hope they will clarify in their own words what they were getting at. That’s always welcome!
 
The same applies also to measuring a device. Only one device has been measured. We know nothing about sample variance from testing one product, and the one you have in your living room may be different to the one tested here. It probably doesn't vary by enough to matter, but you never know. And remember that different mains standards in your country may mean different mains noise, which is why maybe a HiFi News measurement can look different to one of Amir's.

That leads us to not being able to fully trust anything. We cannot rely on open listening tests, we cannot rely on blind listening tests, and we cannot rely on measurements, professionally done or not, because of the risk of sample variations or errors.
So what do we do? :D
 
Trust, but verify??? (Reagan via Russian proverb)

Trust No One (Deep Throat)

The Truth is Out There. (Mulder)

The Truth is Out There, but so are LIES. (Scully)

I don't have time for your Convenient Ignorance (Scully)

Deep Throat said "trust no one." And that's hard, Scully. Suspecting everyone, everything, it wears you down. You even begin to doubt what you know is the truth. Before, I could only trust myself. Now, I can only trust you...and they've taken you away from me. (Mulder)
 
There is a specific proposition at play here about the value of such a test. The terms of the test so often asked for are difficult for the average person to meet, worked around ("I couldn't set levels electrically so I used my phone") or just plain ignored, and of course all we have is a description - we don't get to actually see what is being done or how well it holds up.
And we can only find out about a sample of one. So it's great as a diagnostic test, but it isn't necessarily going to tell us anything that moves science forward. That requires larger scale statistically valid testing. If I test something at home, in an audiology lab, or am even tested by Newman directly with everything in place... I'm still a sample of one.

So there are limitations on what individual testing can tell us about the world. It may tell us a lot about our individual hearing and our individual setups.

The same applies also to measuring a device. Only one device has been measured. We know nothing about sample variance from testing one product, and the one you have in your living room may be different to the one tested here. It probably doesn't vary by enough to matter, but you never know. And remember that different mains standards in your country may mean different mains noise, which is why maybe a HiFi News measurement can look different to one of Amir's.

We need to remember all of these things when discussing measurements, hearing and performance. There's a lot that can make a difference and context has to be taken into account.

I suspect Newman meant something a bit different to how you are reading his comment. And I guess I may be misreading yours as well, reading through what I just wrote. Take it as a contribution rather than a "reply", please.

I agree with much of this.

A properly-conducted blind AB test with me as subject tells us that I can differentiate (or not) a particular signal or device under particular circumstances. Increasing the listener sample size increases representativeness toward the general population. Single new/used devices may be representative, or not. And so on.
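As a rough illustration of the sample size point, here is a sketch using the Wilson score interval for a proportion. The pass counts (1 of 1, 7 of 10, 70 of 100 listeners) are invented, and treating listeners as a random sample of the general population is itself an assumption that real panels rarely meet.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for the share of listeners who pass."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical panels: one listener passing, then 7 of 10, then 70 of 100.
for passed, listeners in ((1, 1), (7, 10), (70, 100)):
    low, high = wilson_interval(passed, listeners)
    print(f"{passed}/{listeners}: {low:.2f} to {high:.2f}")
```

The interval for a single listener spans most of the range from 0 to 1, which is the "sample of one" problem in numbers; it narrows as the panel grows.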

But if you'll excuse the aphorism, the poster I replied to appeared to throw the baby out with the bathwater. There's no need for that. And it's a waste of perfectly good babies. Some people are hubristic idiots, for sure. But not all of us, at least not all of the time.
 

I've been enjoying a bit of X-Files lately, as it happens. I don't think I got as far as season 7 originally, it all holds up quite well.
 
I find this a bit of a spaghetti-wall proposition. Are you validating home testing or not, and if so, with what purpose?

To me it remains simple. Competent measurements in a controlled environment tell me about fundamental design goals and virtues that were met.

They tend to tell me little about what stuff will do in my home environment. With tower speakers because they'll need correction no matter what. With bookshelves because no test reveals optimally measured sub xover frequencies and such.

Personally, I've not done my own blind AB or ABX at home (Matt has, and Newman hasn't, iirc which makes their different views unsurprising). But I think it's a valid undertaking if you have the opportunity and the patience.

Thinking more generally I find taking my own measurements most useful when setting up a room and positioning loudspeakers. Those measurements correlate reasonably with the REW room simulation, or Amcoustics calculator, for example. I think room effects are somewhat predictable, but the models I can easily use are oversimplifications.

Listening to loudspeakers in other rooms, and looking at standardised measurements provides useful information I reckon. As others have said, the latter are useful as triage. For example, I can infer that I'd be disappointed with Revel F228 when I see the rolloff below 100 Hz, compared to the loudspeakers I already have. So no need to search for those.

But more esoteric questions—and this is an individual tangent—like will the beautiful JBL K2 S9500 that I came across at a local shop present soundstage like/unlike my Audio Physic Codex? I can infer not, as wide dispersion bi-radial horns will likely present differently to a more directional cone upper mid and cone tweeter pairing. Available measurements are limited. A full spin of each would help, but unlikely we'll ever see that. Which will I prefer in my room/s? For which music? Etc. And the logistics of a properly conducted blind test are obviously daunting. Can I take these 200 kg speakers home for a bit of a comparative listen, and bring them back if they don't work out? Haha. Apply that to speakers you may come across or be interested in, obviously, not the weird sh*t I like, but the principles hold. Those sorts of practical questions are what I think of when you point to uncertainties wrt results in your home environment.
 
One at least interesting exercise would be to have two different speakers on each channel. Match levels and listen. I have done this. It becomes very interesting when you can do EQ across the whole band. I had a Tact Room Correction system to use for this. It is quite surprising how matching direct FR makes things much closer than you would expect. You have to pick the lesser speaker as the one to match the other one to. But you can in a symmetrical room get such a thing to work better than likely most people believe at the centered listening position. I think it highlights how much our perception of stereo sound is due to frequency response.
 
That leads us to not being able to fully trust anything. We cannot rely on open listening tests, we cannot rely on blind listening tests, and we cannot rely on measurements, professionally done or not, because of the risk of sample variations or errors.
So what do we do? :D
In real science, you trust an experimental result that has been replicated, hopefully more than once and by different researchers.

Here, you often have to assign a degree of trust or mistrust, maybe from experience, to any result someone gives.

But to give an absolutely trivial example, do you trust Amir's amp measurements here in Australia? Do you therefore expect to see 60 Hz mains products in a device here? The measurements are still useful, because there won't be too many differences apart from that, and what is well designed for the US is likely also to be well designed for Europe or Australia, but even so there is no guarantee that a garbage PSU won't be used for a smaller market. Of course, this is about the application of a measurement - the measurement itself is going to be accurate - but it is a point that is forgotten here all too often.

We only get text reports of a lot of tests people do around here, or maybe something that looks like a cut and paste of a Foobar ABX result. We take a lot on trust, and from time to time we need to be reminded of that, and we need to remind ourselves of the purpose and value of a test.
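For anyone wondering what such an ABX score is worth on its own, a minimal sketch of the usual guessing check follows. It assumes each trial is an independent 50/50 guess; the scores of 12/16 and 9/16 are made-up examples, not anyone's actual results.

```python
from math import comb


def abx_p_value(correct, trials):
    """One-sided probability of scoring at least `correct` out of `trials` by guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count every outcome with `correct` or more hits, out of 2**trials equally likely outcomes.
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials


print(abx_p_value(12, 16))  # about 0.038
print(abx_p_value(9, 16))   # about 0.40
```

The number only says how surprising the score would be under pure guessing. It says nothing about whether levels were matched or the switching was truly blind, which is the part we take on trust.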
 
That leads us to not being able to fully trust anything. We cannot rely on open listening tests, we cannot rely on blind listening tests, and we cannot rely on measurements, professionally done or not, because of the risk of sample variations or errors.
So what do we do? :D
We rely on expert professionally conducted experimental listening tests, interpreted and analysed by expert professionals, written into high-grade papers and tested by fire among the peer group. The Science.

We don't rely on home listening of whatever construction, to test the findings of The Science, as if The Science is under review in such manner.

We don't conclude that The Science is a bit iffy, lacks nuance, or needs further work, based on our home tests not confirming The Science.

We for goodness sake don't strut about on science discussion forums, throwing smokescreens all over statements of others that accurately reflect The Science, even going chesties with the actual scientists who wrote some of The Science and who adorn the forum with their presence, because we reckon that their summary statements on the trustworthiness of sighted listening are not right, and all because our personal blind-ish and non-blind tests suggest (to ourselves at least) that we have managed to 'calibrate' our personal sighted listening to the actual sound waves. And because now we don't even do the blind tests any more because we can use our 'calibrated' sighted listening to make direct observations about the sound waves themselves. LOL I mean spot the error, and maybe roll out a few synonyms for overconfidence.

We do rely on the same process as the first paragraph to bring new expert insights, ie advances, to The Science. And if such advances are not forthcoming, we shrug our shoulders and move on to areas where advances are being made: we don't get all frustrated about it and start pumping smoke into the room. To do so would be a disservice to everyone in the room, from neophyte to luminary.

cheers
 
I mostly agree. What is different is the products available, and the need to differentiate one from its competitors. When buying medical gear, you wouldn't worry much that someone is selling you a device based on homeopathy. You can get the audio equivalent of that, however. OTOH, you can still buy homeopathic products. I know people who are way too intelligent and too educated to believe such things. But they somehow give something a try, it seems to work, and they'll literally tell doctors it works, and that they know it works because it worked for them. You don't have the chance to let people try their own blind test of homeopathic products for diseases or ailments. You can do a reasonable version of that for audio products in the home, however.

I would think all of us have experiences that feel too real to easily dismiss, even though they are irrational. We know better, and many of us will not act on them, but we feel the urge to do so. In the case of audio-phoolery, we can run our own blinded experiment to perhaps disabuse ourselves or others of hard-to-shake, gut-felt experiences that don't square with the facts. That is much more powerful than just being told the facts.

Was Ronald Fisher's Lady Tasting Tea test so complex one needs to be a scientist to run the test?
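The arithmetic behind it is short enough to sketch. This assumes the design as it is usually described: eight cups, four poured milk-first, and the taster told to pick out exactly four.

```python
from math import comb


def chance_of_at_least(hits, cups=8, milk_first=4):
    """Probability of identifying at least `hits` milk-first cups by pure guessing,
    when the taster must pick exactly `milk_first` cups out of `cups`."""
    total = comb(cups, milk_first)  # every equally likely way to pick the cups
    ways = sum(
        comb(milk_first, k) * comb(cups - milk_first, milk_first - k)
        for k in range(hits, milk_first + 1)
    )
    return ways / total


print(chance_of_at_least(4))  # 1/70, about 0.014
print(chance_of_at_least(3))  # 17/70, about 0.243
```

So a perfect score is hard to get by luck, while three out of four is not. The hard part is the pouring and randomising, not the maths.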
 
Thinking more generally I find taking my own measurements most useful when setting up a room and positioning loudspeakers. Those measurements correlate reasonably with the REW room simulation, or Amcoustics calculator, for example. I think room effects are somewhat predictable, but the models I can easily use are oversimplifications.

I trust my own measurements of my loudspeakers the most, they are repeatable and I know they are representative of my specific “samples”. :)
 
In real science, you trust an experimental result that has been replicated, hopefully more than once and by different researchers.

Here, you often have to assign a degree of trust or mistrust, maybe from experience, to any result someone gives.

But to give an absolutely trivial example, do you trust Amir's amp measurements here in Australia? Do you therefore expect to see 60 Hz mains products in a device here? The measurements are still useful, because there won't be too many differences apart from that, and what is well designed for the US is likely also to be well designed for Europe or Australia, but even so there is no guarantee that a garbage PSU won't be used for a smaller market. Of course, this is about the application of a measurement - the measurement itself is going to be accurate - but it is a point that is forgotten here all too often.

We only get text reports of a lot of tests people do around here, or maybe something that looks like a cut and paste of a Foobar ABX result. We take a lot on trust, and from time to time we need to be reminded of that, and we need to remind ourselves of the purpose and value of a test.

Replicated measurements etc. are crucial in science, and here too we can see results both very similar to and somewhat different from Amir's. This might be due to slight design differences between European and US equipment, or to something else that we do not control. See e.g.

 
We rely on expert professionally conducted experimental listening tests, interpreted and analysed by expert professionals, written into high-grade papers and tested by fire among the peer group. The Science.

We don't rely on home listening of whatever construction, to test the findings of The Science, as if The Science is under review in such manner.

We don't conclude that The Science is a bit iffy, lacks nuance, or needs further work, based on our home tests not confirming The Science.

We for goodness sake don't strut about on science discussion forums, throwing smokescreens all over statements of others that accurately reflect The Science, even going chesties with the actual scientists who wrote some of The Science and who adorn the forum with their presence, because we reckon that their summary statements on the trustworthiness of sighted listening are not right, and all because our personal blind-ish and non-blind tests suggest (to ourselves at least) that we have managed to 'calibrate' our personal sighted listening to the actual sound waves. And because now we don't even do the blind tests any more because we can use our 'calibrated' sighted listening to make direct observations about the sound waves themselves. LOL I mean spot the error, and maybe roll out a few synonyms for overconfidence.

We do rely on the same process as the first paragraph to bring new expert insights, ie advances, to The Science. And if such advances are not forthcoming, we shrug our shoulders and move on to areas where advances are being made: we don't get all frustrated about it and start pumping smoke into the room. To do so would be a disservice to everyone in the room, from neophyte to luminary.

Well, that was cringe. Of course we don't. On this forum we discuss engineering (music reproduction gear) and art (music).

*some of us hit The Shift key when typing to give our rhetoric gravitas, but the rest of us don't fall for that :)
 
Which Devialet Expert do you have? There's a range of them. Also, because they are quite heavily software controlled, it's just possible that a firmware update may have helped with the problem that Amir and John Atkinson both noted.

Needing more power to "open up" is another of those tropes that goes the rounds, and it is too vague a description to understand what is going on. It's not unusual to hear someone say "it needed more power to open up" when the replacement amp is actually far less powerful... but there are reasons why an apparently less powerful amp delivers more power in practice, like gain.

Just like amps and speakers, headphone amps and headphones have to match up to work. I think we've had cases where people have claimed that a pair of speakers or headphones is "more revealing" because they are harder to drive... so of course a less powerful amp is considered "inferior", when with different "less revealing" speakers or headphones that are easier to drive, the "inferior" amp sounds the same. Think about it...

I wouldn't remember much that I did in 1975 either. No worries there!
My Devialet is the Expert 220 Core. I still love it after many years :)
I agree with you. Power/current is maybe not the correct wording, and it is too vague. My friend's renovated Quad 303 is a very capable amplifier for driving "difficult" loads. I think it is "only" rated at 45 W.
 
... although I would never capitalize The Science.
(I mean, other than just now). :rolleyes: :cool:

;)

Besides, everybody knows that science is a verb.

 
And some of us simply try to mimic the inflection of the spoken word ... to hell with "gravitas". (@mhardy6647 and me, for instance.) I don't see anything wrong with that.

You are half right. Imagining how @Newman was trying to say all that gravity-enhanced shite was the really funny part.

But getting back to the science and the thread topic, I think we can science it. Can you post some audio of you speaking in title case, versus normally? Level-matched. I feel an ABX coming on.
 