
Ideas for more meaningful speaker measurements

Rick Sykora

Major Contributor
Forum Donor
Joined
Jan 14, 2020
Messages
2,314
Likes
4,230
Location
Stow, Ohio USA
Placeholder for a new thread to discuss ideas on how to make speaker measurements more meaningful.

Today's reviews and scoring are based on past approaches and have prompted a lot of review-thread discussion. Much of it shows a need to do something better. This thread is meant to consolidate those discussions and (hopefully) drive out some concrete improvements.

Here is one small example (my own) from the Sointuva review thread...

Some of the problem starts with how the data is presented, IMO. For instance, the usual 50 dB range on a simple frequency response plot leads many to think the response is flatter than it really is. Here is on-axis Klippel data from a highly regarded active monitor (it looks quite flat at the standard range), shown on a 4 dB scale:
1641734702322.png



Does not look very flat now, does it? It is harder to eyeball a trend line too. I can tell you that if you fit a trend line to this data, it is neither flat nor zero-slope.

Some might argue that the above is raw data and too harsh a presentation. So, here is the same data with 1/3-octave smoothing:

1641735050902.png



Wonder how many would flock to buy this speaker now?

Not so pleasing to the eye any longer, but a much more revealing presentation than the usual scaling. If we start by presenting the data closer to its true complexity, the technical consumer is forced to think more critically (or may realize that judging the speaker is not as simple as it was made to appear).
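To make the two presentation choices concrete, here is a small Python sketch using made-up data (the level, tilt, and ripple values are invented for illustration, not taken from the measurement above). It fits a trend line in log-frequency and applies naive fractional-octave smoothing:

```python
# Sketch with made-up data: fit a trend line in log-frequency and apply
# simple fractional-octave smoothing, the two presentation choices above.
import numpy as np

rng = np.random.default_rng(0)
freq = np.logspace(np.log10(20), np.log10(20000), 200)   # Hz
# Synthetic "flat-looking" response: gentle downward tilt plus ripple.
spl = 85 - 0.4 * np.log2(freq / 1000) + 0.5 * rng.standard_normal(freq.size)

# Trend line fitted in log2(frequency): slope comes out in dB/octave.
slope, intercept = np.polyfit(np.log2(freq / 1000), spl, 1)
print(f"trend slope: {slope:.2f} dB/octave")             # not zero

detrended = spl - (slope * np.log2(freq / 1000) + intercept)
print(f"ripple before smoothing (p-p): {np.ptp(detrended):.1f} dB")

def smooth_fractional_octave(f, y, frac=3):
    """Naive 1/N-octave smoothing: average all points within +/- 1/(2N)
    octave of each frequency. Real tools use better windows."""
    half = 1 / (2 * frac)
    out = np.empty_like(y)
    for i, f0 in enumerate(f):
        mask = np.abs(np.log2(f / f0)) <= half
        out[i] = y[mask].mean()
    return out

smoothed = smooth_fractional_octave(freq, detrended)
print(f"ripple after 1/3-octave smoothing (p-p): {np.ptp(smoothed):.1f} dB")
```

The point survives the sketch: a response that looks ruler-flat on a 50 dB axis still has a measurable slope and several dB of ripple, and smoothing hides part of that ripple.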

EDIT: Should note that any improvements may need to be vetted for impact on the legacy speaker dataset. :oops:
 


Rick Sykora

Major Contributor
Forum Donor
Joined
Jan 14, 2020
Messages
2,314
Likes
4,230
Location
Stow, Ohio USA
With regards to your reply here https://www.audiosciencereview.com/...r-the-march-audio-sointuva.29677/post-1042503

Not sure if this would help. Would we then not fret even more over imperfections that are insignificant in the grand scale of things?
Perhaps, but I have always been one to try to go to the problem before the solution. :)

As you pointed out, it is a multi-faceted problem and mine is one simple example. Thanks for including the pointer back to the Sointuva review thread though!

From ASR speaker reviews, I see Amir (and Erin) struggle with measurement content, and we have at least 3 folks analyzing the data and posting their analyses. Some of it is redundant, but I also see some potential improvements that are buried in the noise of information overload.

As you know, many review threads (like the Sointuva one) can get consumed in discussion over one aspect of the design. If we identify what is important and improve the focus, I hope we can avoid more of the minutiae dives.

May be wishful thinking, but willing to give it a try!
 

sigbergaudio

Major Contributor
Manufacturer
Forum Donor
Joined
Aug 21, 2020
Messages
1,214
Likes
2,128
Location
Norway
One possible solution is to change the way things are presented, which may be a good idea. Another (not mutually exclusive) is to write and explain a bit more about what is important and what is less important.

And perhaps some real life examples (I guess both Erin and Amir have already been doing some of this in their videos) of what things actually mean, and the significance (or not) of different things.

There are also things that look different between speakers without necessarily being a problem in either. So it's probably hard to make this completely transparent and understandable for everyone. It's not like anybody can just look at a directivity plot and go "Aha, I understand exactly what I'm seeing". :)
 

sigbergaudio

Major Contributor
Manufacturer
Forum Donor
Joined
Aug 21, 2020
Messages
1,214
Likes
2,128
Location
Norway
As discussed briefly in private with Amir with regards to review of the speakers we build: They are designed to play with subwoofers, which means in isolation they have poor bass extension (by design). This will hurt their preference score. There's a perfectly valid explanation for this, but not necessarily immediately obvious for everyone. So there are a number of reasons things can look "off" while not necessarily being a problem. But also not necessarily easy to explain in all cases.
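The bass-extension penalty can be made concrete with the commonly cited Olive (2004) preference-rating model. A sketch, with the published coefficients as I understand them; the NBD and smoothness input values below are invented purely for illustration:

```python
import math

def olive_score(nbd_on, nbd_pir, lfx_hz, sm_pir):
    """Olive (2004) preference rating. lfx_hz is the low-frequency
    extension (-6 dB point) in Hz; the LFX term is its base-10 log."""
    return (12.69 - 2.49 * nbd_on - 2.99 * nbd_pir
            - 4.31 * math.log10(lfx_hz) + 2.32 * sm_pir)

# Two otherwise-identical hypothetical speakers: one extends to 40 Hz,
# the other is a sub-friendly design rolled off at 80 Hz by design.
full_range = olive_score(0.3, 0.3, 40, 0.8)
sat_only   = olive_score(0.3, 0.3, 80, 0.8)
print(f"extension to 40 Hz: {full_range:.1f}")
print(f"rolled off at 80 Hz: {sat_only:.1f}")   # ~1.3 points lower
```

Halving the extension frequency is worth 4.31 x log10(2), roughly 1.3 points, regardless of how the speaker behaves above the roll-off, which is exactly why a deliberately sub-dependent design scores worse in isolation.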
 

Jim Taylor

Major Contributor
Joined
Oct 22, 2020
Messages
1,182
Likes
2,526
One possible solution is to change the way things are presented, which may be a good idea. Another (not mutually exclusive) is to write and explain a bit more about what is important and what is less important.

And perhaps some real life examples (I guess both Erin and Amir have already been doing some of this in their videos) of what things actually mean, and the significance (or not) of different things.

There are also things that look different between speakers without necessarily being a problem in either. So it's probably hard to make this completely transparent and understandable for everyone. It's not like anybody can just look at a directivity plot and go "Aha, I understand exactly what I'm seeing". :)

Exactly. All the improvements in the world won't make a difference if people don't know how to interpret the data. What's the old phrase? "A little education goes a long way." IMHO, we need more education, not more tests. Jim
 

Digby

Addicted to Fun and Learning
Joined
Mar 12, 2021
Messages
837
Likes
644
I'll give my own layman input regarding preference scores. Two very different speakers can get a similar preference score; take a relatively small Genelec 8-series speaker and a rather larger JBL 4349, for example.

These speakers will, likely as not, sound rather different to each other and probably an individual will appreciate one rather more than the other (just through the difference in presentation), yet they can have very similar preference scores.

How to account for this?
 

sarumbear

Major Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
4,678
Likes
4,211
Location
Southampton, England
There are also things that look different between speakers without necessarily being a problem in either. So it's probably hard to make this completely transparent and understandable for everyone. It's not like anybody can just look at a directivity plot and go "Aha, I understand exactly what I'm seeing". :)
The problem I see is very simple but I also think there’s no solution.

Various measurements are done using good equipment and good engineering practices. Measurement is an engineering process. Engineers in a discipline related to audio will have no problem understanding them.

On the other hand, we have enthusiasts who cannot fully interpret or understand those measurements. How can you explain to them something they don't know? You can't teach engineering subjects in a few posts, can you?

The method on ASR is to trust the man doing the testing, which occasionally causes concern, but works well enough to keep ASR a growing forum.
 

tomtoo

Major Contributor
Joined
Nov 20, 2019
Messages
3,110
Likes
3,770
Location
Germany
With regards to your reply here https://www.audiosciencereview.com/...r-the-march-audio-sointuva.29677/post-1042503

Not sure if this would help. Would we then not fret even more over imperfections that are insignificant in the grand scale of things?

The problem is, how do you explain measurements and how they impact what we hear, when most people don't really know what +4 dB at Q=3 at 15 kHz sounds like? Yes, +4 dB at Q=2 at 4 kHz would sound like sh** (you know). Some enjoy talking about the influence of 0.5 dB in a small frequency band, which is completely insane. Understanding measurements is not hard, but grasping how they really influence what we hear is much harder.
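One way to learn what such a deviation sounds like is to build it as an EQ filter and listen through it. A sketch using the standard RBJ audio-EQ-cookbook peaking filter, set to the "+4 dB, Q=2 at 4 kHz" case from this post (the 48 kHz sample rate is just an assumption):

```python
import numpy as np

def peaking_biquad(fs, f0, gain_db, q):
    """RBJ audio-EQ-cookbook peaking-EQ biquad; returns (b, a), a0 = 1."""
    A = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * A, -2 * np.cos(w0), 1 - alpha * A])
    a = np.array([1 + alpha / A, -2 * np.cos(w0), 1 - alpha / A])
    return b / a[0], a / a[0]

def mag_db(b, a, f, fs):
    """Filter magnitude in dB at frequency f, evaluated on the unit circle."""
    z = np.exp(2j * np.pi * f / fs)
    return 20 * np.log10(abs(np.polyval(b, z) / np.polyval(a, z)))

fs = 48000
b, a = peaking_biquad(fs, 4000, 4.0, 2.0)   # the "+4 dB, Q=2 at 4 kHz" case
print(f"gain at 4 kHz: {mag_db(b, a, 4000, fs):.1f} dB")   # 4.0
print(f"gain at 100 Hz: {mag_db(b, a, 100, fs):.2f} dB")   # ~0, band is narrow
```

Run a music signal through these coefficients (any biquad/IIR filter in your player or DAW will take them) and the abstract "+4 dB at Q=2" becomes something you can actually A/B.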
 

sq225917

Major Contributor
Joined
Jun 23, 2019
Messages
1,046
Likes
1,151
It'd be good to have some sort of room-size rating included. You're not going to use horns in a shoebox or an LS3/5A in a ballroom.
 

Jim Taylor

Major Contributor
Joined
Oct 22, 2020
Messages
1,182
Likes
2,526
On the other hand, we have enthusiasts who cannot fully interpret or understand those measurements. How can you explain to them something they don't know? You can't teach engineering subjects in a few posts, can you?

Yes, that's a huge problem logistically. The person who solves it will be the new Number One on my Hero List. :) Jim
 

Inner Space

Major Contributor
Forum Donor
Joined
May 18, 2020
Messages
1,117
Likes
2,403
IMHO, we need more education, not more tests. Jim
Education is always a wonderful thing, but in this arena it leads, IMO, to an immediate requirement for two more tests, given almost everyone's use case: 1) closeness of sample-to-sample matching; and 2) whether the cabinets are physically inert. That data would be very meaningful to me, and it is very hard to find at the moment.
 

Digby

Addicted to Fun and Learning
Joined
Mar 12, 2021
Messages
837
Likes
644
On the other hand, we have enthusiasts who cannot fully interpret or understand those measurements. How can you explain to them something they don't know? You can't teach engineering subjects in a few posts, can you?
That seems to me a problem we had in a different thread.

Isn't all science essentially taken on trust by those not qualified to judge whether the science is good or bad? Scientists operating in different fields often have to take the work of those in other fields on trust, hoping that colleagues in the same field collectively correct any mistakes in their thinking; so the same trust model applies, somewhat, to scientists themselves.
 

sarumbear

Major Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
4,678
Likes
4,211
Location
Southampton, England
I'll give my own, layman input, regarding preference scores. Two very different speakers can get a similar preference score, take a relatively small Genelec 8 series speaker and a, rather larger, JBL 4349, for example.

These speakers will, likely as not, sound rather different to each other and probably an individual will appreciate one rather more than the other (just through the difference in presentation), yet they can have very similar preference scores.

How to account for this?
Many, many clever people have tried, many papers have been written, many standards agreed on, but none has produced a method, including the one used on ASR, that can fully satisfy your wish.
 

TimVG

Major Contributor
Forum Donor
Joined
Sep 16, 2019
Messages
1,175
Likes
2,499
Understand that the estimated in-room response implies an estimated measured response. Far too often it is confused with "what we hear", but that is not true. What we hear in an average residential (listening) room is the direct sound, some early reflected sounds, and not much else (paraphrasing Toole). Horizontal reflections are more important than vertical reflections, due to our binaural hearing.

Many people see a small imperfection in the power response (or in this case the predicted in-room response) and proceed to correct it to improve the score, even if that compromises the direct sound, the very first sound we hear and the one responsible for timbre.

Much can be discussed about crossover topologies, but when you have multiple non-concentric drivers, there will be a discrepancy somewhere off-axis, depending on the orientation of the drivers and the phase response of the system on the listening axis.
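It helps to remember that the predicted in-room response is just a fixed blend of spinorama curves, not a measurement of any room. A minimal sketch, assuming the commonly used CTA-2034 weights (12% listening window, 44% early reflections, 44% sound power, combined as squared pressures; the SPL values below are made up):

```python
import numpy as np

def estimated_in_room(lw_db, er_db, sp_db):
    """CTA-2034-style predicted in-room response at one frequency:
    a 12% / 44% / 44% blend of listening window, early reflections,
    and sound power, summed as squared pressures (hence the /10)."""
    p = (0.12 * 10 ** (lw_db / 10)
         + 0.44 * 10 ** (er_db / 10)
         + 0.44 * 10 ** (sp_db / 10))
    return 10 * np.log10(p)

# Hypothetical single-frequency point: LW 85 dB, ER 83 dB, SP 80 dB.
pir = estimated_in_room(85.0, 83.0, 80.0)
print(f"PIR: {pir:.1f} dB")   # 82.3 dB
```

Because 88% of the weight sits on reflected-field curves, a small off-axis wiggle can dent the PIR (and the score) even when the direct sound is clean, which is exactly the trap described above.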
 

markus

Addicted to Fun and Learning
Joined
Sep 8, 2019
Messages
619
Likes
623
Perhaps it's not all about graphs; it also needs to incorporate ratings against a reference, which is how a lot of testing is performed:

https://audiosciencereview.com/foru...ut-2-0-b6-2-speaker-review.14272/post-1041570
I wish Amir would or could do his subjective listening tests blind, comparing the DUT to at least one known reference. In my opinion his subjective descriptions are just the same as what others already deliver on social media. We don't need more of that.

He would need to have a device similar to the Harman shuffler. Toole shows a simplified version in his book which uses a rotating platform. That shouldn't be hard to implement and would allow simple comparisons. I'm sure there are enough competent people here capable of developing such a solution:

Screenshot 2022-01-09 at 16.34.00.png
 

sarumbear

Major Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
4,678
Likes
4,211
Location
Southampton, England
That seems to me a problem we had in a different thread.

Isn't all science essentially taken on trust by those not qualified to make judgements as to whether the science is good or bad? Scientists operating in different fields often have to take the work of those in the other field on trust, hoping that colleagues in the same field, collectively correct any mistakes in their thinking - so the same trust model applies, somewhat, to scientists themselves.
I think you are confusing science and engineering. I don’t think you will see engineers arguing over a measurement. They will only argue about the quality of the test.
 

markus

Addicted to Fun and Learning
Joined
Sep 8, 2019
Messages
619
Likes
623
IMHO, we need more education, not more tests.
...and a ton more research. We don't know nearly enough about how exactly recording technique, speaker directivity and room acoustics affect spatial and timbral perception.
 

sarumbear

Major Contributor
Forum Donor
Joined
Aug 15, 2020
Messages
4,678
Likes
4,211
Location
Southampton, England
Understanding measurements is not hard, but grasping how they really influence what we hear is much harder.
You measure after you understand what the results will mean, not the other way round as you suggest.
 