
Research Project: Infinity IL10 Speaker Review & Measurements

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
I think that is backwards. Backwards in the sense that the old method is no longer the best one, and a backwards approach. There is enough known and shown to design toward better objectives. Then use your approach to find missed gotchas in a design. As a first line of picking speakers, your method isn't a good one.

Who said anything about picking speakers?

I was referring to assessing performance through listening.
One should do it once one has acquired/borrowed speakers which are worth listening to because they produce good measurements.

Shortlist on measurements. And you could stop there (buy on measurements alone), or you could/should then listen for further input on performance as well as for taste/preference.
I can concede that for a DAC measured performance might be enough but not for speakers, they're too flawed and as you've rightly mentioned they put sound into a room which creates a whole new set of issues.
 

lashto

Major Contributor
Forum Donor
Joined
Mar 8, 2019
Messages
1,045
Likes
535
Every distortion profile is different as far as audibility, so I have no way of quantifying it for you that way. I can, however, show the ability to hear small impairments in double-blind tests that most people find impossible to detect.

Here are a couple of quick examples. This first one is the difference between 320 kbps MP3 and the original:

---
foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/19 19:45:33

File A: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44.wav
File B: C:\Users\Amir\Music\Arnys Filter Test\keys jangling 16 44_01.mp3

19:45:33 : Test started.
19:46:21 : 01/01 50.0%
19:46:35 : 02/02 25.0%
19:46:49 : 02/03 50.0%
19:47:03 : 03/04 31.3%
19:47:13 : 04/05 18.8%
19:47:27 : 05/06 10.9%
19:47:38 : 06/07 6.3%
19:47:46 : 07/08 3.5%
19:48:01 : 08/09 2.0%
19:48:19 : 09/10 1.1%
19:48:31 : 10/11 0.6%
19:48:45 : 11/12 0.3%
19:48:58 : 12/13 0.2%
19:49:11 : 13/14 0.1%
19:49:28 : 14/15 0.0%
19:49:52 : 15/16 0.0%
19:49:56 : Test finished.

----------
Total: 15/16 (0.0%)

--

Here is another from AIX records:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/31 15:18:41

File A: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_A2.mp3
File B: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_A2.wav

15:18:41 : Test started.
15:19:18 : 01/01 50.0%
15:19:30 : 01/02 75.0%
15:19:44 : 01/03 87.5%
15:20:35 : 02/04 68.8%
15:20:46 : 02/05 81.3%
15:21:39 : 03/06 65.6%
15:21:47 : 04/07 50.0%
15:21:54 : 04/08 63.7%
15:22:06 : 05/09 50.0%
15:22:19 : 06/10 37.7%
15:22:31 : 07/11 27.4%
15:22:44 : 08/12 19.4%
15:22:51 : 09/13 13.3%
15:22:58 : 10/14 9.0%
15:23:06 : 11/15 5.9%
15:23:14 : 12/16 3.8%
15:23:23 : 13/17 2.5%
15:23:33 : 14/18 1.5%
15:23:42 : 15/19 1.0%
15:23:54 : 16/20 0.6%
15:24:06 : 17/21 0.4%
15:24:15 : 18/22 0.2%
15:24:23 : 19/23 0.1%
15:24:34 : 20/24 0.1%
15:24:43 : 21/25 0.0%
15:24:52 : 22/26 0.0%
15:24:57 : Test finished.

----------
Total: 22/26 (0.0%)

And test of high-res audio versus downsampled:

foo_abx 1.3.4 report
foobar2000 v1.3.2
2014/07/10 18:50:44

File A: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_A2.wav
File B: C:\Users\Amir\Music\AIX AVS Test files\On_The_Street_Where_You_Live_B2.wav

18:50:44 : Test started.
18:51:25 : 00/01 100.0%
18:51:38 : 01/02 75.0%
18:51:47 : 02/03 50.0%
18:51:55 : 03/04 31.3%
18:52:05 : 04/05 18.8%
18:52:21 : 05/06 10.9%
18:52:32 : 06/07 6.3%
18:52:43 : 07/08 3.5%
18:52:59 : 08/09 2.0%
18:53:10 : 09/10 1.1%
18:53:19 : 10/11 0.6%
18:53:23 : Test finished.

----------
Total: 10/11 (0.6%)
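
A note for anyone reading the logs: the percentage column in a foo_abx report is the one-sided binomial p-value, i.e. the probability of scoring at least that many correct purely by coin-flipping. It can be reproduced in a few lines of Python:

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided binomial p-value: probability of getting at least
    `correct` hits in `trials` 50/50 guesses (what foo_abx reports)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

# Reproduce a few lines from the logs above:
print(f"{abx_p_value(15, 16):.4%}")  # 15/16 -> ~0.03%, shown rounded as 0.0%
print(f"{abx_p_value(10, 11):.1%}")  # 10/11 -> 0.6%
print(f"{abx_p_value(4, 8):.1%}")    # 4/8  -> 63.7%
```

The smaller the value, the less likely the run was lucky guessing; 15/16 corresponds to odds of roughly 1 in 3,900.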

Those are very impressive results. I tried such tests myself several times and also tried with friends. No one came even close to relevant results. But we never paid that much attention to the test tracks; we mostly just used whatever was at hand and/or most familiar to the testers.

Your test tracks seem to be quite special, and I guess they do a much better job at pointing out possible differences in high-frequency content, MP3 codec quirks, etc. Can we download those test tracks somewhere? ... feeling kinda google-lazy now :)
 

QMuse

Major Contributor
Joined
Feb 20, 2020
Messages
3,124
Likes
2,785
Assuming @amirm's listening impression is correct and the reason for it is the high third-order harmonic distortion around 1.5 kHz.

I can't see that narrow THD spike at 1.6 kHz being responsible for the "grunginess and lack of clarity to everything it played".

Such narrow THD spikes, if audible at all, don't manifest like that and certainly don't have such a general influence on "everything" being played.
 

Andreas007

Active Member
Joined
Mar 11, 2019
Messages
145
Likes
379
Location
Germany, Bavaria
Music and sound are separate items for me.

Consider:

[embedded video: harp performance]
I'm inordinately fascinated with how she can trade hands on the "drone" and never blur a beat.

Actually, changing hands is very helpful here to avoid getting exhausted. Concert harps have high string tension. You would be surprised how much force you have to put into the harp to get a good sound.
 

ctrl

Major Contributor
Forum Donor
Joined
Jan 24, 2020
Messages
1,633
Likes
6,241
Location
.de, DE, DEU
The problem is that in most cases there are other "factors". Loudspeakers are significantly flawed in comparison to electronics, which can be nearly "transparent" (accurate).
I cannot quite agree with the first part.
Nowadays, it's not difficult to make a driver selection during loudspeaker development that greatly reduces the "other factors".
Thus, the radiation of the loudspeaker comes to the fore as the decisive factor in most comparisons, provided the sound-quality potential of the loudspeakers being compared has been fully exploited.
A well-developed and well-tuned loudspeaker with poorer radiation can of course sound better than a moderately tuned loudspeaker with very good radiation.

When interpreting the CEA2034/Spinorama and the score generated from the algorithm, what is shown are trends and probabilities, not mathematically exact statements.
Therefore, the basic trend (prediction of sound quality based on the Spinorama) is not disproved by constructing counterexamples.

However, the score is only valid for Harman's specific test conditions (mono, loudspeaker almost in the middle of the room, large distance to lateral boundary surfaces), and I have my doubts whether a ranking determined this way can be transferred to music listening in the average listening room.
 

Tks

Major Contributor
Joined
Apr 1, 2019
Messages
3,221
Likes
5,497
It's a catch-22 though: if you don't publish model names, people will complain that data is being withheld. If you publish model names, people will complain about conflicts of interest because the highest-rated speaker is from Harman and the study will read like an advertisement, greatly damaging its credibility. It's damned if you do, damned if you don't.

This would all make sense if science had to concern itself with such things (the concerns of everyday people taking precedence over the endeavour of truth-seeking). Even if the highest-rated speaker is from Harman, how or why could that possibly matter if the interest is to establish a truth claim? As long as the research is done without oversight of pre-published results, I don't see why this would be a problem.

If the pre-published results are being vetted by the financier, then you have bigger problems than conflicts of interest to begin with, regardless of whether you reveal the speaker models or anything else, really.

This sort of move is a kind of reverse conflict of interest, where the conflict is an attempt at averting stated interests. By not publishing models, you give off an air of concealing one of your primary interests (namely, wanting to avoid the appearance of a study that reads like an advertisement).

The problem with not publishing the full details of a study is that it loses relevance with respect to replication of results. How would one even attempt to formulate a solid falsification of the results in the first place? "Oh look, I found a bunch of speaker models where the preference wasn't Harman-leaning; that means Olive's research is wrong!" If the conditions aren't the same, Olive/Harman can always claim "those aren't the speakers we used, so perhaps your results were different because of that".

Damned if you do, more damned if you don't - that's my take on such.
 
Last edited:

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
I cannot quite agree with the first part.
Nowadays, it's not difficult to make a driver selection during loudspeaker development that greatly reduces the "other factors".
Thus, the radiation of the loudspeaker comes to the fore as the decisive factor in most comparisons, provided the sound-quality potential of the loudspeakers being compared has been fully exploited.
A well-developed and well-tuned loudspeaker with poorer radiation can of course sound better than a moderately tuned loudspeaker with very good radiation.

You can also have two speakers with adequately flat FR and very good radiation where one sounds better than the other.
I still feel that the Spinorama leaves out important aspects of speaker performance related to those "other factors", and I am pleased that other complementary measurements are slowly becoming standard at ASR.

When interpreting the CEA2034/Spinorama and the score generated from the algorithm, what is shown are trends and probabilities, not mathematically exact statements.
Therefore, the basic trend (prediction of sound properties based on the Spinorama) is not disproved by constructing counterexamples.

However, the score is only valid for Harman's specific test conditions (mono, loudspeaker almost in the middle of the room, large distance to lateral boundary surfaces), and I have my doubts whether a ranking determined this way can be transferred to music listening in the average listening room.

Good point regarding mono listening in the middle of the room and preference assessment. I've also raised that issue a few times: no one will use speakers like that, so the preference impression bears little relation to how the speakers will be listened to in real-world conditions. Not positioning the speakers optimally in the room also skews the results.
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
The problem with not publishing the full details of a study is that it loses relevance with respect to replication of results. How would one even attempt to formulate a solid falsification of the results in the first place? "Oh look, I found a bunch of speaker models where the preference wasn't Harman-leaning; that means Olive's research is wrong!" If the conditions aren't the same, Olive/Harman can always claim "those aren't the speakers we used, so perhaps your results were different because of that".

Toole argues (based on his research and preference) that speakers should produce a particular type of Spinorama performance. Many Harman speakers aim at that standard, but so do speakers produced by several other manufacturers.
But because the Spinorama is limited in scope – it doesn't show the whole picture – we need more measurements of different aspects of performance to complement spins, and we need more effective listening tests.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,250
Likes
17,201
Location
Riverview FL
Actually, changing hands is very helpful here to avoid getting exhausted.

I thought about that as I played along on my air harp...

And then tried to just tap out the eighths and triplets as consistently as she played.

The pianist doesn't get a break on her left hand, but she throws in some ritards, unlike the steady tick-tock-tick-tock of the harp interpretation.
 
Last edited:

Tks

Major Contributor
Joined
Apr 1, 2019
Messages
3,221
Likes
5,497
Toole argues (based on his research and preference) that speakers should produce a particular type of Spinorama performance. Many Harman speakers aim at that standard, but so do speakers produced by several other manufacturers.
But because the Spinorama is limited in scope – it doesn't show the whole picture – we need more measurements of different aspects of performance to complement spins, and we need more effective listening tests.

Sure, I can sympathize with that, in the same way one doesn't look at the jitter performance of a DAC and conclude they'll enjoy the audio it spits out when all is said and done. None of that is contentious, obviously; the quality of data sets is always of paramount importance. My whole point was that anything which measures preference (still something heavily subjective, contained within the relatively unbreached confines of our minds) using scientific metrics will always raise eyebrows, as that is almost always a very strong claim.
 

riker1384

Member
Joined
Jul 19, 2019
Messages
67
Likes
97
Are the Spinorama results really all that perfect?

Are we sure that 5 kHz peak isn't significant? I would think a peak in the low to mid treble would be the worst place to have one. And there's the 600-700 Hz peak, as well as the dip in the directivity index in the mid/high treble. Are you sure you aren't just hearing those flaws? It seems premature to look for exotic explanations when there are already visible flaws in the frequency response.
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
Are the Spinorama results really all that perfect?

Are we sure that 5 kHz peak isn't significant? I would think a peak in the low to mid treble would be the worst place to have one. And there's the 600-700 Hz peak, as well as the dip in the directivity index in the mid/high treble. Are you sure you aren't just hearing those flaws? It seems premature to look for exotic explanations when there are already visible flaws in the frequency response.

I think that the 5 kHz peak is likely to be more audible than the narrow THD spike at 1.6 kHz with some material. And some people may even like it. :p
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,724
Likes
10,418
Location
North-East
In an actual mix, the distortion (could be IMD instead of THD though) is much easier to hear. All the individual tracks and effects will start to fall apart and become indistinct.

Harmonic distortion and IMD are just a different way that the same non-linearity manifests itself. THD is the effect as measured with a single tone, while IMD is with multiple tones.
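
pkane's point can be shown numerically: pass one tone and then two tones through the same nonlinearity and look at where the spurious components land. A quick sketch (the polynomial coefficients and tone levels here are arbitrary, chosen only for illustration):

```python
import numpy as np

fs = 48000                      # 1 second of audio -> exactly 1 Hz per FFT bin
t = np.arange(fs) / fs
nonlin = lambda x: x + 0.1 * x**2 + 0.05 * x**3  # toy memoryless nonlinearity

def bin_level(x, f_hz):
    """Spectral magnitude at an exact 1 Hz bin (tones are bin-centered, so no window)."""
    return np.abs(np.fft.rfft(x))[int(f_hz)] / len(x)

# Single tone (THD view): a 1 kHz tone grows harmonics at 2 kHz and 3 kHz.
single = nonlin(0.8 * np.cos(2 * np.pi * 1000 * t))

# Two tones (IMD view): the *same* nonlinearity fed 1 kHz + 1.5 kHz also
# produces sum/difference products, e.g. at 500 Hz and 2.5 kHz.
twin = nonlin(0.4 * np.cos(2 * np.pi * 1000 * t) + 0.4 * np.cos(2 * np.pi * 1500 * t))

for f in (2000, 3000):
    print(f"single tone, harmonic at {f} Hz:  {bin_level(single, f):.4f}")
for f in (500, 2500):
    print(f"two tones, IMD product at {f} Hz: {bin_level(twin, f):.4f}")
```

Same curvature in the transfer function, two different measurement views of it: THD with one tone, IMD with two.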
 

mhardy6647

Grand Contributor
Joined
Dec 12, 2019
Messages
11,414
Likes
24,779
My loudspeakers are all "used" now.
Some of mine are older than I (and I am on par with dirt, age-wise).
I don't know (with few exceptions) if, e.g., the drivers are operating at anything like their original parameters -- especially those with AlNiCo magnets, which are shock sensitive and prone to "run down" over the decades.

but I digress...

At any rate, frequency response notwithstanding, I wouldn't assume that 20 year old loudspeakers of (I presume) unknown provenance are operating "as new".
 

pozz

Слава Україні
Forum Donor
Editor
Joined
May 21, 2019
Messages
4,036
Likes
6,827
I could point you to a paper by Genelec about what they call slow listening but you'll dismiss it by saying that even though they are one of the top speaker manufacturers they have no idea what they are talking about when it comes to listening assessment...
That's presumptuous, considering how many times I've engaged you directly in the past.

Do you have sources other than the Genelec paper? If not, I can start from there.
 

Absolute

Major Contributor
Forum Donor
Joined
Feb 5, 2017
Messages
1,085
Likes
2,131
I think perspective is a key word in a discussion such as this. Remember that Toole et al didn't try to create a set of measurements to define the absolute best speaker, they tried to create a set of measurements that would reliably show us the quality of a speaker.
They succeeded with that.

You can glance at a spinorama and within seconds determine if the speaker is good or bad, but you can't glance at a spin and say that one unknown speaker can compete with other unknown speakers based on the spinorama alone.
Which is all we really need to conclude that the spinorama isn't the only factor involved. The most important, yes, but not the only.
 

youngho

Senior Member
Joined
Apr 21, 2019
Messages
487
Likes
802
If you have been reading our speaker reviews, you have no doubt seen the "Preference Scores" for speakers. This was groundbreaking research by Sean Olive, published back in 2004, with the goal of predicting listener preference using anechoic chamber speaker measurements. It seemed like an impossible task, but Sean pulled it off, going beyond people's intuition that "everyone prefers a different sound." Clearly, if we can predict preference based on measurements, then it is listener independent.
...
Conclusions
There are none as yet. I expect this to be a living research thread where we discuss what we have found here, and whether we can better rationalize speaker preference from measurements. The preference score for this speaker will be high (@MZKM will post shortly) putting me once again at odds with it. We have to figure out why before I lose all face. :)

Amir, I had posted the following quote on AVS Forum last year but hope you won't mind my reposting here. Although I was comparing the curves of the F228Be vs the Salon 2, and none of this is new, I think much of it is relevant here.

"1. The curved baffles on the Salon 2 reduce diffraction, which I wondered might at least partially explain the MUCH more similar on-axis and listening window responses for the Salon 2, as opposed to the F228Be that shows more differences between 700-800 Hz and 1-10 kHz between the two response curves.
2. Both of the Olive models use smoothness and/or flatness of the on-axis but NOT the listening window response as dominant factors. The on-axis smoothness (narrow band or not) actually outweighs the bass extension in both models, so whether the bass or the on-axis response gives "the edge" could be a matter of contention.
3. Although the F228Be initially appears smoother overall, the two significant deviations in the on-axis curves are bumps around 700-800 Hz (broader) and 5 kHz (narrower). These are of similar magnitude and width in the sound power curve, which is also a direct factor in one of the Olive models and an indirect one (through the predicted in-room response) in the other. The Salon 2 bumps are generally of less magnitude in the on-axis curves, and also of less magnitude and/or width in the sound power curves, so one tends to smooth out compared with the other. I didn't know whether the F228Be bumps could represent some sort of resonance, but these could adversely affect its audible performance until equalized.
4. Dips in the on-axis or sound power curves are less audible compared with bumps.
5. The Salon 2 does have a larger dip in the sound power curve (above 2 kHz, almost reminiscent of the BBC dip but likely due to the crossover and directivity of the drivers), but the sound power curve is a significantly smaller factor in both Olive models compared with the on-axis curve, and it is less audible as per 4.
6. Wider bumps or dips are more audible than narrow ones, but dips are less audible than bumps (#4).

TL;DR The F228Be may look smoother but may not be as smooth when and where it counts, may have flaws of commission that are more audible rather than omission, and may have less bass?"

Additionally, when I asked Kevin Voecks about some of these points, he responded "I suspect one of the differences is heard when they are played louder, where nonlinearities can occur. It is especially at moderately high levels where I hear the biggest difference. (I don't play anything dangerously loud.) The other possibility is as you stated, the almost total lack of diffraction with the Salon2s."

It's clear that some of the issues above apply when comparing the IL10 and M16, particularly when looking at the resonances and the PIR curves.

Others have already pointed out that the speaker preference models were developed using speakers of extremely varying quality, so it's reasonable to expect predictive issues when applied to a group of speakers of relatively high quality. I vaguely suspect that an updated speaker preference model would differentially weight Deviation (whether AAD or NBD) for bumps versus dips, as well as take into account the Q for bumps, plus factor in nonlinearities like distortion and perhaps effects of diffraction, which I think is reflected in the relative similarity between on-axis and listening curves.

Young-Ho
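
For readers following the model discussion above: the 2004 Olive preference score is a linear regression over four anechoic metrics. A sketch using the commonly quoted coefficients (the example input values are made up purely for illustration; treat the exact numbers as approximate):

```python
import math

def olive_preference(nbd_on: float, nbd_pir: float, lfx: float, sm_pir: float) -> float:
    """Predicted preference rating, using the coefficients commonly quoted
    from Olive (2004). Inputs:
    nbd_on : narrow-band deviation of the on-axis response (dB)
    nbd_pir: narrow-band deviation of the predicted in-room response (dB)
    lfx    : log10 of the -6 dB low-frequency extension frequency (Hz)
    sm_pir : smoothness (regression r^2) of the predicted in-room response
    """
    return 12.69 - 2.49 * nbd_on - 2.99 * nbd_pir - 4.31 * lfx + 2.32 * sm_pir

# Hypothetical speaker: modest on-axis deviation, -6 dB point at 40 Hz.
score = olive_preference(nbd_on=0.4, nbd_pir=0.3, lfx=math.log10(40), sm_pir=0.85)
print(f"predicted preference: {score:.2f}")  # roughly 5.9
```

Note how the terms trade off: extending the -6 dB point from 40 Hz to 20 Hz adds about 4.31 x log10(2), roughly 1.3 points, which is the kind of weighting question an updated model would need to revisit.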
 

ctrl

Major Contributor
Forum Donor
Joined
Jan 24, 2020
Messages
1,633
Likes
6,241
Location
.de, DE, DEU
I can't see that narrow THD spike at 1.6 kHz being responsible for the "grunginess and lack of clarity to everything it played".
Such narrow THD spikes, if audible at all, don't manifest like that and certainly don't have such a general influence on "everything" being played.
That's why I wrote:
Assuming @amirm's listening impression is correct

To what extent @amirm is able to hear distortions, I cannot judge.
But we can take a look at psychoacoustics and ask ourselves if it is possible from a scientific point of view.

Source: Psychoacoustics - Zwicker, Fastl
[attached image: masking diagram]


The diagram shows masking for a critical-band-wide noise around 1 kHz (approximately 1 kHz ± 100 Hz), so the result can still be transferred quite well to 1.5 kHz.

As blue lines I have drawn the audibility threshold of HD3 (a test tone at 3kHz) once with an 80dB and a 100dB masker.
With the 80dB masker, HD3 is theoretically audible from an attenuation of -53dB (0.2%), with the 100dB masker from about 1%.
Thus @amirm's perception may well be correct, since with 1.5% HD3 the detection threshold is far exceeded.

In comparison, HD2 (a test tone at 2kHz) with an 80dB masker would only be audible from about 1% distortion.
Due to the lack of masking, higher order harmonic distortions are therefore considered to be more "sound damaging".
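
The arithmetic behind those percentages is just a dB conversion; a quick sketch (the threshold values are the ones ctrl reads off the diagram, not independent measurements):

```python
import math

def pct_to_db(pct: float) -> float:
    """Level of a distortion component in dB relative to the fundamental."""
    return 20 * math.log10(pct / 100.0)

# ctrl's numbers: with an 80 dB masker, the HD3 audibility threshold read
# from the diagram is ~0.2%, i.e. roughly -54 dB. The measured 1.5% HD3
# sits well above that.
for pct in (0.2, 1.0, 1.5):
    print(f"{pct:>4}% -> {pct_to_db(pct):6.1f} dB")
```

1.5% works out to about -36.5 dB, roughly 17 dB above the ~-54 dB threshold, which is why the perception may well be correct even for a narrow spike.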
 

SimpleMan

Member
Joined
May 29, 2020
Messages
5
Likes
0
A factor remains that my hearing acuity for distortion/small detail is well above average due to extensive training. Harman listeners did not have such skills. So it is entirely possible that what bothers me bothers hardly anyone else.

Amirm: do you listen to the speakers before or after you run them through all the tests?
 

jazzendapus

Member
Joined
Apr 25, 2019
Messages
71
Likes
150
I'm sorry, this might sound a bit disrespectful, but the whole premise of this thread is nonsense and goes hard against the "S" of ASR.
So Amir didn't like a certain speaker at a certain point in time, under certain physical and psychological circumstances, which basically means we have no idea if his impression is even related to the actual sound. So why is it even worth discussing?
Once ASR/Amir/one of the members sets up a proper blind testing facility, then we can dig into the issue: emulate distortions, tonality imbalances, and other speaker characteristics and faults, and properly compare their impact on either preference or general audibility. As it is now, this is a regression, fodder for the "I trust my ears, not measurements!!" people I hoped this forum tried to shoo away...
 
Last edited: