
Is Toole and Olive's Spinorama model incomplete and limited?

Kal Rubinson

Master Contributor
Industry Insider
Forum Donor
Joined
Mar 23, 2016
Messages
5,303
Likes
9,867
Location
NYC
1. I presume the Atmos channels can be up-mixed from conventional two-channel stereo... how well does that work, in your opinion?
I have not heard enough, with sufficient information about source and process, to generalize. What I have heard from what I presume to be stereo upmixes has not impressed me. That does not include Atmos conversions from original multitrack/multichannel sources.
2. Are you planning to attend the Capital Audio Fest this November?
I have not been to an audio show in years, so I would like to but probably not this one.
 

Cars-N-Cans

Addicted to Fun and Learning
Joined
May 19, 2022
Messages
819
Likes
1,009
Location
Dirty Jerzey
In a stereo system, the phantom sound image illusion is created by the brain only when using two stereo loudspeakers. This fact is, in my opinion, the biggest flaw with Dr. Toole's and Olive's Spinorama. Their statement is that mono is good enough to draw conclusions about the sound of a loudspeaker. This is also true, I believe, if we are talking about only one loudspeaker.
Dr. Toole stipulates that the stereo system is seriously flawed, and he is also correct about that.

But... you most often don't listen to mono with only one speaker; you always have two speakers to make the stereo illusion real for the brain. The Spinorama doesn't account for this fact.

Much more investigation is needed into phantom images when using two loudspeakers in stereo, in my opinion.

Dr. Toole and Olive have done a great job of establishing what makes a loudspeaker sound good, at least in mono, and maybe also in a multichannel home cinema setup.
The reason mono is used is precisely for that reason: It deprives the brain of additional information. With two speakers or more, the auditory center can use the additional information to "interpolate" and flaws in the speakers' tonality are less obvious.

When it comes to imaging, the elephant in the room is crosstalk with conventional 2-channel stereo speakers. As far as I can tell, in most systems this will wipe out pretty much all the ITDs and corrupt the ILDs to varying degrees, though room treatment and other variables will factor in. Nevertheless, in many situations you will never really perceive the sound sources in the recording as distinct entities like they are in real life. Instead you end up with a phantom image containing the various sound sources within it. Headphones help somewhat in that you have spatial effects and instrument separation, but the weak or absent HRTFs hurt imaging there as well, with everything more or less being imaged just behind the head. In my experience, crosstalk cancellation fixes all the issues with stereo setups and allows for real, lifelike imaging with accurate width, depth, and even height given some form of convolution with suitable filters, but the huge catch is all the baggage it comes with. Variability between listeners will likely factor in as well. With all that in mind, it's impossible to really develop any sort of hard and fast metric for conventional imaging itself, since it will be so variable given the room, listener, speaker configuration, etc. It would just be perfect imperfection. The only real choice you have is the width of the listening window and what listeners prefer for various situations (e.g. professional monitors, hi-fi, home cinema, etc.), and that seems to be adequately captured in the current regimens done via the speaker's directivity measurements. As for the rest of it, we will have to wait for some clever new solution that can be objectively and quantitatively defined in the measurements.
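To put a rough number on the crosstalk path described above, here is a minimal sketch using the Woodworth spherical-head approximation for interaural time difference. The head radius, speed of sound, and the ±30 degree speaker angle are assumptions chosen for illustration, not values from this thread, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
HEAD_RADIUS = 0.0875    # m, a commonly quoted average; an assumption here


def woodworth_itd(azimuth_deg, head_radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate ITD in seconds for a distant source at the given azimuth.

    Uses the Woodworth rigid-sphere formula (r / c) * (theta + sin(theta)),
    intended for azimuths between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return head_radius / c * (theta + math.sin(theta))


# Speaker at 30 degrees off centre, as in an equilateral stereo triangle (assumed).
itd_us = woodworth_itd(30.0) * 1e6
print(f"Far-ear arrival lags near-ear arrival by about {itd_us:.0f} microseconds")
```

Under these assumptions each speaker's signal reaches the opposite ear a few hundred microseconds after it reaches the near ear, which is the acoustic crosstalk that cancellation schemes try to remove. A real head is not a sphere, so treat the figure as an order-of-magnitude guide only.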
 

Alice of Old Vincennes

Major Contributor
Joined
Apr 5, 2019
Messages
1,426
Likes
920
To my knowledge, Harman's Spinorama related double-blind studies have not been reproduced and confirmed by other independent researchers in other listening rooms.

Can the findings be extrapolated to rooms other than Harman's specific listening room?

What are the acoustic properties of the room in which the Spinorama model is created? Absorbent surfaces, reflective surfaces, etc.?
Exact size, speaker position and listening position? Are passive comparison speaker positions and relative acoustic properties always the same?

View attachment 227707
Is this a typical listening room?

In the study below, Sean Olive admits that the method is limited to this room and mainly box speakers. No other rooms have been evaluated. 70 different box speakers are the foundation of the study. One dipole speaker, Martin Logan, was tested.
Floyd Toole has pointed out on several occasions that the only dipole speaker in the study, Martin Logan, measured poorly in the physical and psychological dimensions. Even the direct sound was strongly deviant.
The room is clearly optimized for box speakers. Dipole speakers and omnipole speakers will not create optimal reflections in this room.

For me, as an amateur at speaker measurements with some knowledge of how the brain reacts to early broad-spectrum reflections behind the evaluated speaker from the other speakers, it is not surprising that, in the physical dimension, destructive interference occurs with the direct sound which the brain cannot compensate for in the neurophysiological dimension.

I foresee great potential in a completed general spinorama model where each unique room's size and acoustic properties are factored into the calculation, to be able to predict how each spinorama-measured speaker will sound in each unique listening room. The measurement results must be supplemented by taking into account some crucial neurophysiological and neuropsychological aspects of how we hear in rooms to create a complete spinorama algorithm.
Completing the algorithm with neurophysiological and neuropsychological data is not particularly difficult.

A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II - Development of the Model
Sean E. Olive, AES Fellow

Brother.
 

Alice of Old Vincennes

Major Contributor
Joined
Apr 5, 2019
Messages
1,426
Likes
920
The test conditions in Harman's room are different from the set-up conditions of "typical listening rooms". Whether or not these differences matter is subject to debate. I'm of the (apparently minority) opinion that they do matter.

For one thing, the abnormally-long lateral reflection paths result in the first sidewall reflections arriving significantly later than they would in a normal room. My understanding is that the arrival time and intensity of the first lateral reflections play a role in both perceived sound quality and perceived spatial quality, and the Harman test conditions increase the arrival time and reduce the intensity of these reflections relative to a "typical listening room" situation.

I'm not saying this makes a big difference, but in my opinion it does make a worth-paying-attention-to difference.
Gee. If only Floyd had your input.
 

Alice of Old Vincennes

Major Contributor
Joined
Apr 5, 2019
Messages
1,426
Likes
920
For a multichannel listening room ("home theater") --- kitchen? Unlikely. Bedroom? Unlikely. Living room, yeah. Or a dedicated room.

Let's quote the conclusion:



seems pretty reasonable and useful to me.

I'd advise all the 'minority opinions' to also take a look at Floyd Toole's home setup.
Main problem with open design "modern" homes is cheap drywall expanses. I'd rather listen to music in the county jail visitor's center with cinder block.
 

phoenixdogfan

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
3,335
Likes
5,236
Location
Nashville
Thanks for the encouragement! I have two questions:

1. I presume the Atmos channels can be up-mixed from conventional two-channel stereo... how well does that work, in your opinion?

2. Are you planning to attend the Capital Audio Fest this November?
I think the best upmixer by far is not an Atmos, but rather the Auromatic mixer which can upmix anything from two channel to 7.1 to Auro 3D. It's really very good.
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,469
Likes
2,466
Location
Sweden
And, at that time, many thought so.

I can just also add that the notion "they being here" is a bit wrong. It is rather "me and my room being there", with the room acting as a lounge to the musical event.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,197
Likes
3,767
Main problem with open design "modern" homes is cheap drywall expanses. I'd rather listen to music in the county jail visitor's center with cinder block.
Whether those expanses are kept free of reflecting, absorbing, and diffusing objects, is entirely the homeowner's choice. And as gypsum drywall has been the material of choice for well on 50 years now in the west, I suggest many listeners have managed to live with it, acoustically.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,197
Likes
3,767
I think the best upmixer by far is not an Atmos, but rather the Auromatic mixer which can upmix anything from two channel to 7.1 to Auro 3D. It's really very good.
I was a big fan of the old Dolby Pro Logic II (Music) upmixer, though alas it has been deprecated on most AVRs in favor of the (arguably inferior for music) Dolby Surround Upmixer.

Gene DellaSala of Audioholics has done a series of videos on upmixers. Here's an early one in the series, comparing several different brands.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,579
Likes
3,896
Location
Princeton, Texas
Gee. If only Floyd had your input.
Limitations of a study are not flaws in the study; they are simply aspects that may be worth being aware of. Limitations are virtually inevitable.

And neither Floyd Toole nor Sean Olive have any need for my input in order to acknowledge the existence of limitations. From Sean Olive's paper entitled "A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II – Development of the Model", emphasis mine:

"LIMITATIONS OF MODEL

"The conclusions of this study may only be safely generalized to the conditions in which the tests were performed. Some of the possible limitations are listed below.

"1. Up to this point, the model has been tested in one listening room.

"2. The model doesn't include variables that account for nonlinear distortion (and to a lesser extent, perceived spatial attributes).

"3. The model is limited to the specific types of loudspeakers in our sample of 70.

"4. The model's accuracy is limited by the accuracy of the subjective measurements.
 

krabapple

Major Contributor
Forum Donor
Joined
Apr 15, 2016
Messages
3,197
Likes
3,767
Limitations of a study are not flaws in the study; they are simply aspects that may be worth being aware of. Limitations are virtually inevitable.

And neither Floyd Toole nor Sean Olive have any need for my input in order to acknowledge the existence of limitations. From Sean Olive's paper entitled "A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II – Development of the Model", emphasis mine:

"LIMITATIONS OF MODEL

"The conclusions of this study may only be safely generalized to the conditions in which the tests were performed. Some of the possible limitations are listed below.

"1. Up to this point, the model has been tested in one listening room.

"2. The model doesn't include variables that account for nonlinear distortion (and to a lesser extent, perceived spatial attributes).

"3. The model is limited to the specific types of loudspeakers in our sample of 70.

"4. The model's accuracy is limited by the accuracy of the subjective measurements.


All valid, but I believe Harman's multichannel listening room design postdates this?
 
OP
Neuro

Neuro

Member
Joined
May 23, 2019
Messages
65
Likes
93
Location
Sweden
The test conditions in Harman's room are different from the set-up conditions of "typical listening rooms". Whether or not these differences matter is subject to debate. I'm of the (apparently minority) opinion that they do matter.

For one thing, the abnormally-long lateral reflection paths result in the first sidewall reflections arriving significantly later than they would in a normal room. My understanding is that the arrival time and intensity of the first lateral reflections play a role in both perceived sound quality and perceived spatial quality, and the Harman test conditions increase the arrival time and reduce the intensity of these reflections relative to a "typical listening room" situation.
_____________________________________________________________________________________________________
_____________________________________________________________________________________________________
MLL Dimensions
Length 9.14 m
Width 6.58 m
Height 2.59 m
Floor Area 60.20 m²
Volume 155.92 m³
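As a quick sanity check on the listed figures, the short sketch below recomputes floor area and volume from the three dimensions. It assumes only the numbers quoted above; the variable names are illustrative.

```python
# Dimensions as listed above, in metres.
length_m, width_m, height_m = 9.14, 6.58, 2.59
listed_area_m2 = 60.20
listed_volume_m3 = 155.92

# Area and volume recomputed from the rounded dimensions.
area_m2 = length_m * width_m
volume_m3 = area_m2 * height_m

# Volume recomputed from the listed area instead of the rounded length and width.
volume_from_listed_area_m3 = listed_area_m2 * height_m

print(f"area from dimensions:    {area_m2:.2f} m^2 (listed {listed_area_m2:.2f})")
print(f"volume from dimensions:  {volume_m3:.2f} m^3 (listed {listed_volume_m3:.2f})")
print(f"volume from listed area: {volume_from_listed_area_m3:.2f} m^3")
```

The recomputed values differ from the listed ones only by amounts that rounding the dimensions to two decimals can explain, so the figures are mutually consistent.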
____________________
1662657695927.png

Harman room at the time of the study.
https://www.harman.com/documents/HarmanWhitePaperMLLListeningLab_0.pdf
_________________________________________________________________________________

1662651191545.png

April 1971, Journal of Sound and Vibration 15(4):475-494

I'm not an expert on acoustics, but I have some insight into how the brain reacts to different stimuli. In rooms like Harman's, we primarily hear the direct sound, where the reflections provide a secondary coloration. Psychologically, only certain lateral reflections give a positive coloring. Reflections from the front wall, back wall and ceiling all give a more negative perception of the direct sound.
Barron's elegant figure clearly shows the window for optimal delay and perceived loudness of lateral reflections for classical music.
In Harman's room, reflections with negative coloring dominate the direct sound. The positive coloring of the direct sound from lateral reflections is negligible.
The ranking is likely not affected by box speakers with similar dispersion patterns.
Are the spinorama target curves with regard to early reflections and in-house curves extrapolated from the perceived best speaker results?
Are the in-house curves a summation of direct sound and reflected sound in this special room? If so where are they measured and over what time window?
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,579
Likes
3,896
Location
Princeton, Texas
View attachment 229524
April 1971, Journal of Sound and Vibration 15(4):475-494

I'm not an expert on acoustics, but I have some insight into how the brain reacts to different stimuli. In rooms like Harman's, we primarily hear the direct sound, where the reflections provide a secondary coloration. Psychologically, only certain lateral reflections give a positive coloring. Reflections from the front wall, back wall and ceiling all give a more negative perception of the direct sound.
Barron's elegant figure clearly shows the window for optimal delay and perceived loudness of lateral reflections for classical music.
GREAT post!

That figure from Barron's study clearly shows that lateral reflections arriving after less than 5 milliseconds delay will shift the image. My understanding is that these early lateral reflections can also be a source of coloration, which is not clear to me from the figure.

Suppose we have a speaker placed about three feet from the side wall, which is not uncommon in home audio. Its sidewall reflection will arrive about 5 milliseconds behind the direct sound. How much image shift and coloration we get from that reflection depends on how loud it is and what its spectral content is.

On the other hand in the Harman room, the long path length for the first lateral reflections increases their time delay (which is desirable) and reduces their relative loudness (which is also desirable). Finally, the greater horizontal angular separation of source and reflection in the Harman room reduces the coloration effects of the reflection (again, desirable).
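The delay and level reasoning above can be sketched with the image-source method: mirror the speaker across the side wall and compare the mirrored path to the direct path. This is a minimal sketch under stated assumptions: a perfectly reflecting wall, an omnidirectional source with 1/r spreading only, and hypothetical speaker and listener positions. In particular, the "wide room" layout is not taken from the Harman white paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
FT = 0.3048             # metres per foot


def sidewall_reflection(speaker, listener, wall_x=0.0):
    """First-order side-wall reflection via the image-source method.

    speaker and listener are (x, y) positions in metres; the wall is the
    line x = wall_x. Returns (delay in ms, level re direct sound in dB).
    """
    sx, sy = speaker
    lx, ly = listener
    direct = math.hypot(lx - sx, ly - sy)
    # Mirror the speaker across the wall to obtain the image source.
    image_x = 2.0 * wall_x - sx
    reflected = math.hypot(lx - image_x, ly - sy)
    delay_ms = (reflected - direct) / SPEED_OF_SOUND * 1000.0
    level_db = 20.0 * math.log10(direct / reflected)
    return delay_ms, level_db


# Hypothetical home layout: speaker 3 ft from the wall, listener 3 m in
# front of the speaker plane and 2.1 m from the same wall.
delay, level = sidewall_reflection((3 * FT, 0.0), (2.1, 3.0))

# The extra path can never exceed twice the speaker-to-wall distance.
upper_bound_ms = 2 * 3 * FT / SPEED_OF_SOUND * 1000.0

# Hypothetical wide-room layout: speaker 2.0 m from the wall, listener on
# the centre line of a 6.58 m wide room.
delay_wide, level_wide = sidewall_reflection((2.0, 0.0), (3.29, 3.0))

print(f"home: {delay:.1f} ms, {level:.1f} dB (upper bound {upper_bound_ms:.1f} ms)")
print(f"wide: {delay_wide:.1f} ms, {level_wide:.1f} dB")
```

With these assumed positions the home layout gives a delay of roughly 3 ms, while the limiting case for a speaker 3 ft from the wall is a little over 5 ms, so the "about 5 milliseconds" figure sits near the upper end of what the geometry allows. Moving the wall farther away increases the delay and lowers the relative level, which is the trend described for the larger room. Real walls absorb and real speakers are directional, so actual levels will differ.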

In my opinion this long path length disproportionately benefits a speaker whose first lateral reflection would have been particularly detrimental either because of its strength or its spectral balance.

On the other hand the benefits of a design which deliberately minimizes the strength of that first lateral reflection and/or which pays particular attention to getting its spectral balance correct will not show up very well in the Harman room.

I'm not saying this first lateral reflection is of overwhelming importance to timbre, but it is relatively important if precise image localization is a high priority (which is one of the reasons why control room acoustic design goes to great lengths to prevent or minimize that first lateral reflection).

These are not FLAWS in the Harman approach!! But they are inherent limitations, which all studies have.

In my opinion.

Here is an example of a speaker designed to work well in close proximity to a side wall. Underneath the grill are a 12" midwoofer and a 90-degree constant-directivity (in the horizontal plane) horn. This is one of my discontinued models.

PhantomCenter-002.jpg
 

preload

Major Contributor
Forum Donor
Joined
May 19, 2020
Messages
1,559
Likes
1,704
Location
California
Limitations of a study are not flaws in the study; they are simply aspects that may be worth being aware of. Limitations are virtually inevitable.

And neither Floyd Toole nor Sean Olive have any need for my input in order to acknowledge the existence of limitations. From Sean Olive's paper entitled "A Multiple Regression Model for Predicting Loudspeaker Preference Using Objective Measurements: Part II – Development of the Model", emphasis mine:

"LIMITATIONS OF MODEL

"The conclusions of this study may only be safely generalized to the conditions in which the tests were performed. Some of the possible limitations are listed below.

"1. Up to this point, the model has been tested in one listening room.

"2. The model doesn't include variables that account for nonlinear distortion (and to a lesser extent, perceived spatial attributes).

"3. The model is limited to the specific types of loudspeakers in our sample of 70.

"4. The model's accuracy is limited by the accuracy of the subjective measurements.
THANK YOU. HALLELUJAH! Someone else here gets it and is capable of interpreting the published work on this topic.
 

tuga

Major Contributor
Joined
Feb 5, 2020
Messages
3,984
Likes
4,285
Location
Oxford, England
Question: If you take a human voice singing live in a virtual room about 2-3 metres behind your front wall, what would the dispersion vs frequency from that voice look like in your room? Wide up to 8 kHz?

Does this help?

TUo8UQf.jpg

Human voice - horizontal dispersion
 

Thomas_A

Major Contributor
Forum Donor
Joined
Jun 20, 2019
Messages
3,469
Likes
2,466
Location
Sweden
Does this help?

TUo8UQf.jpg
Thanks,

not sure though what the different colours represent?

Why I am interested in this is because of the model using "me and my room being there" at the musical event. Placing a room as a lounge with an opening to the venue of the musical event would imply that all sound that emerges from the musicians, instruments and reflections from that venue will reach the listener directly, as well as via reflections from boundaries of the lounge room.

In a reproduction situation, you have captured the sound from the microphones and reflections at the recording venue, and now have the choice to listen to this by different means. Using headphones, you are canceling the reflections from the lounge room and you will not be able to sense natural directions of the event itself (unless the sound is processed somehow). In the next situations, using very directive speakers in a stereo setup or speakers in an anechoic room, you still risk the "in-the-head" sound. Neither headphones nor high directivity would mimic the lounge situation. To mimic the live lounge situation you must use speakers with sufficient dispersion to get "natural" reflections from the lounge room boundaries (or use a multi-channel setup). This holds even though a reproduction that includes additional reflections is not true to the recording itself, but it "may" be true to what the studio worker heard, "if" he or she sat in a similar reflective room during evaluation. When the studio worker or the consumer listens to the recording via traditional and accurate headphones, it is in one way the truth of what is on the recording, but quite far from the event itself.

The dispersion of acoustic instruments differs very much, but the most familiar instrument to us is the human voice. A wide dispersion up to 6000 Hz, as the pictures show, would mean that a singing voice 2 meters into the event room would produce strong reflections from the walls in the lounge room quite high up in frequency. Almost constant to 6000 Hz +/-60 degrees? There are no scale indications though....
 
OP
Neuro

Neuro

Member
Joined
May 23, 2019
Messages
65
Likes
93
Location
Sweden
GREAT post!

That figure from Barron's study clearly shows that lateral reflections arriving after less than 5 milliseconds delay will shift the image. My understanding is that these early lateral reflections can also be a source of coloration, which is not clear to me from the figure.

Suppose we have a speaker placed about three feet from the side wall, which is not uncommon in home audio. Its sidewall reflection will arrive about 5 milliseconds behind the direct sound. How much image shift and coloration we get from that reflection depends on how loud it is and what its spectral content is.

On the other hand in the Harman room, the long path length for the first lateral reflections increases their time delay (which is desirable) and reduces their relative loudness (which is also desirable). Finally, the greater horizontal angular separation of source and reflection in the Harman room reduces the coloration effects of the reflection (again, desirable).

In my opinion this long path length disproportionately benefits a speaker whose first lateral reflection would have been particularly detrimental either because of its strength or its spectral balance.

On the other hand the benefits of a design which deliberately minimizes the strength of that first lateral reflection and/or which pays particular attention to getting its spectral balance correct will not show up very well in the Harman room.

I'm not saying this first lateral reflection is of overwhelming importance to timbre, but it is relatively important if precise image localization is a high priority (which is one of the reasons why control room acoustic design goes to great lengths to prevent or minimize that first lateral reflection).

These are not FLAWS in the Harman approach!! But they are inherent limitations, which all studies have.

In my opinion.

Here is an example of a speaker designed to work well in close proximity to a side wall. Underneath the grill are a 12" midwoofer and a 90-degree constant-directivity (in the horizontal plane) horn. This is one of my discontinued models.

View attachment 229546
The floor reflection is debated. Toole suspects that, for evolutionary reasons, floor reflections do not affect the coloration of direct sound.
To my mind, the floor reflection possibly causes no coloration of the direct sound, acting instead through image shift according to the summing localization principle in the vertical plane.
It is a well-known fact that localization in the vertical plane is significantly worse than in the horizontal plane. Summing localization probably operates in the vertical plane over a longer time window and at different sound intensities than in the horizontal plane. I can't recall reading anything about this in any study. Maybe someone else has a tip.
As I see it, summing localization gives no coloration. Summing localization is a psychological variant of image shift which determines the horizontal localization during stereo listening. When perceiving stereo localization, there is no coloration of the direct sound.
The floor reflection will move the sound sources slightly closer to the floor through summing localization.

There is consensus that the lateral reflection is the most important reflection in a room for normal-hearing individuals. An optimized lateral reflection provides maximum spatial experience.
In individuals with mild hearing loss, which is common among audiophiles, lateral reflections optimized for normal hearing can result in an impaired perception of the direct sound.
The characteristics of the sound source determine how we best perceive the direct sound. Click sounds, speech, and dense music are each optimized at different delay times, sound intensities, frequency spectra, etc. The arrival angle of the reflected sound in the horizontal plane is not insignificant.

I must emphasize that the spinorama of today has been a giant step towards a more objective description of box speakers in rooms. I agree with Olive that the spinorama can be further optimized through studies in other types of rooms.
 

Duke

Major Contributor
Audio Company
Forum Donor
Joined
Apr 22, 2016
Messages
1,579
Likes
3,896
Location
Princeton, Texas
Why I am interested in this is because of the model using "me and my room being there" at the musical event. Placing a room as a lounge with an opening to the venue of the musical event would imply that all sound that emerges from the musicians, instruments and reflections from that venue will reach the listener directly, as well as via reflections from boundaries of the lounge room.

This sounds to me like you are anticipating a perceptual synthesis of the two venues, the recording venue and the playback room, such that the playback room is in effect "a lounge with an opening to the venue of the musical event", presumably the cues of BOTH rooms being in play perceptually at the listening position.

In my opinion if the venue cues on the recording are perceptually dominant, you don't really get a "synthesis" of the two venues; instead you perceive the one and not the other. I realize this is somewhat counter-intuitive, and I don't have any directly applicable examples, but let me offer an example from a different area of psychoacoustics:

I assume you are familiar with the "Yanni vs Laurel" recording? If not, it's easy to Google.

Regardless of whether you perceive "Yanni" or "Laurel", BOTH SETS OF CUES are present! Your ear/brain system selects one set of cues as being the dominant set, and largely or entirely suppresses your awareness of the other set. I think something similar happens in the playback room, in situations where the venue acoustics in the recording are selected by the ear/brain system as being the more plausible set of acoustic environment cues.
 