
Imaging and Soundstage Test Metrics

BenB

Speaker testing is predominantly done with a single speaker, including the measurements and listening tests. We don't have quantitative ways to compare the imaging and soundstage potential of different speakers. We understand (or theorize) that level and phase matching between the stereo speakers will impact imaging, and also that the dispersion pattern (and its interaction with the room) will impact soundstage (as well as imaging to some degree).

Given the opportunity to do stereo A/B testing, we can make comparative assessments such as "speaker A images better than speaker B", or "Speaker B has a wider soundstage than speaker A". However, it would be great to be able to put these attributes on a fixed scale to facilitate comparisons between speakers without having them in the same place at the same time.

I think that if we pool our knowledge and creativity, we may actually be able to crack this nut.

As a step toward this, I have created a prototype (almost certainly flawed, or at least not optimized) scorable imaging test. The mp3 that I have linked modulates the breadth of pink noise sources over time. At first, the breadth spans the entire range between the speakers. By 5 seconds it collapses completely to mono. At 10 seconds it expands to 80% of the range between the speakers. At 15 seconds it collapses to mono again. At 20 seconds it spans 64% of the speaker separation. This modulation continues, with 5, 15, 25, 35, 45 (anything ending in 5) being completely mono. At 0, 10, 20, 30 (anything ending in 0), the breadth reaches a local maximum, but each maximum is 20% narrower than the one before (100%, 80%, 64%, 51%, 41%, 33%, etc.).
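The schedule described above can be sketched in a few lines. This is only an illustration of the timing and the 20% decay. The linear ramp between peaks and mono, and the names `peak_width` and `width_at`, are assumptions, since the post does not say how the transitions in the actual mp3 are shaped.

```python
PERIOD_S = 10.0   # one wide -> mono -> wide cycle, per the description
DECAY = 0.8       # each peak is 20% narrower than the previous one


def peak_width(n: int) -> float:
    """Width at the n-th peak (t = 10*n seconds); 1.0 = full speaker separation."""
    return DECAY ** n


def width_at(t: float) -> float:
    """Width at time t in seconds, ASSUMING linear ramps between peaks and mono."""
    cycle, phase = divmod(t, PERIOD_S)
    n = int(cycle)
    half = PERIOD_S / 2
    if phase <= half:
        # Collapsing from peak n down to mono at the "5" mark.
        return peak_width(n) * (1 - phase / half)
    # Expanding from mono up toward peak n + 1 at the next "0" mark.
    return peak_width(n + 1) * (phase - half) / half


peaks = [round(100 * peak_width(n)) for n in range(6)]
print(peaks)  # [100, 80, 64, 51, 41, 33]
```

The printed percentages match the sequence quoted in the post, which is the only part of this sketch that the post itself confirms.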

At some point, the listener will no longer be able to discern a difference between the mono noise (on the 5s), and the broadened noise (on the 0s). The longer it takes for this difference to become imperceptible, the better the speaker's imaging is. Of course, the room (and probably the listener) will also impact the result, but this might allow a decent level of consistency if performed by the same listener (Amir?) in the same room.

I believe there are 15 modulations in the mp3 linked below, so there's a potential of scoring 15 points. To get the score, simply divide the time at which the changes in noise breadth became imperceptible by the 10-second period. There will probably be something of a training effect if people actually try this, as they get better at listening for and identifying the breadth changes.
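As a worked example of the scoring rule, here is a minimal sketch. The function name `imaging_score` and the clamping to the 0 to 15 range are assumptions added for illustration; the post only specifies dividing the time by the 10-second period.

```python
PERIOD_S = 10.0   # modulation period from the post
MAX_SCORE = 15.0  # the post estimates 15 modulations; treated as an assumption


def imaging_score(t_imperceptible_s: float) -> float:
    """Score = time at which the width change became inaudible, divided by the period."""
    score = t_imperceptible_s / PERIOD_S
    # Clamping is an added assumption so out-of-range times give a sane score.
    return max(0.0, min(MAX_SCORE, score))


print(imaging_score(50.0))  # 5.0
```

So a listener who loses the difference at 50 seconds would score 5, and one who can still hear it at 100 seconds would score 10.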

An improved version would use a computer program to challenge the listener to do ABX testing with one perfectly mono (matched) noise source, and one with varying levels of distortion (differences) between the channels. The program would know the extent of the differences, and would be able to determine how much distortion was perceptible to the listener. A system with good imaging would allow the listener to differentiate even minor distortions from perfection.
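The post does not say how such a program would decide whether a given mismatch was "perceptible". One common choice, offered here purely as an assumption, is a one-sided binomial test on the ABX trials at each mismatch level: how likely is it that a listener who is only guessing would get at least this many right? The name `abx_p_value` is hypothetical.

```python
from math import comb


def abx_p_value(correct: int, trials: int) -> float:
    """Probability of getting at least `correct` of `trials` right by pure guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Each ABX trial is a 50/50 guess under the null hypothesis.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


print(abx_p_value(9, 10))  # 0.0107421875
print(abx_p_value(5, 10))  # 0.623046875
```

Under this sketch, 9 out of 10 correct would be hard to explain by guessing, while 5 out of 10 is exactly what guessing would be expected to produce. The smallest mismatch level that still yields a low p-value would then be the listener's result for that system.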

Thoughts, comments, help?

 
Of course, the room (and probably the listener) will also impact the result,
Those are PROBABLY MORE IMPORTANT than the speaker. And in "real life", when you're not playing a test file, so is the recording and its production, of course.

Unless you have an "unusual" speaker like a dipole, omnidirectional, constant-directivity, etc. But that all affects how the speaker interacts with the room, so you'd have to swap out speakers in the same room.
 
An interesting idea. I often do a simple panning test on a mono signal to see how far I can get it to pan left and right. When things are working well, it will pan equally well to the left and right and have clear positions everywhere else along the way. Listening to your test, I get the impression about 30 seconds into it that I'm hearing a relatively constant mono center while the wide sound is superimposed on the background. It doesn't seem to be one sound getting narrower and wider; it's just the uncorrelated signal getting quieter relative to the center mono signal. I'm not sure how you made the signal, or what that might say about how my system is imaging. I have noticed that some music sounds like just 3 main things: sounds hard panned left, sounds hard panned right, and dead center. It makes me worry a little, so I'm relieved when I put something on that proves my system and room can indeed place images everywhere in between too.
 
I forget what system I was using, but I had a similar perception to what you describe at one point. I don't think it invalidates the efficacy of the test. A better imaging system should have a more precise center image, which should be distinguishable from the "wide sound" for longer. I won't go into the boring details of how I made it, but I will say that the level is remarkably consistent over time, which you can verify by viewing the waveform in something like Audacity. So in order for the "wide sound" to modulate, the mono sound can't actually be constant.

I have been meaning to come back to this and perform 2 trials: 1 with my speakers, and 1 where I purposely apply different EQ to each channel to make them less well matched. Then I can compare the scores.

I'd love to hear if anyone actually performed the test as described, and bonus points for doing it again with EQ to mismatch the channels.
 
This is a great effort, but I think you'd want to test speakers under the same conditions, if the goal is to evaluate speaker designs.

However, this test does have some use for evaluating rooms+speakers together, and could yield useful comparative data.

One added thing: I wouldn't presume that we know much about how stereo speakers generate an image. It's easy to make assumptions about what performance characteristics produce this effect, but there is very little data. One thing to consider is the fact that different recordings encode spatial information in different ways. There's phase difference and magnitude difference, for one. Also, I find steady-state tones less localizable than percussive tones, and higher pitches more localizable than lower. Room effects and physiology both probably play into that.

I listened to the tones you generated and they do smoothly sweep between the sides and center. I think they could be the basis of a useful metric. However, I think you are right about A+B testing. I like the idea of scoring an individual speaker, but maybe it should be more like a test: is tone A or B wider? When you start getting them wrong, that identifies the threshold for that speaker. This assumes that good imaging speakers are good at identifying slightly off-center sounds, but it's also about overall breadth.
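A minimal sketch of how a "which one is wider?" test could turn into a threshold. Everything specific here is an assumption for illustration: the function name `width_threshold`, the 75% correct criterion, and the example numbers, which are made up and not measurements from anyone in this thread.

```python
from typing import Dict, Optional, Tuple


def width_threshold(
    results: Dict[float, Tuple[int, int]], criterion: float = 0.75
) -> Optional[float]:
    """Smallest width difference still identified reliably.

    `results` maps a width difference to (correct answers, trials).
    Widths are checked from largest to smallest, stopping at the first
    one where the listener falls below `criterion`.
    """
    threshold = None
    for width in sorted(results, reverse=True):
        correct, trials = results[width]
        if trials == 0 or correct / trials < criterion:
            break
        threshold = width
    return threshold


# Hypothetical tallies for one listener, one system, one room.
example = {0.64: (10, 10), 0.41: (9, 10), 0.26: (8, 10), 0.17: (6, 10), 0.11: (5, 10)}
print(width_threshold(example))  # 0.26
```

A lower threshold would mean the listener could still tell the sounds apart at smaller width differences, which is the property the test is trying to capture.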
 
Oh, and I tested it on my desktop system, a pair of JBL LSR 305ii or whatever. Differences became very subtle at 50s. Changing head position would probably make the effect more or less audible. I know for a fact this pair has poor matching between L+R.
 
I did the test and got 10 with the speakers equal, and a 9 with some disruptive EQ. I might say 11 and 10, but it wasn't obvious. I could clearly still hear it at 10 and 9. The refrigerator came on somewhere during all that and I'm not playing it super loud, so that might have done something. Listening more carefully on these last two, from start to finish I hear the uncorrelated pink noise always coming in at full width and superimposing itself against the mono center, which is also going up and down in volume. I definitely do not perceive it as a single sound growing wider and narrower.
 
Maybe a few tests need to be developed. Overall width, center specificity, center stability, etc.
 
Are you familiar with the mid/side recording technique, or the (Haar) discrete wavelet transform? In both cases, the main idea is decomposing things into a component that's the same, and a component that's different. I believe I was listening with headphones when I had a similar perception to yours. Any stereo sound can be decomposed into those components. Still, I make no promises that I have realized the optimum manifestation of this test concept. I could certainly be convinced to try other things (assuming I find time).
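For anyone unfamiliar with the decomposition, here is a minimal sketch of the idea: the "same" part is the average of the two channels (mid) and the "different" part is half their difference (side). This is not claimed to be how the test file was actually made. The function names, including `set_width`, are hypothetical, and unlike the real file (whose level is described as consistent over time) this sketch makes no attempt to renormalise the level when the side component is scaled.

```python
def to_mid_side(left, right):
    """Split stereo samples into a common (mid) and a differing (side) component."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Rebuild left/right: L = mid + side, R = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def set_width(left, right, width):
    """Scale the side component: 0.0 gives pure mono, 1.0 leaves the signal unchanged."""
    mid, side = to_mid_side(left, right)
    return from_mid_side(mid, [width * s for s in side])


left = [1.0, 0.5, -0.25]
right = [0.0, 0.5, 0.75]
mono_l, mono_r = set_width(left, right, 0.0)
print(mono_l == mono_r)  # True
```

The Haar transform applies the same sum-and-difference step to pairs of values, normally with a 1/sqrt(2) scale factor rather than the 1/2 used here.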
 
Speaker testing is predominantly done with a single speaker, including the measurements and listening tests. We don't have quantitative ways to compare the imaging and soundstage potential of different speakers. We understand (or theorize) that level and phase matching between the stereo speakers will impact imaging, and also that the dispersion pattern (and its interaction with the room) will impact soundstage (as well as imaging to some degree).
....

My other big gripe with speaker testing these days is that bookshelf speakers are not measured with the lower frequencies offloaded (which improves their behavior).
 
With MS recording the sides aren't really different, just phase reversed. There's a "song" on Spotify called "phase test" with dialog and drums. When they're out of phase, it sounds like it's almost coming from the side walls, where the sound from the speakers would be reflecting.
 