
A Broad Discussion of Speakers with Major Audio Luminaries

Recording, especially microphone choice and placement, is an art.
 
The concept of a microphone pickup of any such instrument being "In phase" is hard to understand.

As mentioned previously, I don't think it makes sense to talk about anything being ´in phase´ when it comes to non-transient instrument sounds or without meaningful lower bass. However complex these sounds may be and whatever happens to group delay and phase, would most probably not audibly change the recorded sound, or the resulting reverb pattern would dominate what we actually perceive of that instrument in the concert hall (similar to the balloon pop experiment which is transient but does not have a lot of lower bass).

The picture changes with timpani, big drums, kick drums and in some cases picked double bass. These combine enough lower bass level with transient behavior over time, so group delay in the room, with recording, mixing or home reproduction, might actually exceed the audibility thresholds. Most recording engineers have their own personal tricks to make a kick drum ´kicking´ or timpani perceived as transient and fast. If not, you will definitely hear it in the mix, even via headphones.

Loudspeaker transducers, individually over their operating frequency ranges, behave as minimum-phase devices.

What about bandpass subwoofers of higher order? They consist of several Helmholtz resonators or high-pass filters, each with its own minimum-phase behavior, but when the soundwaves of several sources get blended, the group delay distortion does not disappear just by restoring flat frequency response.

Have you ever listened to a Bose Acoustimass 8th order bandpass type subwoofer and would you consider these sounding like a minimum-phase system? If I recall it correctly, they employ 3 ports as Helmholtz resonators, partly in cascaded manner.
 
Strong room resonances also behave like minimum-phase systems, so they too respond to matched equalization, but only at one point in a room. The room is not changed, but the sounds delivered through it are.
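The "matched equalization" idea in the quote above can be sketched numerically. This is a minimal illustration, assuming an idealised second-order resonance; the 50 Hz centre frequency, Q of 8 and 9 dB height are hypothetical values, not measurements of any room or speaker. When the equalizer mirrors the resonance exactly, both magnitude and phase return to flat; with a mismatched Q, a residual remains in both.

```python
import cmath
import math

F0, Q, GAIN_DB = 50.0, 8.0, 9.0  # hypothetical resonance: +9 dB at 50 Hz, Q = 8

def peak(f, f0, q, gain_db):
    """Second-order minimum-phase peak (gain_db > 0) or mirror-image dip (gain_db < 0)."""
    a = 10 ** (abs(gain_db) / 20)
    s = 1j * f / f0  # normalised complex frequency
    boost = (s * s + a * s / q + 1) / (s * s + s / q + 1)
    return boost if gain_db >= 0 else 1 / boost

def describe(h):
    """Return (magnitude in dB, phase in degrees) of a complex response value."""
    return 20 * math.log10(abs(h)), math.degrees(cmath.phase(h))

for f in (40.0, 47.0, 50.0, 53.0, 60.0):
    raw = peak(f, F0, Q, GAIN_DB)
    matched = raw * peak(f, F0, Q, -GAIN_DB)         # EQ with the same f0 and Q
    mismatched = raw * peak(f, F0, Q / 2, -GAIN_DB)  # EQ with the wrong Q
    print(f"{f:5.1f} Hz  raw {describe(raw)[0]:+5.2f} dB {describe(raw)[1]:+6.1f} deg"
          f" | matched {describe(matched)[0]:+5.2f} dB {describe(matched)[1]:+5.1f} deg"
          f" | mismatched {describe(mismatched)[0]:+5.2f} dB {describe(mismatched)[1]:+5.1f} deg")
```

The sketch models a single measurement point only, which is the same restriction the quote places on room resonances.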

And can someone clarify for me what "timing" means in relation to group delay and phase? If both of those are below thresholds of audibility, does this make timing acceptable?
Given that our hearing is, under certain conditions, sensitive to the shape of the waveform – not just its frequency content – and that frequency-dependent phase shifts from speakers and room acoustics can alter that shape, it seems fair to ask:

If those phase shifts cause the waveform arriving at each ear to differ sufficiently – even when the signals have the same power spectrum – due to ear spacing and the way sound propagates, especially in the range where we're most sensitive to timing and detail, isn't it possible this could lead to binaural audible artifacts?

That much, I think, seems reasonably clear from the perspectives of physical acoustics, signal processing, and basic psychoacoustics.

The real question – at least for me – is: how often can this actually happen in practice, enough to matter? In real rooms, with real speakers, under reasonably typical conditions – not just exotic edge cases.
 
Given that our hearing is, under certain conditions, sensitive to the shape of the waveform – not just its frequency content – and that frequency-dependent phase shifts from speakers and room acoustics can alter that shape, it seems fair to ask.
How do you alter the shape of a waveform without changing its frequency content?
 
How do you alter the shape of a waveform without changing its frequency content?
If you only alter the phases - in any way - it will not change the power spectrum.
 
How do you alter the shape of a waveform without changing its frequency content?
I always show this example:

3sines_inph.png 3sines_pi2ph.png

The signals differ in phase only. If the 1st one is near clipping, the other one will clip. Depending on the frequencies used, it is audible as well, even without clipping. In some cases, the phase is audible.
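The point about clipping can be reproduced with a few lines. This is only a sketch: the actual frequencies in the two plots are not stated, so it assumes three equal-amplitude harmonics (1, 2 and 3 times a fundamental), once with zero phase and once with every component shifted by pi/2. The RMS level, and hence the power spectrum, is the same, but the peak value is not.

```python
import math

def three_sines(phase, n=4800, freqs=(1, 2, 3)):
    """One period of three equal-amplitude sines that share the same phase offset."""
    return [
        sum(math.sin(2 * math.pi * f * i / n + phase) for f in freqs)
        for i in range(n)
    ]

def peak(x):
    """Largest absolute sample value."""
    return max(abs(v) for v in x)

def rms(x):
    """Root-mean-square level over the whole period."""
    return math.sqrt(sum(v * v for v in x) / len(x))

in_phase = three_sines(0.0)          # assumed stand-in for the first plot
shifted = three_sines(math.pi / 2)   # assumed stand-in for the pi/2 plot

print(f"phase 0    : peak {peak(in_phase):.3f}, rms {rms(in_phase):.3f}")
print(f"phase pi/2 : peak {peak(shifted):.3f}, rms {rms(shifted):.3f}")
```

With these assumed components the shifted version peaks noticeably higher, so a gain setting that just avoids clipping for the first signal clips the second.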
 
What about bandpass subwoofers of higher order?
I said transducers, not combined transducers and enclosures. Above low bass frequencies enclosures are well-damped and sealed - simple acoustical systems with a single low frequency resonance that, if possible, is designed to be below the used bandwidth. Woofers and subwoofers are the issue.

Sealed woofers are simple single resonance systems. Bass reflex adds one superimposed Helmholtz resonance and the combination amplitude and time performance is predictable within limits, as discussed below. More elaborate systems add more coupled Helmholtz resonances.

All systems using ports, any number, internal or external, experience turbulence in those ports that varies with sound level. Harman and others have seriously investigated designs to minimize this but, ultimately, a slug of air in a port is not a well-behaved mass. At sound levels sufficiently high to generate turbulence the effective mass is reduced and the result is a different system tuning for every sound level, and this is not equalizable. The resonance frequency varies with sound level, but one supposes, and experience shows, that over a range of useful sound levels these are not audible problems. This is why the best professional and consumer systems have gravitated to simple transducer/enclosure systems, and those designed for high sound levels are on the large side. Sealed vs. reflex arguments tend to arise with bookshelf loudspeakers, where the resonance is within the musical frequency range. But even these cease to be problems if crossed over (high pass and low pass) to a subwoofer, where, with good design the resonance can be so low that it is in the feeling rather than hearing frequency range. My system uses multiple subwoofers to address room resonances and the efficiency gain of doing so allows all of them to be small closed boxes. All other loudspeakers in the system are reflex designs, but all have been crossed over above the tuned frequency, so are essentially closed boxes.

The advantage of passive radiators is that they are truly moving masses, and any non-linearity is in the suspension of the diaphragm. Even so, at very high sound levels it will become non-linear. These are good alternatives, but cost more than ports. The fact that naive purchasers think that they are additional drivers is a marketing advantage.

I think it is reasonable to assume that one can expect significant non-ideal acoustical output from small multi-reflex systems that are designed to maximize acoustical output over a narrow bandwidth with steeply declining output at the high and low frequency extremes - meaning transient misbehaviour. But, let's face it, the intended audience for those products is assumed not to be a critical one. Right?
 
This is indeed the case and was tested by David Clark in his AES paper: Measuring Audible Effects of Time Delays in Listening Tests

Here are his conclusions:

"7 SUMMARY AND CONCLUSIONS
In this study a cascaded all-pass circuit and three full-band dual path systems were evaluated in a realistic listening environment. Two speaker mono was considered superior to the one speaker, one path mono. A reflection from a vertical surface was barely audible but a horizontal reflector was more audible. An electronic delay comb filter was highly audible and annoying. The all-pass filter was inaudible. With careful interpretation the TOS analyzer was able to show probable cause for the audibility differences. It is felt that elementary "two eared" processing would give even closer correlation."
Bear in mind that a large cascaded set of allpass filters can very closely approximate a delay. Again, the only phase shift that matters is the phase shift AFTER the delay is removed. In at least one of those famous papers, the allpass filter set very closely matched a delay. Offhand I don't recall which one.
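A rough sketch of that remark, assuming a cascade of identical first-order all-pass sections (the 20 kHz corner and the count of ten sections are hypothetical, not taken from any of the papers): well below the corner, the cascade's phase is almost entirely that of a pure delay, and the excess phase left once that delay is removed is tiny.

```python
import cmath
import math

W0 = 2 * math.pi * 20000.0  # hypothetical all-pass corner frequency, rad/s
SECTIONS = 10               # hypothetical number of cascaded sections

def allpass_phase(w, w0, sections):
    """Unwrapped phase (rad) of `sections` identical first-order all-pass stages."""
    x = 1j * w / w0
    one_stage = cmath.phase((1 - x) / (1 + x))  # equals -2*atan(w/w0)
    return sections * one_stage

def excess_phase(w, w0, sections):
    """Phase remaining after the equivalent pure delay 2*sections/w0 is removed."""
    delay = 2 * sections / w0  # low-frequency group delay of the cascade, seconds
    return allpass_phase(w, w0, sections) + w * delay

print(f"equivalent delay: {2 * SECTIONS / W0 * 1e6:.0f} microseconds")
for f in (100.0, 1000.0, 5000.0):
    w = 2 * math.pi * f
    print(f"{f:6.0f} Hz  total phase {math.degrees(allpass_phase(w, W0, SECTIONS)):8.2f} deg"
          f"  excess phase {math.degrees(excess_phase(w, W0, SECTIONS)):6.3f} deg")
```

Closer to the corner frequency the excess phase grows, so how well a given all-pass set approximates a delay depends on where its corners sit relative to the audio band.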
 
Actually Mr Toole clearly states in his document that listeners are able to listen through the room (or similar wording). So we are somehow capable of "subtracting" the room from the speaker sound.

For first arrival you can listen through the room very effectively. Short-term nonlinearities in hearing suppress early reflections.
 
If you read what I have actually said, many times, it comes down to a few simple observations. Loudspeaker transducers, individually over their operating frequency ranges, behave as minimum-phase devices. That means that an anechoic flat frequency response is a reliable indicator that there are no resonances, no associated phase shift, and no ringing. This is a good start, it seems to me, but evidence of that behaviour is only available in anechoic or equivalent measurements, not room curves. Any sound, impulses included, will be fairly treated by such devices - over their operating frequency ranges. Things can go astray in the crossover regions, which is where serious measurements, including phase, are needed to ensure a smooth summation in those regions. So, the conclusion is that frequency response is not "uber alles" but if it is wrong, nothing else may matter as much as one might think.
^Thanks^ and I got it…
However I usually get tripped up on step-response and transducers playing 180 degrees out of phase.
When the output of the speaker inverts the signals in frequency bands corresponding to the driver’s bands, then in a mathematical sense… if we difference the input (signal) with output (SPL) it seems like fidelity has been lost.

However in practice, and with engineered and mixed music, there is no guarantee that the signals from the various tracks have not been inverted.
The inversions likely matter the most when there are fewer mics, fewer tracks, and less mixing… If they matter at all.
Hence… “Frequency Response Uber Alles”… but yes compression, resonances, etc. all come in as well.
When we move towards electronics from the speakers, then it is almost always “Frequency Response Uber Alles”, as exhibited with the SINAD ranking.
There is something happening with DACs near the 20kHz side and various cutoff filters that is manifesting differences in the time domain, or something else that is allowing ABX differences to be heard.

But we have what we have with the testing equipment and methodologies…

If moderate resonances should exist in such minimum-phase transducers, they can be addressed by matched equalization based on anechoic or equivalent data, not room curves. This is why Amir's data is so useful. A loudspeaker that is "almost" really good, can be improved by equalization, especially those with well-behaved directivity.
I got it.
And yes. But if the Klippel can do the measurements and separate the direct sound from the echoes, then it certainly seems possible that the human can also do that.

Which @j_j seems to state:
For first arrival you can listen through the room very effectively. Short-term nonlinearities in hearing suppress early reflections.


Strong room resonances also behave like minimum-phase systems, so they too respond to matched equalization, but only at one point in a room. The room is not changed, but the sounds delivered through it are.

And can someone clarify for me what "timing" means in relation to group delay and phase? If both of those are below thresholds of audibility, does this make timing acceptable?
I dunno.
I am only responding to the idea that one can hear the direct sound of the speakers, and listen through the room.
This is less true when the sounds are steady state and the reflections and reverberations are mixed with the direct sound.

In impulsive music/sounds, there are no pre-existing copies of the impulse to mix in with the direct sound.
 
Of course, you're right, but my direct sound must be in phase. An instrument, a violin or a cello, playing in my room is in phase
Somebody might like to explain where one should place a microphone relative to a complex musical instrument - like violin, cello, bass, piano, etc. etc. - where the "sound" is like what one hears in a decent auditorium/concert hall. If that is the reference. Such sources radiate extremely complex sound patterns from various parts of the instrument, all acoustically interfering with each other, altering amplitude and phase response, when they arrive at the mic. The concept of a microphone pickup of any such instrument being "In phase" is hard to understand. It is what it is, and neither amplitude nor phase is "standardized" for any musical instrument, before we even consider reflections from nearby surfaces at the point of capture - like a floor. There is a room at the beginning of the recording process too.

Care to elaborate?
I asked the same question earlier in the thread, but got no response.

I am getting the impression that some members here have committed themselves to some in-principle ideas, maybe even put a lot of expense or effort into developing their hifi around such ideas, and are then resistant to evidence that the ideas are not perceptibly beneficial. For example:-
...I want that acoustic energy to exactly match the electrical signal given to the speaker.
Whatever the signal is...phase corrupted, phase perfect, whatever any of those nebulous terms even mean.....it doesn't matter one iota imo.
A truly technically excellent speaker will match signal completely....mag and phase, impulse et al.

The only arguments against such technical excellence that I see, are the time -is we can't hear phase, small-rooms mask everything anyway, source material is screwed up to begin with....yada yada and more yada.

Well, I say who knows...maybe the reason for all that 'yada why bother' is that we never had the ability to make such phase and time aligned technically excellent speakers before. Multi-way active DSP has truly changed what speakers are capable of.
I say, why not get on board with the idea we can at least make one component in the whole "why bother/can't hear" circle of confusion quagmire ....at least make the speakers.... NOT be part of the problem.

Technical excellence means both mag and phase matter. To the degree speaker phase eventually proves to be audible/inaudible remains to be seen.
Anyone saying it's already proven, is operating under a very subjective state of mind, imnsho.
This argument is exactly the same as the argument that super-high sampling rates and bit depths eg 32/768 are an extremely good idea because..."getting the input and output signals to exactly match" is "technical excellence" and hence 24/48 will never do. And any arguments that 24/48 is completely transparent to the ear will be dismissed as "yada yada and more yada".

If we have some evidence and it suggests that phase and time coherence is not beneficial beyond certain levels (eg where it audibly affects frequency response), then if you want to dispute it, do it with better evidence. Not just arguing and denial.

Which is not to say that I am unsympathetic to pursuing technical excellence for its own sake. I totally get that. Of course, it leads to selling your 105 dB SINAD DAC to get one with 120 dB, then selling that to get next-gen with 135 dB. Putting 00 gauge wire in your interconnects. Lining your listening room with two inches of lead. What people do in the privacy of their living rooms is their business and no criticism will be heard from my direction.

But what is not helpful is to come into these discussions insisting that these measures are important in any sonic way. Denying the existing science in favour of what hasn't been discovered "yet". Arguing that you are unconvinced because science can't prove anything. Implying that people who quote the best available science are sycophants or misquoting or vastly overstating things. All that gameplay.

I implore that we all explore the audio science of sound reproduction with curiosity and integrity, namely: being open to changing our minds when the available evidence contradicts our prior beliefs; being honest with ourselves when we are feeling resistant to new incoming evidence that challenges our presuppositions; disputing evidence with better evidence instead of just arguments; and adopting the current working hypotheses of those researchers in the field with the broadest oversight, at least until contradictory evidence gathers enough weight to change the consensus working hypotheses.

cheers
 
I said transducers, not combined transducers and enclosures. Above low bass frequencies enclosures are well-damped and sealed - simple acoustical systems with a single low frequency resonance that, if possible, is designed to be below the used bandwidth. Woofers and subwoofers are the issue.
Sealed woofers are simple single resonance systems. Bass reflex adds one superimposed Helmholtz resonance and the combination amplitude and time performance is predictable within limits, as discussed below. More elaborate systems add more coupled Helmholtz resonances.

We are on the same page here. Thanks for confirming that conventional loudspeaker concepts with multiple resonators in the lower bass region are not minimum-phase. I agree that this includes the most commonly used vented designs as well, as we have two resonators and a resulting plateau of group delay. Whether this is audible, or whether decay issues/coupling with resonators in the room are the defining aspect of how we perceive lower bass impulses, is a question left open.
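To put rough numbers on the group delay of one versus two low-frequency resonators, here is a sketch that assumes textbook Butterworth high-pass alignments: second order for a sealed box and fourth order for a vented box, both with a hypothetical 40 Hz corner. Real speakers rarely follow these alignments exactly, so the figures are only indicative. For a high-pass of the form s^n / D(s), the group delay equals the real part of D'(s)/D(s) at s = jω, because the s^n numerator contributes a constant phase.

```python
import math

# Normalised Butterworth denominators, highest power of s first (corner at 1 rad/s).
SEALED_B2 = [1.0, math.sqrt(2), 1.0]                      # assumed sealed alignment
VENTED_B4 = [1.0, 2.6131259, 3.4142136, 2.6131259, 1.0]   # assumed vented alignment

def polyval(coeffs, s):
    """Evaluate a polynomial at s using Horner's scheme."""
    result = 0j
    for c in coeffs:
        result = result * s + c
    return result

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def group_delay_ms(denominator, f, f_corner):
    """Group delay in ms of the high-pass s^n / D(s) at frequency f."""
    s = 1j * f / f_corner
    normalised = (polyval(derivative(denominator), s) / polyval(denominator, s)).real
    return 1000 * normalised / (2 * math.pi * f_corner)

F_CORNER = 40.0  # hypothetical corner frequency in Hz
for f in (20.0, 40.0, 80.0, 160.0):
    print(f"{f:5.0f} Hz  sealed B2 {group_delay_ms(SEALED_B2, f, F_CORNER):5.2f} ms"
          f"  vented B4 {group_delay_ms(VENTED_B4, f, F_CORNER):5.2f} ms")
```

Under these assumptions the fourth-order alignment shows more group delay around the corner than the second-order one. Whether that difference exceeds any audibility threshold is the open question raised in the post.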

Methinks your condition that woofers should ideally be in a sealed enclosure with fs below the used bandwidth is not the most common solution. Rather hard to find in reasonably-sized loudspeakers.

At sound levels sufficiently high to generate turbulence the effective mass is reduced and the result is a different system tuning for every sound level, and this is not equalizable. The resonance frequency varies with sound level, but one supposes, and experience shows, that over a range of useful sound levels these are not audible problems.

Can confirm the existence and scale of this problem. It is usually easily visible when increasing the SPL in steps of +5 or +10dB and watching the true Helmholtz resonance frequency shift (which I prefer to identify in the nearfield measurement of the active woofer, i.e. the frequency at which its diaphragm comes to a standstill). Some compact sealed designs are also prone to this shift, probably due to overly high compression. An indirect way to identify it is just looking at the dynamic FR:

Revel F35_Compression.png


This Revel model is a floorstander afaik. We see compression occurring in the 80-150Hz region as well as dynamic expansion around 50-60Hz. The latter is indicative of an increase in port fs with rising SPL. Imagine a kick drum sound with approx 50Hz fundamental and 100Hz first harmonic. The spectrum of its transient sound will shift by a relative 1.5dB.
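The direction of that shift follows from treating the port as a mass of air on the spring of the enclosed volume. The sketch below uses the standard Helmholtz resonator formula; the box volume, port size and the 10 % loss of effective air mass are made-up illustration values, not data for the Revel or any other speaker.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room temperature

def helmholtz_hz(box_volume_m3, port_area_m2, effective_length_m):
    """Helmholtz resonance of a ported box; the air mass scales with effective length."""
    return SPEED_OF_SOUND / (2 * math.pi) * math.sqrt(
        port_area_m2 / (box_volume_m3 * effective_length_m)
    )

# Hypothetical enclosure: 30 litres, 6 cm diameter port, 15 cm effective length.
volume = 0.030
area = math.pi * 0.03 ** 2
length = 0.15

low_level = helmholtz_hz(volume, area, length)
# Assumption: turbulence at high level removes 10 % of the effective air mass.
high_level = helmholtz_hz(volume, area, 0.9 * length)

print(f"tuning at low level : {low_level:.1f} Hz")
print(f"tuning at high level: {high_level:.1f} Hz ({100 * (high_level / low_level - 1):.1f} % higher)")
```

Because the tuning scales with one over the square root of the moving mass, any loss of effective mass raises the port resonance, which matches the direction reported in the measurement above.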

Does not sound like a lot, but for synthesized kick drum sounds, a significant drift in resonance frequency, and hence in lower cutoff frequency and group delay, depending on the actual SPL can become an audible problem as it comes with other implications. I cannot deliver solid evidence that this is the single reason, as compression and port noise occur simultaneously, but most of the speakers I tried that show this behavior also ´don't really kick´ in the lowest bass region. EDM sounds are either boomy or hollow, if that makes sense.

Sealed vs. reflex arguments tend to arise with bookshelf loudspeakers, where the resonance is within the musical frequency range. But even these cease to be problems if crossed over (high pass and low pass) to a subwoofer, where, with good design the resonance can be so low that it is in the feeling rather than hearing frequency range.

Would not consider this to be a really common solution in stereo reproduction. A lot of floorstanders show a similar behavior, see example above.

The advantage of passive radiators is that they are truly moving masses, and any non-linearity is in the suspension of the diaphragm. Even so, at very high sound levels it will become non-linear. These are good alternatives, but cost more than ports.

Fully agree from a technical standpoint, although we should note that the more compact designs have become incredibly cheap, see Flip Essential 2. If they are competently integrated, they can deliver astonishingly accurate kick drum sounds right up to their SPL capability, which with the best specimens is pretty much defined like a limiter when reaching the mechanical xmax.

Again, it is a matter of the whole system driver + passive radiator + enclosure. Have come across combinations which produce hefty booming. Try Marshall Middleton I, for example.

I think it is reasonable to assume that one can expect significant non-ideal acoustical output from small multi-reflex systems that are designed to maximize acoustical output over a narrow bandwidth with steeply declining output at the high and low frequency extremes - meaning transient misbehaviour. But, let's face it, the intended audience for those products is assumed not to be a critical one. Right?

I agree to the assumption of sound quality, but I am not sure people who bought such things in the past would agree. They were pretty expensive, sold with a lot of claims about best sound quality through research. The subwoofer is as tall and deep as astonishingly broad bandwidth (I think this was the whole point, to have a woofer with a pair of simple 5" drivers working from 40 to 250Hz with maximum efficiency). Pretty interesting cascading design, they even named the air chambers ´spring A/B/C´ and ´summing chamber´:

bose-acoustimass-10.jpeg
 
However I usually get tripped up on step-response and transducers playing 180 degrees out of phase.
When the output of the speaker inverts the signals in frequency bands corresponding to the driver’s bands, then in a mathematical sense… if we difference the input (signal) with output (SPL) it seems like fidelity has been lost.

So - what do you read from isolated time domain responses ? ;)

STEP1.png STEP2.png
 
So - what do you read from isolated time domain responses ? ;)

This “Step-1” to me looks like the tweeter is 0.3- 0.4 seconds ahead of the MR. So ~4” ahead, and not time coherent if it is a 2-way.
Or if it is a 3-way then the MR must be inverted… however I would expect it to be deeper if it was a 3-way.


So - what do you read from isolated time domain responses ? ;)

The “Step-2” has the tweeter in phase, but the MR is 180 out, and then the woofer is back to being “in phase”… I am assuming that this is a 3 way.
 
This “Step-1” to me looks like the tweeter is 0.3- 0.4 seconds ahead of the MR. So ~4” ahead, and not time coherent if it is a 2-way.

My reading is rather 0.3ms, not seconds ;-)
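A quick check of the units, assuming a speed of sound of about 343 m/s: a 0.3 ms offset corresponds to roughly the 4 inches estimated earlier, whereas 0.3 seconds would correspond to a path difference of about 100 metres.

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed room temperature
METRES_PER_INCH = 0.0254

def path_difference_inches(delay_s):
    """Acoustic path length, in inches, travelled by sound during delay_s seconds."""
    return delay_s * SPEED_OF_SOUND / METRES_PER_INCH

for label, delay in (("0.3 ms", 0.3e-3), ("0.4 ms", 0.4e-3), ("0.3 s", 0.3)):
    inches = path_difference_inches(delay)
    print(f"{label:>6}: {inches:9.2f} in ({inches * METRES_PER_INCH:7.3f} m)")
```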

A runtime difference of 0.3ms might look ugly on the step response graph, as does the inverted midrange, but I see no evidence that these are audible if it is just phase shift or group delay between midrange and tweeter (identical for all channels). On the other hand, a potential delay of the lowest bass frequencies caused by a steep subwoofer lowpass or port resonator might cause audible group delay but is difficult to identify on such a graph spanning just 10ms.

Step response is IMHO rather useless for direct predictions on audible aspects.
 
This “Step-1” to me looks like the tweeter is 0.3- 0.4 seconds ahead of the MR. So ~4” ahead, and not time coherent if it is a 2-way.
Or if it is a 3-way then the MR must be inverted… however I would expect it to be deeper if it was a 3-way.

The “Step-2” has the tweeter in phase, but the MR is 180 out, and then the woofer is back to being “in phase”… I am assuming that this is a 3 way.
That is not much information from the step responses.
And I assume @pma was more asking along the line of which speaker has “kicking bass" and which has not or other results in respect to SQ and preference. If phase/step response is so important that should be easy.

My guess is the step responses are from similar speakers (same woofer in same enclosure obviously). There is an additional 180 degree phase rotation in the step1. The woofer is inverted and the crossover has higher order for step1 (LR2 -> LR4?).
All this doesn’t tell much about sound or quality or preference unless one performs an FFT to get a magnitude FR ;-)
 
Step response is IMHO rather useless for direct predictions on audible aspects.

It is more relevant for rooms, subwoofer integration, and DSP ... and less relevant for speakers. A quasi-anechoic step response does not tell you much about a speaker unless it is truly horrible. I think of the step response in two parts - the head and the tail. The "head" (or the main impulse) should look nice and tight with no pre-ringing. But even if it isn't, it doesn't really matter unless the time alignment is really far off. The tail should be nice and short, otherwise it means there is a lot of post-ringing. These aren't real concerns in most typical speakers, but if you DSP, then any side effects from excessive correction in the frequency domain appear in the time domain, in the form of pre-ringing or post-ringing. So it is very important to check the step response.
 
I asked the same question earlier in the thread, but got no response.
Sorry I didn't answer you sooner. Are you aware of all the studies conducted by these disciplines, including physics, mathematics, engineering, and neurophysiology, and have you created speakers that respect or utilize this knowledge? Have you ever listened to speakers designed and built for time and phase coherence in controlled systems? Just answer yes/no to these two questions.
Thanks
 
This Revel model is a floorstander afaik. We see compression occurring in the 80-150Hz region as well as dynamic expansion around 50-60Hz. The latter is indicative of an increase in port fs with rising SPL. Imagine a kick drum sound with approx 50Hz fundamental and 100Hz first harmonic. The spectrum of its transient sound will shift by a relative 1.5dB.
This seems like reasonable performance from a pair of 5 1/4" woofers in Revel's entry level floorstander, the F35. If one wants serious high level kick drum sound this little speaker will be stressed, but if high passed and supplemented by a subwoofer, things would be more satisfactory. The port would not be energized, and the system can play louder. Otherwise, according to Amir's and Erin's data, this is a respectable speaker at the price. As always there are choices, and knowledge and data help make them.
 
So - what do you read from isolated time domain responses ? ;)

I would guess that the first is a 3 way with 1st order or no xover between low and mid drivers ( mid hi pass from a separate enclosure ) all drivers are in phase and I don't know about the tweeter slope.
The second looks to have limited hi freq. output so probably a single driver.
 