
Step Response: Does It Really Matter?

pma

Major Contributor
Joined
Feb 23, 2019
Messages
4,616
Likes
10,803
Location
Prague
This has nothing to do with the time resolution of PCM or subsample delays. It may describe (if the detail is there) how to calculate acoustic impedance.

I am not arguing about anything; it was my first post in this thread. (OK, you edited the post.) I was just interested in whether you knew the author, since you mentioned the acoustical impedance measurement. The world is not limited to PCM subsample delays.

https://www.aes.org/e-lib/browse.cfm?elib=6008
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,792
Location
My kitchen or my listening room.
I am not arguing about anything; it was my first post in this thread. (OK, you edited the post.) I was just interested in whether you knew the author, since you mentioned the acoustical impedance measurement. The world is not limited to PCM subsample delays.


Sorry, I confused posters there for a minute.

I do not know that work (well, I can't read the actual paper), but I do know what I suspect are very similar results described in English (sorry, I'm your typical monolingual American).
 

pma

Major Contributor
Joined
Feb 23, 2019
Messages
4,616
Likes
10,803
Location
Prague
Sorry, I confused posters there for a minute.

I do not know that work (well, I can't read the actual paper), but I do know what I suspect are very similar results described in English (sorry, I'm your typical monolingual American).

That's fine :). Now back to interchannel subsample delays, to stay on-topic. Yes, the resolution is great and "almost" unlimited. I just made a quick measurement with one channel delayed against the other by 20 metres of link cable (in fact the propagation delay plus the R(out)*C(cable) time constant, about 1 µs in total). I did similar experiments years ago, so I am not surprised; from the analog view of the signals there is not much difference. (Soundcard output, simultaneous sampling, Fs = 48 kHz, one channel recorded back to the soundcard through a 1 m loop, the second through 20 m. The recorded file was captured by scope from the soundcard analog output; the blue channel corresponds to 1 m, the red to 20 m.)

[Image: 20mcable_delay.png - scope capture of the 1 m (blue) vs 20 m (red) channels]
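A numerical aside (this is not pma's scope-based method, just an illustrative sketch with assumed names and sizes): once the signal is sampled, a delay far below one sample period falls straight out of the slope of the cross-spectrum phase. The 48 kHz rate and 1 µs delay mirror the experiment above.

```python
import numpy as np

def delay_by_phase_slope(a, b, fs):
    """Estimate the delay of b relative to a, in seconds, from the
    slope of the cross-spectrum phase (assumes b is simply a delayed
    copy of a, with the delay well below the record length)."""
    S = np.fft.rfft(a) * np.conj(np.fft.rfft(b))
    f = np.fft.rfftfreq(len(a), d=1.0 / fs)
    phase = np.unwrap(np.angle(S))
    f, phase = f[1:-1], phase[1:-1]          # skip the DC and Nyquist bins
    # least-squares line through the origin: phase ~= 2*pi*f*delay
    return np.sum(f * phase) / (2 * np.pi * np.sum(f * f))

fs = 48_000
rng = np.random.default_rng(1)
a = rng.standard_normal(4096)
freqs = np.fft.rfftfreq(4096, d=1.0 / fs)
# simulate a 1 us interchannel delay (about 1/21 of a sample at 48 kHz)
b = np.fft.irfft(np.fft.rfft(a) * np.exp(-2j * np.pi * freqs * 1e-6))
print(delay_by_phase_slope(a, b, fs))        # very close to 1e-06
```

The estimate lands within a tiny fraction of the true 1 µs: interchannel time resolution of sampled audio is set by bandwidth and SNR, not by the sample period.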
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,799
Likes
6,263
Location
Berlin, Germany
@pkane's DeltaWave does a good job with subsample shifts. Shifting a file first by 0.333 samples and then a second time by 0.667 gives a 1.000-sample total shift. After rotating left by one sample, the difference against the original file is essentially an infinite null, except for some Nyquist (fs/2) ripple, with peaks every 512 samples (the FFT block size used for shifting, it seems).
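For anyone wanting to reproduce this without DeltaWave (whose internals are not shown here), a plain FFT phase-ramp shifter behaves the same way; names and sizes below are illustrative. It even reproduces the fs/2 ripple: a real signal has no independent phase at the Nyquist bin, so a fractional phase ramp necessarily corrupts that one bin.

```python
import numpy as np

def fft_shift(x, frac):
    """Circularly shift x by `frac` samples (fractional allowed)
    using a linear phase ramp in the frequency domain."""
    f = np.fft.rfftfreq(len(x))                    # 0 .. 0.5 cycles/sample
    return np.fft.irfft(np.fft.rfft(x) * np.exp(-2j * np.pi * f * frac),
                        n=len(x))

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
y = fft_shift(fft_shift(x, 0.333), 0.667)          # 0.333 + 0.667 = 1.000
resid = np.roll(y, -1) - x                         # rotate left, difference
# the null is extremely deep; what survives flips sign every sample,
# i.e. it sits exactly at fs/2
print(np.max(np.abs(resid)))
```

The surviving residual comes entirely from the mangled Nyquist bin, which is one plausible source of the ripple observed above.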
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,792
Location
My kitchen or my listening room.
@pkane's DeltaWave does a good job with subsample shifts. Shifting a file first by 0.333 samples and then a second time by 0.667 gives a 1.000-sample total shift. After rotating left by one sample, the difference against the original file is essentially an infinite null, except for some Nyquist (fs/2) ripple, with peaks every 512 samples (the FFT block size used for shifting, it seems).

There are a lot of ways to do subsample or variable-length resampling. DC and Nyquist present issues for real signals when using FFT methods. Proakis' method of oversampling and filtering in the oversampled domain is pretty much bulletproof, and gives you approximately an SNR of n^4, where n is the oversampling rate. 64 is almost bulletproof, and 128 pretty much exceeds any sort of real signal. At that rate, you can simply use linear interpolation between the two nearest samples at 128 times the sampling rate. Of course, only calculate the samples you actually need.
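A sketch of the oversample-and-interpolate approach, using scipy's polyphase resampler as a stand-in for the filtering step (the function name and the 128x default are illustrative). The n^4 figure corresponds to linear-interpolation error amplitude falling as 1/n^2, hence error power as 1/n^4.

```python
import numpy as np
from scipy.signal import resample_poly

def fractional_delay(x, d, ratio=128):
    """Delay x by d samples (fractional) the oversampling way:
    polyphase-filtered upsampling by `ratio`, then linear interpolation
    between the two nearest high-rate samples, evaluated only at the
    output instants actually needed."""
    hi = resample_poly(x, ratio, 1)           # filtered oversampling
    pos = (np.arange(len(x)) - d) * ratio     # high-rate read positions
    i = np.floor(pos).astype(int)
    frac = pos - i
    y = np.zeros_like(x)
    ok = (i >= 0) & (i + 1 < len(hi))         # ignore the edges
    y[ok] = (1 - frac[ok]) * hi[i[ok]] + frac[ok] * hi[i[ok] + 1]
    return y

n = np.arange(4096)
x = np.sin(2 * np.pi * 0.05 * n)              # tone at 0.05 * fs
y = fractional_delay(x, 0.4)                  # delay by 0.4 samples
# away from the edges this tracks the analytically delayed tone closely
```

resample_poly also handles the gain and group-delay compensation of the anti-imaging filter, which a hand-rolled chain must do itself.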
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,799
Likes
6,263
Location
Berlin, Germany
So, in a practical example using Adobe Audition, one would upsample to 64x without a filter (as that would impose an FR change), then calculate the required shift count and implement the remaining fractional shift (in upsampled sample periods) by linear interpolation, then downsample again without filtering?
I just tried it (without the interpolation, just shifting the 64x data by 7 samples) but got odd results: the output is polarity-inverted and reduced in level by some 11 dB.
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,799
Likes
6,263
Location
Berlin, Germany
^ Probably not, but when filtering is applied (at the upsampling and/or the downsampling stage), I get the expected HF drop near fs/2 from the filter, which is not what we want.
At upsampling, Audition applies a filter to remove the images above fs/2 even with no pre/post filter enabled, which seems correct.
I might try SoX and see if I get better results.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,792
Location
My kitchen or my listening room.
So, in a practical example using Adobe Audition, one would upsample to 64x without a filter (as that would impose an FR change), then calculate the required shift count and implement the remaining fractional shift (in upsampled sample periods) by linear interpolation, then downsample again without filtering?
I just tried it (without the interpolation, just shifting the 64x data by 7 samples) but got odd results: the output is polarity-inverted and reduced in level by some 11 dB.

You must always filter. Always. See Proakis' signal processing paper. Of course, if your upsampler filtered, you're set.

Also remember to deal properly with gain when upsampling and downsampling. But there should be no absolute polarity inversion. If you used a mid-frequency sine wave, the amplitude should still be preserved; I'm not sure offhand what's gone wrong there. The process works like a champ, that part I do know. Any "inversion" would be due strictly to the time delay putting you at a phase where the signal is negative, rather than it "inverting" per se. For partial samples of delay this should not be possible; consider the period of the sine wave at fs/2.
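The gain bookkeeping is easy to demonstrate: zero-insertion upsampling spreads the original energy over L times as many samples, so the mean power drops by a factor of L (and the post-filter amplitude by L) until something in the chain scales it back. A toy check, with L = 4 as an arbitrary choice; library resamplers such as scipy's resample_poly apply this scaling for you:

```python
import numpy as np

L = 4
x = np.sin(2 * np.pi * 0.01 * np.arange(256))
up = np.zeros(len(x) * L)
up[::L] = x                  # zero-insertion: every Lth sample is x, rest 0
# mean power of the zero-stuffed signal is 1/L of the original, so the
# ratio below comes out as exactly L; the anti-imaging filter must be
# scaled by L (or applied with gain L) to restore the original amplitude
print(np.mean(x ** 2) / np.mean(up ** 2))    # -> 4.0
```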
 

KSTR

Major Contributor
Joined
Sep 6, 2018
Messages
2,799
Likes
6,263
Location
Berlin, Germany
It turns out that Audition cannot handle extreme factors like 64x, which were not foreseen by the developers, who in general did an excellent job. CoolEdit/Audition was for a long time the only editor that got 99% of things right, and it still doesn't have much competition these days.

The polarity-inversion bug pops up somewhere above 16x (I didn't check at which exact integer multiple of fs it happens). The level reduction only happens when no pre-filtering is applied in the downsampling. As the upsampled data is already filtered, I had assumed that downsampling without filtering would be simple decimation, but for some reason this breaks...

Sidenote closed, now back to topic.
 

ctrl

Major Contributor
Forum Donor
Joined
Jan 24, 2020
Messages
1,633
Likes
6,241
Location
.de, DE, DEU
I think the step response is useful: how long it takes to reach 0 and what % of the maximum it later reaches show whether a speaker is precise and can produce good stereo width or not.
Sorry, but this is not correct either!
Let's consider two ideal wideband drivers (minimum phase, no excess phase) with the following step responses:
A) [step response of driver A] B) [step response of driver B]
For driver A, the voice coil comes to rest after about 2-3 ms.
For driver B, the damped oscillation has still not settled after 4 ms.

The differences are caused by the different frequency ranges of drivers A and B:
A) [frequency response of driver A] B) [frequency response of driver B]

Only if the frequency ranges of the drivers were completely identical could one draw conclusions about their decay behaviour from the step response. This is unlikely to be the case in reality when comparing loudspeakers.

To make a statement about the decay behaviour of a loudspeaker (to get a "value if a speaker is precise"), the measured impulse response is converted into the cumulative spectral decay (with a time-based or period-based axis).
This makes it much easier to judge the decay behaviour of loudspeakers - provided that reflections are suppressed by gating.
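A minimal sketch of that conversion, with assumed analysis choices (slice spacing, FFT size, and a one-sided Hann taper of the tail); real CSD tools differ in these details:

```python
import numpy as np

def csd(ir, fs, n_slices=30, step=1e-4, nfft=4096):
    """Cumulative spectral decay: magnitude spectra of the impulse
    response with its start progressively advanced, so each slice
    shows what is still ringing after that much time."""
    slices = []
    for k in range(n_slices):
        start = int(round(k * step * fs))
        seg = ir[start:start + nfft].copy()
        seg *= np.hanning(2 * len(seg))[len(seg):]   # taper the far end
        slices.append(20 * np.log10(np.abs(np.fft.rfft(seg, nfft)) + 1e-12))
    return np.array(slices)                          # (n_slices, nfft//2 + 1)

# synthetic driver: a resonance at 2 kHz decaying with a 1 ms time constant
fs = 48_000
t = np.arange(8192) / fs
ir = np.exp(-t / 1e-3) * np.sin(2 * np.pi * 2000 * t)
waterfall = csd(ir, fs)    # the resonance ridge sinks from slice to slice
```

Plotting the rows of `waterfall` against frequency gives the familiar waterfall plot; with a period-based rather than time-based slice axis, low-frequency decay is judged over oscillation periods instead of milliseconds, as mentioned above.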


The same applies to evaluating multi-way loudspeakers on the basis of the step response.
Let's consider two ideal two-way loudspeakers with identical filter order, but one with a crossover frequency of 5 kHz (C) and one of 1.5 kHz (D):
C) [step response of speaker C] D) [step response of speaker D]
I don't know what the sources for your conclusions were, but if someone wants to tell you in the future that speaker C is more precise and can reproduce transients better because, for example, its woofer "reaches the maximum" faster in the step response, you know that this is nonsense.
Due to the lower crossover frequency of speaker D, the oscillation periods of its woofer ("starting" at 1.5 kHz) are longer, hence the slower rise and the "later" maximum.


...produce good stereo width or not
On the subject of stereo phantom image, width, elevation, spaciousness, envelopment, ... you will find much useful information in Floyd Toole's book "Sound Reproduction" ... and no, you cannot infer the stereo image width established by loudspeakers from the step response.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,726
Likes
10,424
Location
North-East
There are a lot of ways to do subsample or variable-length resampling. DC and Nyquist present issues for real signals when using FFT methods. Proakis' method of oversampling and filtering in the oversampled domain is pretty much bulletproof, and gives you approximately an SNR of n^4, where n is the oversampling rate. 64 is almost bulletproof, and 128 pretty much exceeds any sort of real signal. At that rate, you can simply use linear interpolation between the two nearest samples at 128 times the sampling rate. Of course, only calculate the samples you actually need.

DeltaWave uses the FFT method.
 

FeddyLost

Addicted to Fun and Learning
Joined
May 24, 2020
Messages
752
Likes
543
I would not like to be a prophet of phase linearity, but I'd say that for decent active monitors or any high-level digital speakers it is currently a must. It can be done at almost no additional cost, so why not?
For passive consumer speakers it is a great quality, but only after all other properties are as good as they get. The classic approach is too "expensive" for the mass market and not very valuable in the typical use case.
For cost-no-object high-end speakers it might be beneficial in direct comparison, but again only after all other qualities. It will not overcome bad FR or high distortion.
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,904
Likes
16,937
I would not like to be a prophet of phase linearity, but I'd say that for decent active monitors or any high-level digital speakers it is currently a must. It can be done at almost no additional cost, so why not?
Linear phase including the bass means high latency, which is often not acceptable for monitoring or even for enjoying combined video and audio.
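To put a number on that latency: a linear-phase FIR delays everything by half its length, and shaping phase down into the bass needs a long filter. The sizing rule below (about two cycles of the lowest controlled frequency) is an illustrative assumption, not a standard:

```python
fs = 48_000
# a linear-phase FIR has a flat group delay of (n_taps - 1) / 2 samples
n_taps = 2 * fs // 20 + 1                # span ~2 cycles of 20 Hz: 4801 taps
latency_ms = (n_taps - 1) / 2 / fs * 1e3
print(latency_ms)                        # ~50 ms of pure delay
```

Fifty milliseconds is far beyond lip-sync tolerance and hopeless for live monitoring, which is the cost referred to above; restricting linearization to, say, 300 Hz and up shrinks the filter, and hence the delay, by over an order of magnitude.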
 

ernestcarl

Major Contributor
Joined
Sep 4, 2019
Messages
3,113
Likes
2,330
Location
Canada
Linear phase including the bass means high latency, which is often not acceptable for monitoring or even for enjoying combined video and audio.

The time cost of linearizing the bass is not worth it in many applications, but "300-ish" or 500 Hz to 20 kHz can be quite acceptable (2.5 - 20 ms). Offsetting the tweeter back is another way to reduce the time delay -- for example, the Presonus Sceptre S8 and Meyer Sound Amie horn designs have their tweeters positioned much further back than in many other monitors -- I don't know how much delay that practically saves, but it surely saves some extra DSP processing -- maybe.

I've been experimenting with different convolution settings to try to reduce this latency... and I found multichannel convolution significantly more processor-intensive as you add more channels -- where phase "correction" is arguably more important/useful for system "optimization", because the multiple speakers being summed may have different phase curves/profiles.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,282
Likes
4,792
Location
My kitchen or my listening room.
Alternatively you can simply compensate each driver, and the crossover as well, digitally, using DSP and it all works like a charm.
 

FeddyLost

Addicted to Fun and Learning
Joined
May 24, 2020
Messages
752
Likes
543
Linear phase including the bass means high latency, which is often not acceptable for monitoring or even for enjoying combined video and audio.
A low-latency mode like in the Kii for live monitoring (exact phase is not so important there), lip-sync for the consumer AV case.
I mean it might be one of the available presets, for example.
Or limit phase linearisation to, e.g., the midbass, where pinpoint localisation is at least troublesome for our ear-brain and a phase error will not really matter.

The time cost of linearizing the bass is not worth it in many applications
In a critical application like a mastering studio it will be done as a complete system with subwoofers and delay lines.
When we talk about a single active speaker or even a soundbar, we should not forget the cost-to-benefit analysis.
A typical compact 2-way monitor compromises the bass so much (oh, and the room!) that linear phase in the bass is no more than lipstick on a pig...
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,904
Likes
16,937
A low-latency mode like in the Kii for live monitoring (exact phase is not so important there), lip-sync for the consumer AV case.
I mean it might be one of the available presets, for example.
Yes, as a mode it surely can be done; my reply above explained why it has a cost/disadvantage when fully applied.
Or limit phase linearisation to, e.g., the midbass, where pinpoint localisation is at least troublesome for our ear-brain and a phase error will not really matter.
On the other hand, unfortunately, the biggest audible differences for non-pathological constructions are in the bass region, where group delay can exceed known psychoacoustic limits.
 

thewas

Master Contributor
Forum Donor
Joined
Jan 15, 2020
Messages
6,904
Likes
16,937
Alternatively you can simply compensate each driver, and the crossover as well, digitally, using DSP and it all works like a charm.
Drivers are still better placed geometrically so that their sound-generation centres have no significant distance differences, since correcting the differences with a DSP delay works only for one angle and compromises the generated radiation pattern and the reflected sound.
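The one-angle limitation is simple far-field geometry. A sketch with an assumed 30 mm depth offset between the drivers' acoustic centres:

```python
import numpy as np

c = 343.0                    # speed of sound, m/s
offset = 0.03                # tweeter 30 mm ahead of the woofer (assumed)
dsp_delay = offset / c       # the delay that aligns the drivers on-axis
for angle in (0, 15, 30, 45):
    # in the far field the effective depth difference shrinks with the
    # cosine of the off-axis angle, so a fixed delay over-corrects there
    residual_us = (dsp_delay - offset * np.cos(np.radians(angle)) / c) * 1e6
    print(f"{angle:2d} deg: residual {residual_us:5.1f} us")
```

On this model the alignment that is exact on-axis is about 12 µs off at 30 degrees, roughly 8 degrees of phase at a 2 kHz crossover, and it grows from there, which is why the physical placement matters.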
 