OMG I was totally doing it wrong in Audacity lol. I get the reason for a delay with convolution during real-time playback (it needs to work that many samples into the "future" so that it can process the entire length of the desired impulse response), but it seemed strange that the delay was showing up in my processed audio signal, because the impulse should get "baked in" to the whole audio stream. I tried another free convolution plugin and the results were the same with the same settings (after accounting for the second plugin applying -3dB automatically).
I eventually twigged that it was because the plugin had a "mix" setting, described as "wet" or "dry", that defaulted to 0.5 (50%). I tested 0.0, which looked like the "difference" of the impulse, and 1.0, which was the source stream unaffected, and adding them together gave the same result as processing at the default 0.5. I figured the default 0.5 must therefore be correct. But actually it's just mixing the source stream with the processed stream at a given ratio, and for what I am doing I want to see only the processed stream. So to sum the problem up, mix=0 isn't the "difference", it's the "result". Doh!
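For anyone who wants to check the mix logic, here is a rough Python sketch of what I think the control does. It assumes a plain linear crossfade where mix=1.0 is source only and mix=0.0 is processed only (which is how this plugin behaved for me; others may define it the other way round or apply their own gain law). The function names and the toy signal and IR are made up for illustration.

```python
def convolve(signal, ir):
    """Direct-form (full) convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out


def apply_mix(signal, ir, mix):
    """Blend the source and the convolved stream.

    Assumption: mix=1.0 returns the source only and mix=0.0 the processed
    stream only, as observed with the plugin described above.
    """
    wet = convolve(signal, ir)
    # Pad the source with silence so both streams have the same length.
    dry = list(signal) + [0.0] * (len(wet) - len(signal))
    return [mix * d + (1.0 - mix) * w for d, w in zip(dry, wet)]


source = [1.0, 0.0, 0.0, 0.0]  # toy source: a single click
ir = [0.0, 0.0, 0.5]           # toy IR: half-level copy, two samples late

processed_only = apply_mix(source, ir, 0.0)  # what I actually wanted
source_only = apply_mix(source, ir, 1.0)     # untouched source
half = apply_mix(source, ir, 0.5)            # the plugin default
```

With this crossfade the 0.5 output is just the average of the two extremes, so it contains the source click as well as the delayed copy, which is why the default setting was misleading for inspecting the processed stream on its own.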
Now it's all come together, because I was also mucking around with the "IR Window" settings in REW, which can (and seemingly should) be applied to the exported IR. By default REW seems to have windows of Left = 100ms and Right = 500ms, but without ticking "Apply IR Window before Export" it exports a wav with Left ~1 second and Right ~4 seconds, hence the huge "delay". So I tried a few different settings (left, right in ms): a token amount left and sufficient right (10, 120), no left and sufficient right (0, 100), and 1 cycle left and 4 cycles right (at 33Hz that's 30.3, 121.2). Here's the result of those convolutions:
All responses look the same from the moment of the "Dirac pulse" as OCA calls it (actually it just seems to be called an impulse). I think it makes sense to have the impulse at 0, because the impulse contains all frequencies and we want all frequencies to be output immediately (because effectively the impulse is the original source). The right window needs to be long enough to contain the delayed low pass filter, and looking at the Impulse graph with the dBFS scale it is pretty obvious where the LPF has blended back into the noise, so the right window needs to extend a bit beyond that.
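The window numbers mentioned earlier are easy to reproduce. Below is a small Python sketch that converts cycles to milliseconds and then trims an IR around its peak. Note the assumptions: `cycles_to_ms` and `window_ir` are names I made up, the toy IR is invented, and the trim is a plain rectangular cut, whereas REW applies a tapered window, so this only illustrates where the left and right edges fall.

```python
def cycles_to_ms(cycles, freq_hz):
    """Duration of a number of cycles at freq_hz, in milliseconds."""
    return 1000.0 * cycles / freq_hz


def window_ir(ir, sample_rate, left_ms, right_ms):
    """Keep left_ms before and right_ms after the largest-magnitude sample.

    Simplification: a hard cut, not the tapered window REW would apply.
    """
    peak = max(range(len(ir)), key=lambda i: abs(ir[i]))
    left = int(round(left_ms * sample_rate / 1000.0))
    right = int(round(right_ms * sample_rate / 1000.0))
    start = max(0, peak - left)
    stop = min(len(ir), peak + right + 1)
    return ir[start:stop]


left_ms = cycles_to_ms(1, 33.0)   # 1 cycle at 33 Hz, about 30.3 ms
right_ms = cycles_to_ms(4, 33.0)  # 4 cycles at 33 Hz, about 121.2 ms

# Toy IR at a 1 kHz sample rate: 1 s of silence, the impulse, then a 4 s tail.
toy_ir = [0.0] * 1000 + [1.0] + [0.1] * 4000
trimmed = window_ir(toy_ir, 1000, 0, 100)  # the "no left, 100 ms right" case
```

With a left window of 0 the trimmed IR starts on the impulse itself, so the leading silence (and with it the extra delay in the convolved output) is gone.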
So in that convolved response (of a continuous 33Hz 0dBFS tone), the initial peak is -4dBFS, which seemingly coincides with the +4dB offset that OCA chose in his video. The repeating peaks are about -8.5dBFS. When I try to rationalise what I'm seeing, it looks like it starts off with a single cycle at "high" volume, and then settles into a phase-shifted signal at "low" volume. But even the "high" volume cycle is reduced relative to the source. I guess the initial "high" cycle is there to build up the energy in the room quickly, but not so fast that it becomes "boomy". Then the phase shift kind of tempers the energy build-up from the initial cycle, and then it settles into a reduced volume that possibly sounds as loud as it is meant to once the room modes are at full power.
So apart from the initial 1.5 cycles, isn't the signal basically the same as a -8.5dB PEQ? Because actually this VBA impulse response does not seem to do anything to tame room decay (beyond what a PEQ could do), unlike an actual DBA that outputs a cancellation signal. The first cycle is the difference from a PEQ, and maybe it is a good thing?! I reckon the louder first cycle would help give the impression of "fast bass", because the energy in the room would build up quicker than it would with a PEQ-only method.
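One way to sanity-check the "settles into a fixed gain" idea: for any fixed IR, once a steady tone has been playing for longer than the IR, the output is the same tone scaled by the magnitude of the IR's frequency response at that frequency (plus a phase shift). The Python sketch below shows this with a made-up two-tap IR (a direct hit plus a delayed, inverted copy). The tap values, the 3300 Hz sample rate and the function names are all assumptions chosen to keep the numbers tidy; this is not the actual VBA IR, and it only says something about the one test frequency, not the shape of a PEQ across the band.

```python
import cmath
import math


def gain_db_at(ir, freq_hz, sample_rate):
    """Steady-state gain of an FIR impulse response at one frequency, in dB."""
    w = 2.0 * math.pi * freq_hz / sample_rate
    h = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(ir))
    return 20.0 * math.log10(abs(h))


def convolve_tone(ir, freq_hz, sample_rate, n_samples):
    """Convolve a 0 dBFS sine that starts at sample 0 with the IR."""
    tone = [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]
    out = []
    for n in range(n_samples):
        acc = 0.0
        for k, c in enumerate(ir):
            if c != 0.0 and n - k >= 0:
                acc += c * tone[n - k]
        out.append(acc)
    return out


def peak_db(samples):
    """Largest absolute sample value, in dB relative to full scale."""
    return 20.0 * math.log10(max(abs(s) for s in samples))


SAMPLE_RATE = 3300  # hypothetical: makes one 33 Hz cycle exactly 100 samples
toy_ir = [0.0] * 101
toy_ir[0] = 0.6      # direct tap (made-up value)
toy_ir[100] = -0.25  # inverted copy, one full cycle late (made-up value)

out = convolve_tone(toy_ir, 33.0, SAMPLE_RATE, 400)
first_cycle_db = peak_db(out[:100])   # about -4.4 dB: only the direct tap so far
settled_db = peak_db(out[300:400])    # about -9.1 dB: both taps contributing
predicted_db = gain_db_at(toy_ir, 33.0, SAMPLE_RATE)
```

In this toy case the settled peak matches the predicted single-frequency gain, and the louder first cycle is the stretch before the delayed tap kicks in, so the transient is where a convolved response like this differs from a plain gain cut at that frequency.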
Whilst I was writing this post I realised there were some similarities between the manual "delay and invert" I started with, and the convolved response. I have highlighted the similar points in the following image.
- 1 and 2 are the "high" peaks of the first cycle.
- 3 is the phase change point.
- 4 is the first peak of the "low" signal, but it may not be the same as the following peaks due to its proximity to the "transition" zone
- 5 and 6 are when the "low" signal has settled in.
Tomorrow I will have a go at convolving different frequencies and tone pulses with the same IR, like I said I would in a previous post.