A deep dive presentation on the fundamentals of "proper" Digital Room Correction (DRC). Includes hands-on DSP FIR Filter Designer demos using Acourate and Audiolense.
Having participated in many audio forum discussions, watched online videos on Digital Room Correction (DRC), and reviewed over a dozen DRC products over the past 11 years, I have come to two conclusions. One is that there is considerable misunderstanding about DRC: how it works and even what problems it is trying to solve. The other, just as important, is that what is possible with state-of-the-art (SOTA) DRC is not well understood. I hope you find the content educational and practical.
First of all, thank you for taking the time to make this video and for explaining your process with Acourate and other software, which you, as I understand it, offer as a commercially available service. There is no longer any excuse for people not to know the fundamentals.
As much as I tend to agree with JJ's "laws", there has been no formal verification, i.e. no scientific studies. Such a study would need to cover many different configurations: small rooms, large rooms, different reverberation times, different ratios of direct to reflected sound, reflection angles, timing and spectrum, stereo, multichannel, etc.
Regarding your approach, some remarks:
Latency with ultra-long FIR filters: The signal has to pass through the whole filter before sound comes out at the end. This rules out such filters for any application that requires (near) real-time processing, like gaming or really any video streaming. One could build a video buffer to sync audio and video, but at this point such a solution does NOT exist in the consumer AV space (and the gaming problem remains).
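To put rough numbers on the latency point, here is a minimal sketch. The tap count and sample rate are illustrative assumptions, not values taken from the video, and the function name is made up for this example. It reports the full time span of the filter and, for the linear-phase (symmetric) case, the group delay, which is half of that span.

```python
def fir_delays_ms(num_taps: int, sample_rate_hz: float) -> tuple[float, float]:
    """Return (full filter span, linear-phase group delay) in milliseconds."""
    span_ms = 1000.0 * (num_taps - 1) / sample_rate_hz
    # A symmetric (linear-phase) FIR delays every frequency by half its span.
    group_delay_ms = span_ms / 2.0
    return span_ms, group_delay_ms


# Assumed example values: 65536 taps at 48 kHz.
span, delay = fir_delays_ms(65536, 48_000)
print(f"filter span: {span:.1f} ms, linear-phase delay: {delay:.1f} ms")
```

With these assumed values the span is well over a second and the linear-phase delay is more than half a second, before any buffering added by the convolution engine. Either figure is far beyond what lip sync or gaming would tolerate.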
Measuring the (quasi) anechoic speaker response: This is virtually impossible even with windowing, because room boundaries, furniture (seat back!) and other objects are too close to the microphone. The magnitude response is skewed and the resolution is coarse.
Psychoacoustic filtering: This is something Uli Brüggemann introduced without providing any information on how he arrived at it. It seems more like an educated guess that this is closer to what we hear, but it is certainly not backed up by any scientific study (that I know of).
"We don't hear dips (as much)": While I agree, dips still contribute greatly to perceived overall timbre. Simply ignoring them (by visually filling them in) probably isn't helpful in that regard.
Audibility of pre-ringing: This is not a well-researched topic either. The threshold probably depends on frequency, signal and the specific room's reverberation time (masking effects).
Single mic position: We have two ears, so what do other points around a central position look like? You're only showing what look like heavily smoothed measurements. Did these points improve too? Or are they worse? What about multiple-seat optimizations?
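The coarse resolution can be estimated from geometry. The sketch below is a simplification: it considers a single floor bounce only, and the distances and helper names are hypothetical. The reflection-free window is the extra travel time of the first reflection relative to the direct sound, and the frequency resolution of the gated measurement is roughly the reciprocal of that window.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # dry air at about 20 °C


def floor_bounce_path_m(distance_m, source_height_m, mic_height_m):
    """Length of the path reflected once off the floor (image-source method)."""
    return math.hypot(distance_m, source_height_m + mic_height_m)


def gate_and_resolution(direct_m, reflected_m):
    """Return (reflection-free window in ms, frequency resolution in Hz)."""
    gate_s = (reflected_m - direct_m) / SPEED_OF_SOUND_M_S
    return 1000.0 * gate_s, 1.0 / gate_s


# Hypothetical setup: speaker and mic both 1 m above the floor, 3 m apart.
direct = 3.0
reflected = floor_bounce_path_m(direct, 1.0, 1.0)
gate_ms, resolution_hz = gate_and_resolution(direct, reflected)
print(f"gate: {gate_ms:.2f} ms, resolution: about {resolution_hz:.0f} Hz")
```

For this assumed geometry the usable window is under 2 ms, which gives a resolution of several hundred Hz. A seat back or side table closer to the microphone shortens the window further, so the gated response says little about the bass and lower midrange.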