@Nowhk
Some quotes:
Martijn Mensink of Dutch & Dutch:
No DSP filtering above the Schroeder frequency! One must be careful with automatic room-correction systems. More often than not they make things worse. What a measuring microphone picks up is not what two ears and a brain hear.
Direct vs. indirect sound.
• The 8c sounds natural and neutral anywhere in the room. Sit close and listen into the recording - you are there. Sit farther away and hear more indirect sound with perfect timbre because of even dispersion and flat power response for a 'they are here' experience.
• No voicing required. Other loudspeakers usually require voicing. Based on listening to a lot of recordings, the tonal balance of the loudspeaker is changed so that most recordings sound good. Voicing is required to balance differences between direct and off-axis sound. The 8c has very even dispersion. It is the first loudspeaker I ever designed that did not benefit from voicing. The tonal balance is purely based on anechoic measurements.
http://www.6moons.com/audioreviews2/dutchdutch/2.html
Floyd Toole:
We adapt to several aspects of the rooms we listen in, allowing us to hear through them to identify sound qualities intrinsic to the source itself, and to identify the correct direction and distance of the source in spite of a massively complicated sound field. We need to have measures of the limits of this adaptation, at what points and in what ways our perceptual processes can use some help. The following are a few salient points to ponder.
• Voices, musical instruments, and other sounds are instantly recognizable in many rooms and through seriously flawed communication channels. We seem to be able to separate a spectrum that is changing from one that is fixed. What range of spectral variation can we adapt to, and at what level, deviation, and so on, is it necessary to intervene manually?
• Once we adapt to the room, subtle differences in quality among a group of loudspeakers are recognizable, and the distinctions are retained when the comparison is done in other rooms.
http://www.wghwoodworking.com/audio/loudspeakers_and_rooms_for_sound_reproduction.pdf
John Watkinson:
We live in a reverberant world which is filled with sound reflections. If we could separately distinguish every different reflection in a reverberant room we would hear a confusing cacophony. In practice we hear very well in reverberant surroundings, or better than microphones can, because of the transform nature of the ear and the way in which the brain processes nerve signals.
What the above people are saying (very few here believe it!) is that what you hear is not represented by the quantities of frequency components that are found in the bins after you have taken a Fourier transform in a room. That in fact human hearing combines frequency analysis with time domain analysis to obtain a true picture of the audio 'scene'. They are saying that just as your vision can see past the surroundings to focus on an object, you can focus on the source of a sound while hearing past the reverberation.
On a laptop screen the dumb measurement looks like chaos, and most audio people think that view translates into what we must be hearing. Something must be done about it! At the back of their minds, they are slightly puzzled that they just don't seem to notice huge frequency response changes as they walk about, or as people sit down in front of them at concerts, even though the measurements would change significantly. They also know that if they fed those changes into a graphic equaliser, they would hear them clearly. Strange...
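The 'chaos on the screen' point is easy to reproduce: a single delayed reflection summed with the direct sound produces deep, regularly spaced comb-filter notches that look alarming in a measurement, yet correspond to a perfectly ordinary reflection. A minimal sketch (the delay and reflection level are illustrative assumptions, not measurements):

```python
import numpy as np

# A single wall reflection arriving tau seconds after the direct sound,
# attenuated by factor r, gives a measured transfer function
#   H(f) = 1 + r * exp(-j * 2*pi * f * tau)
tau = 0.004  # 4 ms extra path (roughly 1.4 m longer) - assumed value
r = 0.5      # reflection 6 dB down - assumed value

f = np.linspace(20, 2000, 2000)            # frequency axis in Hz
H = 1 + r * np.exp(-2j * np.pi * f * tau)  # direct + reflection at the mic
mag_db = 20 * np.log10(np.abs(H))

# Nulls fall at odd multiples of 1/(2*tau) = 125 Hz; the peak-to-null
# swing is 20*log10((1+r)/(1-r)) ~ 9.5 dB for r = 0.5.
print(f"peak-to-null swing: {mag_db.max() - mag_db.min():.1f} dB")
print(f"first null at: {1 / (2 * tau):.0f} Hz")
```

Moving the microphone changes tau, which shifts every notch - exactly the 'huge frequency response changes' that measurements show as we walk about.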
Here is an example of where someone (our very own Amir) tries to square the circle of this phenomenon!
http://www.madronadigital.com/perceptual-effects-of-room-reflecti
He still prefers a frequency response explanation, however, concluding that if we don't hear acoustic comb filtering it must be masked by other things, or that our hearing simply smooths over the comb filtering anyway. It's still a mystery why the same comb filtering sounds so bad when applied at the source and played from a speaker, however. Oh well...
The alternative would be to accept that acoustic comb filtering is one side of a two-sided coin that our brains are capable of seeing both sides of simultaneously. We simply don't hear the comb filtering as part of the source, because it corresponds perfectly with time domain phenomena that tell us why it is there. Our brain automatically 'reads' the situation.
But notice that if we fiddle with the source itself (changing its EQ for example), we will hear a mutilated source - even if it gives us a flat Fourier transform at the listening position.
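The point above can be put numerically: if an EQ is fitted to flatten the comb-filtered magnitude at the listening position, that same inverse comb is necessarily imprinted on the direct sound, which then ripples by the full peak-to-null amount. A sketch using the same assumed 4 ms / -6 dB reflection as before:

```python
import numpy as np

# Assumed illustrative values: a 4 ms reflection at -6 dB.
tau, r = 0.004, 0.5
f = np.linspace(20, 2000, 2000)
room = 1 + r * np.exp(-2j * np.pi * f * tau)  # direct + reflection at the mic

eq = 1 / np.abs(room)       # EQ that flattens the in-room magnitude...
direct_after_eq = eq * 1.0  # ...but the direct sound (flat, gain 1) now
                            # carries the inverse comb on its own

ripple_db = 20 * np.log10(direct_after_eq)
print(f"direct-sound ripple after 'correction': "
      f"{ripple_db.max() - ripple_db.min():.1f} dB")
```

The in-room sum measures flat, but the source the brain locks onto has been mutilated by roughly 9.5 dB of ripple.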
As the D&D guy above says, if the speaker is genuinely neutral (in its dispersion as well as everything else), you don't need to 'voice' it. For the rest of us, some gentle in-room EQ is needed, which can be summarised as 'baffle step compensation' and variants thereof. This isn't a genuine 'correction' but a fudge that gives us an acceptable subjective balance. Using DSP, this EQ (and the crossover filtering itself) can be performed while maintaining overall linear phase from the speaker itself, so the direct sound still works in terms of its timing, even if its EQ is slightly non-neutral.
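As a rough sketch of what that fudge looks like: baffle step compensation is often approximated with a first-order shelf whose transition frequency follows a rule of thumb such as f3 ~ 115 / baffle-width-in-metres, applying only partial compensation (here 4 dB rather than the theoretical full 6 dB) since room boundaries make up some of the difference. All numbers below are illustrative assumptions:

```python
import numpy as np

baffle_width = 0.25          # m, hypothetical cabinet width
f3 = 115 / baffle_width      # ~460 Hz transition (rule-of-thumb estimate)
shelf_db = 4.0               # partial compensation, a common fudge

# First-order shelf: unity gain at DC, -shelf_db at high frequencies,
# transition centred geometrically on f3.
k = 10 ** (shelf_db / 20)
w0 = 2 * np.pi * f3
wp, wz = w0 / np.sqrt(k), w0 * np.sqrt(k)

f = np.logspace(1, 4.3, 500)       # 10 Hz .. ~20 kHz
s = 2j * np.pi * f
H = (1 + s / wz) / (1 + s / wp)    # relative treble cut = relative bass boost

gain_db = 20 * np.log10(np.abs(H))
print(f"gain at 10 Hz: {gain_db[0]:.2f} dB, at 20 kHz: {gain_db[-1]:.2f} dB")
```

Implemented digitally (e.g. as a linear-phase FIR realising this magnitude), the same shaping can be applied without disturbing the timing of the direct sound, as the paragraph above notes.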