This is the second in a series of follow-up posts to https://www.audiosciencereview.com/...-of-lokki-bech-toole-et-al.27540/#post-950580, this time focussing on bass perception with a brief digression on subwoofers, since there's relevance to (reproduction of) concert hall acoustics. I’ve included links for further details, selected relevant quotes, and some of my own notes in italics.
Bass and subwoofers
Borenius 1985
https://secure.aes.org/forum/pubs/conventions/?elib=11465
Holman's description in Surround Sound: Up and Running was of Swedish (presumably national) [but actually Finnish] radio moving a subwoofer around, using the most sensitive program material and the most sensitive listeners; 80 Hz was 2 standard deviations below the mean
Later comment by Nastasa 2022 (below): “The task of the listeners was to adjust both the crossover frequency and the time delay between the loudspeakers until they found the threshold at which the sound image becomes "messy and disturbing". The author concludes that, based on the crossover frequency threshold values, there is little directional information contained in the sound signal below 200Hz, and none at all, below 100Hz. Furthermore, it is noted that speech signals are highly sensitive to delay errors, even with crossover frequencies below 100Hz, and that music signals on average tolerate much higher delay errors.”
Welti 2002
https://www.harman.com/documents/multsubs_0.pdf
The classic multisub paper quoted around the Internet, suggesting that for rectangular rooms with seating in or near center, in terms of frequency response smoothness across seating positions, there were diminishing returns beyond four equalized subwoofers, placed symmetrically in corners or mid-wall. Two mid-wall subs did nearly as well.
Limiting assumptions were a rectangular room with monaural equalized subs along the walls; the primary goal was minimizing response variation over the seating area, the secondary goal was maximizing LF output.
Welti reviewed how placement at a node does not excite the relevant (primary and +2 order) modes, and how placement symmetric with respect to the nodes cancels the relevant (primary and +2 order) modes
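The node and symmetry arguments can be made concrete with a minimal Python sketch. It assumes an idealized rigid-walled rectangular room, where the pressure shape of an axial mode of order n along a dimension of length L is cos(nπx/L), and it takes the drive a sub delivers to a mode as proportional to the mode shape at the sub's position. The 7.3 m dimension, the 343 m/s speed of sound and the function names are illustrative assumptions, not values or methods from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def axial_mode_frequency(order, length_m):
    """Frequency in Hz of an axial mode between two rigid parallel walls."""
    return order * SPEED_OF_SOUND / (2.0 * length_m)


def mode_shape(order, x_m, length_m):
    """Relative pressure of the axial mode at position x (1 or -1 at the walls)."""
    return math.cos(order * math.pi * x_m / length_m)


def net_drive(order, sub_positions_m, length_m):
    """Summed drive from identical in-phase subs; 0 means the mode is not excited."""
    return sum(mode_shape(order, x, length_m) for x in sub_positions_m)


length = 7.3  # m, hypothetical room dimension
layouts = {
    "one sub at a wall": [0.0],
    "one sub at the midpoint": [length / 2],
    "two subs at opposite walls": [0.0, length],
}
for name, positions in layouts.items():
    drives = [round(net_drive(n, positions, length), 3) for n in (1, 2, 3)]
    print(name, drives)
```

In this simplified model a sub at the midpoint sits on the node of the odd-order modes and does not drive them, while a pair at opposite walls drives the odd-order modes with opposite sign so that they cancel, leaving the even-order mode reinforced.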
Welti 2004
https://1drv.ms/u/s!AnprNKFcgo3wgQ1pPIKNI5c5_vZe
Subjective Comparison of Single Channel versus Two Channel Subwoofer Reproduction
“Audible differences for music signals were found only for the comparison of single channel front center subwoofer vs. two channel subwoofers at +/- 90 degrees.”
Later comment by Nastasa 2022: “Welti (2004) has set out to compare the audibility of two channel versus single channel bass reproduction in a small room environment (6.4 m x 7.3 m. x 2.7 m). Four different subwoofer configurations were tested, namely: one mono subwoofer placed at centre front; two summed mono subwoofers located at left and right front corners; two discrete left and right channel subwoofers located at the same position as the previous configuration; and two channel subwoofers located at +/- 90° relative to the listener. The stimuli used comprised of four short program loops and three music programs. The test was conducted using an ABX triangle method and in total, five trained listeners participated. The author concluded that the only noticeable difference occurred when comparing the centre mono subwoofer and the stereo configuration located to the left and right of the listener. Contrarily, no difference could be heard when comparing the front mono or stereo configurations. While Welti’s test partly dismissed that audible differences would be heard across the different subwoofer configurations, the author did suggest that this test does not prove in any way that there are no merits to stereo subwoofer reproduction”
Fazenda 2004
https://eprints.hud.ac.uk/id/eprint/3538/
“The perception of resonances is highly dependent on the temporal content of the excitation signal.”
“If level differences are removed, subjects are still sensitive to effects introduced by resonances. This fact supports the idea that any low frequency correction method should concentrate on reducing the temporal resonant characteristic of the modes. A technique that solely addresses magnitude frequency irregularities is likely to leave evidence of the temporal effects, especially after the removal of the excitation stimulus. In some cases, in the presence of fast tonal transients, the effects may even be detected simultaneous to the input stimulus. Any attempt to reduce this effect is only likely to be successful if the temporal characteristic of the resonance is modified by reducing its decay time.”
Griesinger 2005
http://www.davidgriesinger.com/asa05.pdf
“It is assumed that the presence of room modes makes localization impossible. The statement that low frequencies cannot be localized is easily shown to be true when a sine tone as used as a signal. However it is equally easy to show that an broadband – such as low-pass filtered noise or a low-pass filtered click – is easily localized in most rooms. In free field we detect the azimuth of sounds at low frequencies by detecting the time differences between the zero-crossings of the pressure waveform at the two ears. Human perception is particularly sensitive to sounds with sharp onsets. We tend to localize the beginnings of sounds. We can localize an impulsive sound at low frequencies in part because it takes time for standing waves to establish themselves.”
“In general, we want to put the subwoofers on either side of the listeners! If the back wall is closer to the listening position, put the subwoofers on the back wall. This will maximize both the bass uniformity and the spatial impression.”
“Putting the front speakers along the long wall of a small room can be helpful, as can a somewhat asymmetric speaker layout. In many rooms it can be helpful to place low frequency drivers at the sides of the listening position rather than at the front of the room.”
“Although widely held to be unnecessary or impossible, reproduction of envelopment at low frequencies in small rooms can be achieved, particularly with a multi-channel sound system. Successful results depend on: 1. having an input recording that includes at least two channels where the reverberation is independently recorded, and thus uncorrelated with the other channels. 2. The presence of independently driven room modes that overlap in such a way that the lateral pressure gradient of one mode combines with the pressure of another. In the case of two channel stereo, the best results usually occur when an asymmetric lateral mode (driven by the difference signal between the loudspeakers) creates a pressure gradient at the listening position, and a medial mode (usually a front/back mode) supplies the pressure. Ideally both modal systems should be broad enough in frequency that there is a substantial frequency overlap, as well as a spatial overlap.”
Modes or standing waves have anti-nodes where pressure is maximal (and velocity minimal) and nodes where pressure is minimal (and velocity maximal). Particle velocity is proportional to the pressure gradient. The asymmetric lateral mode is the first-order width mode, so listeners positioned near the midline will be sitting in this mode’s node, where its lateral pressure gradient is largest
Griesinger later in 2018 (see below) remarks “I re-read [this AES preprint], and was mystified”
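The relationship between pressure and pressure gradient at the listening position can be checked numerically. This is a minimal sketch, assuming the idealized rigid-wall mode shape cos(nπy/W) across a room of width W; the 6.4 m width and the function names are illustrative assumptions, not taken from Griesinger's paper.

```python
import math


def lateral_pressure(order, y_m, width_m):
    """Relative pressure of a width (lateral) mode at lateral position y."""
    return math.cos(order * math.pi * y_m / width_m)


def lateral_gradient(order, y_m, width_m):
    """Spatial derivative of the mode shape; particle velocity is proportional to it."""
    k = order * math.pi / width_m
    return -k * math.sin(k * y_m)


width = 6.4          # m, hypothetical room width
midline = width / 2  # listener seated on the room's centre line

for order in (1, 2):
    p = lateral_pressure(order, midline, width)
    g = lateral_gradient(order, midline, width)
    print(f"width mode {order}: pressure {p:+.2f}, gradient {g:+.2f} per metre")
```

On the midline the first-order width mode has essentially zero pressure but its largest gradient, while the second-order width mode has a pressure extreme and essentially no gradient. That is why a second mode is needed to supply pressure at the seat where the lateral gradient is strongest.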
Miller 2005
http://www.filmaker.com/papers/RM-2SW_AES119NYC.pdf
“Binaural detection by humans in the octave 45~90Hz is physiologically possible; Music/movie VLF content exists (no new recording/mixing procedures are demanded); High quality reproduction implies two-channel bass management and two subwoofers”
“In demonstrations with presentations of this paper at the 23rd VDT Tonmeisters in Leipzig, November 2004, and to the combined Acoustical Society of America and Canadian Acoustical Association in Vancouver, May 2005, the descending 13 step test 100~25Hz was played using two 18in (45cm) drivers at the mid side wall positions. As above, each step was played first by mixing tones differing by 0.5Hz electrically so as to drive the subwoofers in monaural, then unmixed to reproduce VLF binaurally. At VDT by a simple show of hands, nearly all of approx. 40 attendees reported at 100Hz perceiving no “swirling motion” in monaural, but a definite “impression of motion” in stereo. Half the attendees heard “motion” down to 50Hz, 1/3 heard to 45Hz, and 1/5 heard to 40Hz. At ASA/CAA, again half of approx. 70 attendees reported perceiving “motion” to 50Hz, 1/3 heard to 45Hz, and ¼ heard down to 40Hz.”
Suggests that sensitivity to stereo/spatial low bass varies between listeners
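The arithmetic behind the 0.5 Hz offset in Miller's demonstration can be sketched as follows. When two equal-level tones are mixed electrically, the identity sin a + sin b = 2 cos((a-b)/2) sin((a+b)/2) gives a single tone whose level rises and falls once every 1/Δf seconds. When the tones stay on separate subwoofers, the phase difference between the two channels instead rotates through 360° in that same time. Only the 0.5 Hz offset comes from the quote; the function names and the sampled time points are illustrative assumptions.

```python
import math


def beat_period_s(offset_hz):
    """Seconds for one full level cycle (mixed) or one 360 degree phase rotation (unmixed)."""
    return 1.0 / offset_hz


def mixed_envelope(offset_hz, t_s):
    """Amplitude envelope of two summed unit-amplitude tones offset_hz apart."""
    return abs(2.0 * math.cos(math.pi * offset_hz * t_s))


def channel_phase_difference_deg(offset_hz, t_s):
    """Phase lead of the higher tone over the lower one at time t, wrapped to 0-360."""
    return (360.0 * offset_hz * t_s) % 360.0


offset = 0.5  # Hz, from the quoted demonstration
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(t, round(mixed_envelope(offset, t), 3), channel_phase_difference_deg(offset, t))
```

With a 0.5 Hz offset both effects repeat every 2 seconds: the mono mix is a slow loudness pulsation, whereas the two-channel version keeps each sub at constant level and slowly sweeps the inter-channel phase.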
Welti, Devantier 2006
https://audioroundtable.com/misc/Welti_Multisub.pdf
Sound Field Management
Discussed by Toole in second edition of Sound Reproduction and at https://www.audioholics.com/room-acoustics/history-of-multi-sub-sfm
Wilson 2006
https://secure.aes.org/forum/pubs/conferences/?elib=17270
“Use of two subwoofers placed to the left and right of the listener and playing left and right LF signals allows presentation of spatial information.”
Welti 2012
https://www.aes.org/e-lib/online/browse.cfm?elib=16490
Optimal Configurations for Subwoofers in Rooms Considering Seat to Seat Variation and Low Frequency Efficiency
Discussed in third edition of Sound Reproduction, also at https://hometheaterhifi.com/technic...n-interview-with-todd-welti-and-kevin-voecks/
https://www.harman.com/documents/AES_Preprint_8748_color_plots_1.zip
Examined mean spatial variation (MSV) and mean output level (MOL) for varying seating arrangements and subwoofer configurations
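As a rough illustration of what metrics of this kind measure, here is a hedged Python sketch. It assumes MSV can be read as the seat-to-seat standard deviation of level at each frequency, averaged over frequency, and MOL as the plain average level over seats and frequencies. The paper's exact definitions may differ (for example in how levels are averaged), and the seat levels below are made-up numbers, not measurements.

```python
import statistics

# Hypothetical levels in dB at three seats (rows) and four frequencies (columns).
# These numbers are made up for illustration and are not measurements.
seat_levels_db = [
    [80.0, 86.0, 78.0, 82.0],
    [82.0, 80.0, 84.0, 82.0],
    [78.0, 83.0, 81.0, 82.0],
]


def mean_spatial_variation(levels_db):
    """Seat-to-seat standard deviation at each frequency, averaged over frequency."""
    per_frequency = zip(*levels_db)  # regroup so each tuple holds all seats at one frequency
    return statistics.mean(statistics.pstdev(seats) for seats in per_frequency)


def mean_output_level(levels_db):
    """Plain average of all levels in dB over seats and frequencies."""
    return statistics.mean(level for seat in levels_db for level in seat)


print(round(mean_spatial_variation(seat_levels_db), 2), "dB variation")
print(round(mean_output_level(seat_levels_db), 2), "dB level")
```

Under this reading, a lower variation figure means the seats hear more similar bass, and a higher level figure means the configuration produces more output for the same drive.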
Fazenda 2012
https://www.avsforum.com/attachments/jaes_v60_5_perception_modal_control-pdf.2273992/
“It appears that, for high quality critical listening conditions, those systems ensuring a faster decay of low frequency energy are preferred over those attempting a direct “flattening” of the magnitude frequency response.”
A significant limitation would seem to be that the results of magnitude equalization demonstrated between figures 2 and 4, also 5 and 6, were surprisingly bad! Still, configuration FB suggested by results from Welti 2002 did not seem to perform significantly better than the surprisingly bad equalized corner subwoofer.
Hill, Hawksford 2013
https://www.researchgate.net/public...ation_as_a_function_of_closed_acoustic_spaces
Low frequency localization in closed spaces depends on location of source, listener, and room characteristics like dimensions and reverberation time
“The key point here [from simulation results] is that as room size increases listeners benefit from a longer duration of uncorrupted ITD phase information, allowing for precise localization (although this is strongly dependent on listener and subwoofer location as well a signal characteristics). This extra time is believed to be critical especially at the onset of a signal where human ability for low-frequency directional discrimination is likely to be at a maximum compared for example to a pseudo steady-state signal where reflections will induce continual impairment.”
“The key point illuminated by the subjective evaluation results is that listening location makes a significant difference in low-frequency localization. In both rooms, the location furthest from the subwoofer exhibited poor localization, with specifically poor performance across all frequencies in the larger room. The location closer to the subwoofer, on the other hand, gave consistently good localization over most frequencies. In addition to benefiting from a longer duration of uncorrupted directional information, the closer listening location has a higher direct-to-reverberant sound intensity ratio, also allowing for improved localization. The closer location did give poor localization at 30 Hz in the small room, but this is as predicted by the simulations since the time for accurate localization gives only three-quarters (at best) of an uncorrupted wavelength at 30 Hz, which is insufficient for accurate localization.”
“It seems the conjecture that humans cannot localize low-frequencies is inaccurate and, in reality, the question concerning localization demands the response: “it depends”.”
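The "three-quarters of an uncorrupted wavelength at 30 Hz" figure converts to time and distance with simple arithmetic. A short sketch, assuming a speed of sound of 343 m/s (not stated in the quote); the function names are illustrative.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def period_ms(frequency_hz):
    """Duration of one cycle in milliseconds."""
    return 1000.0 / frequency_hz


def window_for_cycles_ms(frequency_hz, cycles):
    """Time needed for the given number of cycles to arrive."""
    return cycles * period_ms(frequency_hz)


def path_difference_m(window_ms):
    """Extra distance a reflection must travel to arrive window_ms after the direct sound."""
    return SPEED_OF_SOUND * window_ms / 1000.0


window = window_for_cycles_ms(30.0, 0.75)
print(f"{window:.1f} ms of uncorrupted signal, {path_difference_m(window):.2f} m extra path")
```

Three-quarters of a cycle at 30 Hz lasts 25 ms, which corresponds to a reflection path roughly 8.6 m longer than the direct path. This gives a feel for why small rooms leave so little uncorrupted signal at the lowest frequencies.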
Fazenda 2015
https://salford-repository.worktrib...cts-of-room-modes-as-a-functionof-modal-decay
“Perceptual modal thresholds are independent of presentation level, except for thresholds obtained with artificial stimuli below 63 Hz, where a significant effect of level has been found”
“The content in music stimuli has an effect on how well modal problems are detected and leads to statistically significant interactions and differences in the thresholds measured.”
“Perceptual thresholds for modal effects when testing with artificial stimuli decrease rapidly with increasing frequency up to about 100Hz where they appear to level out. For music stimuli, thresholds decrease monotonically with frequency…Average thresholds measured with music stimuli are 0.51s at 63Hz, 0.3s at 125Hz, and 0.12s at 250Hz”
Supports acceptability of increasing decay times with decreasing frequency
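One way to look at the quoted music thresholds is to express each decay time as a number of cycles at its frequency. This is only arithmetic on the three quoted values, not an analysis from the paper, and the function name is an illustrative assumption.

```python
# Average music-stimulus thresholds quoted above: centre frequency (Hz) -> decay time (s)
thresholds_s = {63: 0.51, 125: 0.30, 250: 0.12}


def cycles_within(frequency_hz, decay_s):
    """Number of periods of the tone that fit inside the decay time."""
    return frequency_hz * decay_s


for frequency, decay in thresholds_s.items():
    print(f"{frequency} Hz: {decay} s is about {cycles_within(frequency, decay):.1f} cycles")
```

Expressed this way the three thresholds come out at a broadly similar size, roughly 30 to 40 cycles each, even though the decay time in seconds is about four times longer at 63 Hz than at 250 Hz.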
Griesinger 2018
https://tonmeistertagung.com/download/tmt30-2018-proceedings.pdf
“Frequencies below about 120Hz behave differently. ITD and ILD can be detected for tones, and in a free field these sounds can be localized in the horizontal plane. The perception of spaciousness and envelopment arises when ITDs and ILDs are detectable but not stable. Room reflections arrive at the listener chaotically from all directions, causing the ILDs and ITDs at the listener’s ears to fluctuate randomly at rates that depend on the frequency of the music and the size and reverberation time of the space. Fluctuations lower than about three Hertz are heard as motion – a kind of wander or swirl. Rates faster than about 20 Hz are heard as a kind of tone or buzz. But between 3 Hz and 20 Hz these fluctuations are perceived as a sense of enveloping space. Fluctuations above 20 Hz are simply perceived as a widening of the sound image.”
“These observations are not new. Blauert observed them as a graduate student, and ascribed the perception to what he called “localization lag”.”
“To make pressure differences between the ears you need to find room modes or other methods that produce a lateral pressure gradient that varies with the phases of the recording. At low frequencies these pressure differences can be small, but they need to be heard. Unfortunately most modes in rooms have no lateral pressure gradients, and the gradients of those modes that have them are largest at the nulls of the sound pressure. To be able to hear the differences these gradients provide we need to limit the strength of competing modes. All symmetric lateral modes, the ones that are excited by in-phase signals in the drivers, typically produce maximum pressure at the listening position. In addition all up-down and front-back modes have no lateral gradient, and also usually have maxima at the listening position. All mixed modes, of which there are a great many, have little or no lateral gradients. We need to somehow keep all of these under control.”
“I have found before that putting subwoofers that augment frequencies between 30 and 150 Hz at the sides of the listening position can solve spatial problems in many rooms!”
Hill, Hawksford 2021
https://repository.derby.ac.uk/item...ation-as-a-function-of-closed-acoustic-spaces
Low-frequency sound source localization as a function of closed acoustic spaces
“This research highlights that the difference between the arrival of the direct sound and first reflection to a listener is the primary determinant of localization time. In addition, the average absorption of a space affects localization time, whereby high absorption coefficients (above 0.7) give multiple additional milliseconds of localization time.”
“Where previous work makes gross generalizations (such as nothing under 100 or 200 Hz can be localized in any space), this research has shown that the minimum localizable frequency is a function of the configuration of a space (dimensions, source/listener location, absorption).”
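The gap between the direct sound and a first reflection can be estimated with the image-source construction. The sketch below is a simplified, hypothetical example that considers a single wall in plan view and assumes a speed of sound of 343 m/s; it is not the simulation method used in the paper. A real room has six surfaces, and the relevant gap is set by whichever reflection arrives first.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def reflection_delay_ms(source, listener, wall_y=0.0):
    """Arrival-time gap between direct sound and the reflection off the wall y = wall_y.

    Uses the image-source construction: the reflected path has the same length as a
    straight line from the source mirrored in the wall to the listener.
    """
    image = (source[0], 2.0 * wall_y - source[1])
    direct_m = math.dist(source, listener)
    reflected_m = math.dist(image, listener)
    return 1000.0 * (reflected_m - direct_m) / SPEED_OF_SOUND


# Hypothetical plan-view positions in metres (x along the room, y away from the wall).
subwoofer = (0.0, 1.0)
near_seat = (1.0, 1.0)
far_seat = (4.0, 1.0)

print(round(reflection_delay_ms(subwoofer, near_seat), 2), "ms at the near seat")
print(round(reflection_delay_ms(subwoofer, far_seat), 2), "ms at the far seat")
```

In this toy geometry the seat closer to the subwoofer gets a longer gap before the wall reflection arrives than the farther seat does, which is in line with the dependence on source and listener location described in the quotes.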
Nastasa 2022
https://aaltodoc.aalto.fi/bitstream...tasa_Anamaria_2022.pdf?sequence=1&isAllowed=y
Starts with an excellent review of spatial hearing and low frequency localization
“Larger rooms allow for localisation of lower frequencies, and listeners located closer to a subwoofer exhibit less directional confusion.”
“Localisation abilities decrease when the reference azimuth moves away from the median plane.”
“Humans can reliably detect a change of 10° in the direction [from the median plane] of pure tones with a frequency of over 63.5Hz and octave bands of pink noise with a centre frequency as low as 31.5Hz.”
“ITD is the primary cue utilised by the auditory system to determine changes in the location of a low-frequency sound source.”
Supports low frequency localization in free field, ITD as primary cue/mechanism
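To get a feel for how small the ITD cue is for a 10° change in direction, here is a minimal sketch. It uses a deliberately simplified two-point model, treating the ears as two points in free space and ignoring diffraction around the head, so real ITDs at low frequencies will differ. The 0.18 m ear spacing, the 343 m/s speed of sound and the function names are assumptions and do not come from the thesis.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING_M = 0.18    # assumed distance between the ears; varies between people


def itd_us(azimuth_deg):
    """Interaural time difference in microseconds for a distant source.

    Simplified two-point model: the ears are two points in free space, so the
    shadowing and diffraction of a real head are ignored.
    """
    return 1e6 * EAR_SPACING_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND


def interaural_phase_deg(frequency_hz, azimuth_deg):
    """Phase difference between the ears that the ITD represents at one frequency."""
    return 360.0 * frequency_hz * itd_us(azimuth_deg) / 1e6


for frequency in (31.5, 63.5):
    print(frequency, "Hz:", round(interaural_phase_deg(frequency, 10.0), 2), "degrees")
print(round(itd_us(10.0), 1), "microseconds")
```

Under these assumptions a 10° shift from straight ahead corresponds to an ITD on the order of 90 microseconds, which is only about one to two degrees of interaural phase at 31.5 Hz and 63.5 Hz.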
@j_j's comments on spatial bass
Fundamentals of hearing 2004
https://www.aes-media.org/sections/pnw/ppt/jj/jj_aes04_ts1.ppt
“Various people have reported, sometimes anecdotally, that above 40Hz (and below 90Hz), although one can not localize a sound source, differences in interaural phase can create a sensation of space. This suggests that for accurate perception of space, 2 or 3 subwoofers may be necessary. This also, as in many other places in audio, creates a situation where what one might consider the “optimum” solution (maximum bass flatness) does not in fact convey the perceptual optimum.”
Soundfields vs human hearing 2012
https://www.aes-media.org/sections/pnw/ppt/jj/soundfields_vs_human_hearing_edited.ppt
“Specifically, although one can not LOCALIZE signals below about 90 Hz, one can detect spatial effects from interaural phase differences down to about 40Hz. The AT&T Labs “Perceptual Soundfield Reconstruction” Demo, no longer available, contained a very nice example of these effects, and how they can change “boomy bass” in the 2-radiator case into “bass spread about a room” in the 5-channel case.”
https://www.audiosciencereview.com/...s-using-subwoofers.11034/page-29#post-1757805
“If you have a good, well-designed acoustic recording that hasn't been treated with the "mono bass" routine, etc, you may get quite a bit from it, especially if you have 2 front, 2 back, properly (that word!) recorded.
“Below 40Hz it's not as much of an issue, but between 40 and 90Hz, yes, there is "stereo content" possible in a good venue with a good recording.”
https://www.audiosciencereview.com/forum/index.php?threads/room-modes.51257/page-3#post-1854536
“It's pretty much clear that 20 milliseconds (the period of a 50Hz tone) is longer than the minimum time that the ear can distinguish. Additionally, interaural delay, which has been claimed to be not interfered with at 50Hz, can most certainly have timing issues. Like many people have reported, non-phase-locked sound at 40 Hz and up does not have a direction component, but it most certainly has a spatial sensation component.”
“That's what I know. I am not eager to get into he-said, she-said arguments. As to minimum phase, I have measured a lot of things at a lot of frequencies in a room that are not minimum phase. That, however, is totally room dependent.”
@Thomas Lund
https://www.genelec.com/-/immersive-monitoring-a-perceptive-perspective
“From 50 Hz to 700 Hz, however, fast-firing synapses in the brainstem are responsible for localisation, employed in a phase-locking structure to determine interaural time difference (ITD). Humans can localise at even lower frequencies, but we will come back to that in a specific ultra low frequency blog.
The ability to position sound sources with precision spherically is a key benefit of immersive systems. Another is the possibility to influence the sense of space in human listeners. For the latter, the lowest two octaves of the ITD range (i.e. 50-200 Hz) play an essential role; but may be compromised in multiple ways”
https://www.genelec.com/-/blog/how-to-analyse-frequency-and-temporal-responses
“With both stereo and immersive, for your room and system to be able to reliably convey the envelopment latent in the content, perceived-direct sound should dominate in the 50 to 700 Hz range – where audible patterns characteristic of the recording space may have been picked up. If they have, you can be sure that the recording engineer has gone to great lengths in doing so. Fig 2 is a recording setup in Olavshallen in Trondheim, Norway, and shows a main mic array with sufficient distance between capsules to capture LF differences and moving patterns – two of the most precious qualities of a hall to preserve….Fig 3 shows the GRADE graphs from section 4.3 of the report, and reveals a monitor to the left which is able to convey envelopment latent in the content; and one to the right that is less able to do so. If listening to the latter, you are unable to judge recorded space precisely. Such ability may also be sacrificed when relying on bass management with only one subwoofer to reproduce all LF sound, rather than multiple channels and acoustic in-room summation. If possible with delicate content, don’t use a higher bass management cross-over frequency than necessary, and preferably below 60 Hz.”
https://www.audiosciencereview.com/...view-room-eq-setup.26397/page-21#post-1526313
“It's not primarily about localization, more about reproducing the swirling LF patterns a fine concert hall generates when music is being played. With acoustical summation in a reproduction room, there is a chance of hearing them, while electrical summation surely kills such joy. Also, we actually localize all the way down to a static pressure change (DC). It's indoor conditions messing up our senses”
https://www.audiosciencereview.com/...view-room-eq-setup.26397/page-21#post-1524620
“To faithfully reproduce great acoustic recordings, a flattish frequency response of perceived-direct sound is just one of the goals. More importantly, to me, the monitoring room and sound system need to convey moving patterns of sound latent in the recording, especially between 40 and 200 Hz. This is where to hear the soul of a concert hall or church, in case it has been recorded.
“Collapsing discrete channels to a single sub channel should therefore be a last resort, e.g. if the reproduction room/placement is difficult and/or to accommodate multiple listeners.
“Taking advantage of discrete channel reproduction at low frequency has even spread outside acoustic recordings. Top pop/rock productions now also make use of such perceptual excitement, which will remain a secret to “collapsers””
Bass and subwoofers
Borenius 1985
https://secure.aes.org/forum/pubs/conventions/?elib=11465
Holman's description in Surround Sound: Up and Running was Swedish (presumably national) [but actually Finnish] radio moving a subwoofer around, using the most sensitive program material and the most sensitive listeners, and 80 Hz was 2 standard deviations below the mean
Later comment by Nastasa 2022 (below): “The task of the listeners was to adjust both the crossover frequency and the time delay between the loudspeakers until they found the threshold at which the sound image becomes"messy and disturbing". The author concludes that, based on the crossover frequency threshold values, there is little directional information contained in the sound signal below 200Hz, and none at all, below 100Hz. Furthermore, it is noted that speech signals are highly sensitive to delay errors, even with crossover frequencies below 100Hz, and that music signals on average tolerate much higher delay errors.”
Welti 2002
https://www.harman.com/documents/multsubs_0.pdf
The classic multisub paper quoted around the Internet, suggesting that for rectangular rooms with seating in or near center, in terms of frequency response smoothness across seating positions, there were diminishing returns beyond four equalized subwoofers, placed symmetrically in corners or mid-wall. Two mid-wall subs did nearly as well.
Limiting assumptions were rectangular room with monaural equalized subs along walls, primary goal was minimizing response variation over seating area, secondary was maximizing LF output.
Welti reviewed nodal placement not exciting relevant (primary and +2 order modes), symmetric placement WRT nodes resulting in relevant (primary and +2 order) mode cancellation
Welti 2004
https://1drv.ms/u/s!AnprNKFcgo3wgQ1pPIKNI5c5_vZe
Subjective Comparison of Single Channel versus Two Channel Subwoofer Reproduction
“Audible differences for music signals were found only for the comparison of single channel front center subwoofer vs. two channel subwoofers at +/- 90 degrees.”
Later comment by Nastasa 2022: “Welti (2004) has set out to compare the audibility of two channel versus single channel bass reproduction in a small room environment (6.4 m x 7.3 m. x 2.7 m). Four different subwoofer configurations were tested, namely: one mono subwoofer placed at centre front; two summed mono subwoofers located at left and right front corners; two discrete left and right channel subwoofers located at the same position as the previous configuration; and two channel subwoofers located at +/- 90° relative20 to the listener. The stimuli used comprised of four short program loops and three music programs. The test was conducted using an ABX triangle method and in total, f ive trained listeners participated. The author concluded that the only noticeable difference occurred when comparing the centre mono subwoofer and the stereo configuration located to the left and right of the listener. Contrarily, no difference could be heard when comparing the front mono or stereo configurations. While Welti’s test partly dismissed that audible differences would be heard across the different subwoofer configurations, the author did suggest that this test does not prove in any way that there are no merits to stereo subwoofer reproduction”
Fazenda 2004
https://eprints.hud.ac.uk/id/eprint/3538/
“The perception of resonances is highly dependent on the temporal content of the excitation signal.”
“If level differences are removed, subjects are still sensitive to effects introduced by resonances. This fact supports the idea that any low frequency correction method should concentrate on reducing the temporal resonant characteristic of the modes. A technique that solely addresses magnitude frequency irregularities is likely to leave evidence of the temporal effects, especially after the removal of the excitation stimulus. In some cases, in the presence of fast tonal transients, the effects may even be detected simultaneous to the input stimulus. Any attempt to reduce this effect is only likely to be successful if the temporal characteristic of the resonance is modified by reducing its decay time.”
Griesinger 2005
http://www.davidgriesinger.com/asa05.pdf
“It is assumed that the presence of room modes makes localization impossible. The statement that low frequencies cannot be localized is easily shown to be true when a sine tone as used as a signal. However it is equally easy to show that an broadband – such as low-pass filtered noise or a low-pass filtered click – is easily localized in most rooms. In free field we detect the azimuth of sounds at low frequencies by detecting the time differences between the zero-crossings of the pressure waveform at the two ears. Human perception is particularly sensitive to sounds with sharp onsets. We tend to localize the beginnings of sounds. We can localize an impulsive sound at low frequencies in part because it takes time for standing waves to establish themselves.“
“In general, we want to put the subwoofers on either side of the listeners! If the back wall is closer to the listening position, put the subwoofers on the back wall. This will maximize both the bass uniformity and the spatial impression.“
“Putting the front speakers along the long wall of a small room can be helpful, as can a somewhat asymmetric speaker layout. In many rooms it can be helpful to place low frequency drivers at the sides of the listening position rather than at the front of the room. “
“Although widely held to be unnecessary or impossible, reproduction of envelopment at low frequencies in small rooms can be achieved, particularly with a multi-channel sound system. Successful results depend on: 1. having an input recording that includes at least two channels where the reverberation is independently recorded, and thus uncorrelated with the other channels. 2. The presence of independently driven room modes that overlap in such a way that the lateral pressure gradient of one mode combines with the pressure of another. In the case of two channel stereo, the best results usually occur when an asymmetric lateral mode (driven by the difference signal between the loudspeakers) creates a pressure gradient at the listening position, and a medial mode (usually a front/back mode) supplies the pressure. Ideally both modal systems should be broad enough in frequency that there is a substantial frequency overlap, as well as a spatial overlap.”
Modes or standing waves have anti-nodes where pressure is maximal (and velocity minimal) and nodes where pressure is minimal (and velocity maximal). Velocity=pressure gradient, so the asymmetric lateral mode is the first-order width mode, where listeners positioned near the midline will be sitting in this mode’s node
Griesinger later in 2018 (see below) remarks “I re-read [this “AES preprint], and was mystified”
Miller 2005
http://www.filmaker.com/papers/RM-2SW_AES119NYC.pdf
“Binaural detection by humans in the octave 45~90Hz is physiologically possible; Music/movie VLF content exists (no new recording/mixing procedures are demanded); High quality reproduction implies two-channel bass management and two subwoofers”
“In demonstrations with presentations of this paper at the 23rd VDT Tonmeisters in Leipzig, November 2004, and to the combined Acoustical Society of America and Canadian Acoustical Association in Vancouver, May 2005, the descending 13 step test 100~25Hz was played using two 18in (45cm) drivers at the mid side wall positions. As above, each step was played first by mixing tones differing by 0.5Hz electrically so as to drive the subwoofers in monaural, then unmixed to reproduce VLF binaurally. At VDT by a simple show of hands, nearly all of approx. 40 attendees reported at 100Hz perceiving no “swirling motion” in monaural, but a definite “impression of motion” in stereo. Half the attendees heard “motion” down to 50Hz, 1/3 heard to 45Hz, and 1/5 heard to 40Hz. At ASA/CAA, again half of approx. 70 attendees reported perceiving “motion” to 50Hz, 1/3 heard to 45Hz, and ¼ heard down to 40Hz. “
Suggests varying sensitivity to stereo/spatial low bass
Welti, Devantier 2006
https://audioroundtable.com/misc/Welti_Multisub.pdf
Sound Field Management
Discussed by Toole in second edition of Sound Reproduction and at https://www.audioholics.com/room-acoustics/history-of-multi-sub-sfm
Wilson 2006
https://secure.aes.org/forum/pubs/conferences/?elib=17270
"Use of two subwoofers placed to the left and right of the listener and playing left and right LF signals allows presentation of spatial information.”
Welti 2012
https://www.aes.org/e-lib/online/browse.cfm?elib=16490
Optimal Configurations for Subwoofers in Rooms Considering Seat to Seat Variation and Low Frequency Efficiency
Discussed in third edition of Sound Reproduction, also at https://hometheaterhifi.com/technic...n-interview-with-todd-welti-and-kevin-voecks/
https://www.harman.com/documents/AES_Preprint_8748_color_plots_1.zip
Examined mean spatial variation (MSV) and mean output level (MOL) for varying seating arrangements and subwoofer configurations
Fazenda 2012
https://www.avsforum.com/attachments/jaes_v60_5_perception_modal_control-pdf.2273992/
“It appears that, for high quality critical listening conditions, those systems ensuring a faster decay of low frequency energy are preferred over those attempting a direct “flattening” of the magnitude frequency response.
A significant limitation would seem to be that the results of magnitude equalization demonstrated between figures 2 and 4, also 5 and 6, were surprisingly bad! Still, configuration FB suggested by results from Welti 2002 did not seem to perform significantly better than the surprisingly bad equalized corner subwoofer.
Hill, Hawksford 2013
https://www.researchgate.net/public...ation_as_a_function_of_closed_acoustic_spaces
Low frequency localization in closed spaces depends on location of source, listener, and room characteristics like dimensions and reverberation time
“The key point here [from simulation results] is that as room size increases listeners benefit from a longer duration of uncorrupted ITD phase information, allowing for precise localization (although this is strongly dependent on listener and subwoofer location as well as signal characteristics). This extra time is believed to be critical especially at the onset of a signal where human ability for low-frequency directional discrimination is likely to be at a maximum compared for example to a pseudo steady-state signal where reflections will induce continual impairment.”
“The key point illuminated by the subjective evaluation results is that listening location makes a significant difference in low-frequency localization. In both rooms, the location furthest from the subwoofer exhibited poor localization, with specifically poor performance across all frequencies in the larger room. The location closer to the subwoofer, on the other hand, gave consistently good localization over most frequencies. In addition to benefiting from a longer duration of uncorrupted directional information, the closer listening location has a higher direct-to-reverberant sound intensity ratio, also allowing for improved localization. The closer location did give poor localization at 30 Hz in the small room, but this is as predicted by the simulations since the time for accurate localization gives only three-quarters (at best) of an uncorrupted wavelength at 30 Hz, which is insufficient for accurate localization.”
“It seems the conjecture that humans cannot localize low-frequencies is inaccurate and, in reality, the question concerning localization demands the response: “it depends”.”
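The "three-quarters of an uncorrupted wavelength at 30 Hz" remark is simple arithmetic: the number of clean cycles is the frequency multiplied by the time before the first reflection arrives. A rough sketch follows, assuming a speed of sound of 343 m/s; the path lengths in the example are made-up numbers chosen to give a 25 ms gap, not geometry from the paper, and the one-cycle threshold in the last function is purely hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def first_reflection_delay_s(direct_path_m, reflected_path_m, c=SPEED_OF_SOUND_M_S):
    """Time between arrival of the direct sound and the first reflection."""
    return (reflected_path_m - direct_path_m) / c

def uncorrupted_cycles(freq_hz, delay_s):
    """Number of periods of a tone that arrive before the first reflection."""
    return freq_hz * delay_s

def lowest_frequency_for_cycles(delay_s, cycles_needed=1.0):
    """Lowest frequency that still delivers `cycles_needed` clean periods.

    The default of one full cycle is a hypothetical threshold; the paper
    only states that three-quarters of a cycle was not enough.
    """
    return cycles_needed / delay_s

# Illustrative (invented) path lengths giving a 25 ms direct-to-reflection gap
delay = first_reflection_delay_s(direct_path_m=2.0, reflected_path_m=10.575)
cycles_at_30_hz = uncorrupted_cycles(30.0, delay)  # about 0.75 of a cycle
```

Sitting closer to the subwoofer shortens the direct path relative to the reflected one, which lengthens this window, consistent with the better localization reported at the closer seat.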
Fazenda 2015
https://salford-repository.worktrib...cts-of-room-modes-as-a-functionof-modal-decay
“Perceptual modal thresholds are independent of presentation level, except for thresholds obtained with artificial stimuli below 63 Hz, where a significant effect of level has been found”
“The content in music stimuli has an effect on how well modal problems are detected and leads to statistically significant interactions and differences in the thresholds measured.”
“Perceptual thresholds for modal effects when testing with artificial stimuli decrease rapidly with increasing frequency up to about 100Hz where they appear to level out. For music stimuli, thresholds decrease monotonically with frequency…Average thresholds measured with music stimuli are 0.51s at 63Hz, 0.3s at 125Hz, and 0.12s at 250Hz”
Supports acceptability of increasing decay times with decreasing frequency
Griesinger 2018
https://tonmeistertagung.com/download/tmt30-2018-proceedings.pdf
“Frequencies below about 120Hz behave differently. ITD and ILD can be detected for tones, and in a free field these sounds can be localized in the horizontal plane. The perception of spaciousness and envelopment arises when ITDs and ILDs are detectable but not stable. Room reflections arrive at the listener chaotically from all directions, causing the ILDs and ITDs at the listener’s ears to fluctuate randomly at rates that depend on the frequency of the music and the size and reverberation time of the space. Fluctuations lower than about three Hertz are heard as motion – a kind of wander or swirl. Rates faster than about 20 Hz are heard as a kind of tone or buzz. But between 3 Hz and 20 Hz these fluctuations are perceived as a sense of enveloping space. Fluctuations above 20 Hz are simply perceived as a widening of the sound image.”
“These observations are not new. Blauert observed them as a graduate student, and ascribed the perception to what he called “localization lag”.”
“To make pressure differences between the ears you need to find room modes or other methods that produce a lateral pressure gradient that varies with the phases of the recording. At low frequencies these pressure differences can be small, but they need to be heard. Unfortunately most modes in rooms have no lateral pressure gradients, and the gradients of those modes that have them are largest at the nulls of the sound pressure. To be able to hear the differences these gradients provide we need to limit the strength of competing modes. All symmetric lateral modes, the ones that are excited by in-phase signals in the drivers, typically produce maximum pressure at the listening position. In addition all up-down and front-back modes have no lateral gradient, and also usually have maxima at the listening position. All mixed modes, of which there are a great many, have little or no lateral gradients. We need to somehow keep all of these under control.”
“I have found before that putting subwoofers that augment frequencies between 30 and 150 Hz at the sides of the listening position can solve spatial problems in many rooms!”
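Griesinger's point about lateral pressure gradients can be illustrated with the textbook standing-wave shape for an idealized rigid-walled rectangular room, where the n-th axial width mode has pressure proportional to cos(n·π·x/W). This is only a sketch under that idealization; the 5 m room width and the function name are my own assumptions.

```python
import math

def lateral_axial_mode(n, width_m, x_m, c=343.0):
    """Idealized n-th axial mode across the room width.

    Returns (frequency_hz, relative_pressure, relative_lateral_gradient)
    at lateral position x_m, measured from one side wall. Assumes rigid
    walls and a rectangular room; real rooms will deviate.
    """
    freq_hz = n * c / (2.0 * width_m)
    k = n * math.pi / width_m
    pressure = math.cos(k * x_m)        # standing-wave pressure shape
    gradient = -k * math.sin(k * x_m)   # its derivative across the width
    return freq_hz, pressure, gradient

# Hypothetical 5 m wide room, listener on the centre line
width = 5.0
centre = width / 2.0
f1, p1, g1 = lateral_axial_mode(1, width, centre)  # odd (antisymmetric) mode
f2, p2, g2 = lateral_axial_mode(2, width, centre)  # even (symmetric) mode
```

On the centre line the first-order mode has a pressure null but its largest lateral gradient, while the second-order (symmetric) mode has a pressure maximum and no gradient. That matches the quote: the modes that carry lateral information are weakest in pressure exactly where we sit, and the symmetric ones driven by in-phase signals are strongest there.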
Hill and Hawksford 2021
Low-frequency sound source localization as a function of closed acoustic spaces
https://repository.derby.ac.uk/item...ation-as-a-function-of-closed-acoustic-spaces
“This research highlights that the difference between the arrival of the direct sound and first reflection to a listener is the primary determinant of localization time. In addition, the average absorption of a space affects localization time, whereby high absorption coefficients (above 0.7) give multiple additional milliseconds of localization time.”
“Where previous work makes gross generalizations (such as nothing under 100 or 200 Hz can be localized in any space), this research has shown that the minimum localizable frequency is a function of the configuration of a space (dimensions, source/listener location, absorption).”
Nastasa 2022
https://aaltodoc.aalto.fi/bitstream...tasa_Anamaria_2022.pdf?sequence=1&isAllowed=y
Starts with excellent review of spatial hearing and low frequency localization
“Larger rooms allow for localisation of lower frequencies, and listeners located closer to a subwoofer exhibit less directional confusion.”
“Localisation abilities decrease when the reference azimuth moves away from the median plane.”
“Humans can reliably detect a change of 10° in the direction [from the median plane] of pure tones with a frequency of over 63.5Hz and octave bands of pink noise with a centre frequency as low as 31.5Hz.”
“ITD is the primary cue utilised by the auditory system to determine changes in the location of a low-frequency sound source.”
Supports low frequency localization in free field, ITD as primary cue/mechanism
@j_j's comments on spatial bass
Fundamentals of hearing 2004
https://www.aes-media.org/sections/pnw/ppt/jj/jj_aes04_ts1.ppt
“Various people have reported, sometimes anecdotally, that above 40Hz (and below 90Hz), although one can not localize a sound source, differences in interaural phase can create a sensation of space. This suggests that for accurate perception of space, 2 or 3 subwoofers may be necessary. This also, as in many other places in audio, creates a situation where what one might consider the “optimum” solution (maximum bass flatness) does not in fact convey the perceptual optimum.”
Soundfields vs human hearing 2012
https://www.aes-media.org/sections/pnw/ppt/jj/soundfields_vs_human_hearing_edited.ppt
“Specifically, although one can not LOCALIZE signals below about 90 Hz, one can detect spatial effects from interaural phase differences down to about 40Hz. The AT&T Labs “Perceptual Soundfield Reconstruction” Demo, no longer available, contained a very nice example of these effects, and how they can change “boomy bass” in the 2-radiator case into “bass spread about a room” in the 5-channel case.”
https://www.audiosciencereview.com/...s-using-subwoofers.11034/page-29#post-1757805
“If you have a good, well-designed acoustic recording that hasn't been treated with the "mono bass" routine, etc, you may get quite a bit from it, especially if you have 2 front, 2 back, properly (that word!) recorded.
“Below 40Hz it's not as much of an issue, but between 40 and 90Hz, yes, there is "stereo content" possible in a good venue with a good recording.”
https://www.audiosciencereview.com/forum/index.php?threads/room-modes.51257/page-3#post-1854536
“It's pretty much clear that 20 milliseconds (the period of a 50Hz tone) is longer than the minimum time that the ear can distinguish. Additionally, interaural delay, which has been claimed to be not interfered with at 50Hz, can most certainly have timing issues. Like many people have reported, non-phase-locked sound at 40 Hz and up does not have a direction component, but it most certainly has a spatial sensation component.”
“That's what I know. I am not eager to get into he-said, she-said arguments. As to minimum phase, I have measured a lot of things at a lot of frequencies in a room that are not minimum phase. That, however, is totally room dependent.”
@Thomas Lund
https://www.genelec.com/-/immersive-monitoring-a-perceptive-perspective
“From 50 Hz to 700 Hz, however, fast-firing synapses in the brainstem are responsible for localisation, employed in a phase-locking structure to determine interaural time difference (ITD). Humans can localise at even lower frequencies, but we will come back to that in a specific ultra low frequency blog.
The ability to position sound sources with precision spherically is a key benefit of immersive systems. Another is the possibility to influence the sense of space in human listeners. For the latter, the lowest two octaves of the ITD range (i.e. 50-200 Hz) play an essential role; but may be compromised in multiple ways”
https://www.genelec.com/-/blog/how-to-analyse-frequency-and-temporal-responses
“With both stereo and immersive, for your room and system to be able to reliably convey the envelopment latent in the content, perceived-direct sound should dominate in the 50 to 700 Hz range – where audible patterns characteristic of the recording space may have been picked up. If they have, you can be sure that the recording engineer has gone to great lengths in doing so. Fig 2 is a recording setup in Olavshallen in Trondheim, Norway, and shows a main mic array with sufficient distance between capsules to capture LF differences and moving patterns – two of the most precious qualities of a hall to preserve….Fig 3 shows the GRADE graphs from section 4.3 of the report, and reveals a monitor to the left which is able to convey envelopment latent in the content; and one to the right that is less able to do so. If listening to the latter, you are unable to judge recorded space precisely. Such ability may also be sacrificed when relying on bass management with only one subwoofer to reproduce all LF sound, rather than multiple channels and acoustic in-room summation. If possible with delicate content, don’t use a higher bass management cross-over frequency than necessary, and preferably below 60 Hz.”
https://www.audiosciencereview.com/...view-room-eq-setup.26397/page-21#post-1526313
“It's not primarily about localization, more about reproducing the swirling LF patterns a fine concert hall generates when music is being played. With acoustical summation in a reproduction room, there is a chance of hearing them, while electrical summation surely kills such joy. Also, we actually localize all the way down to a static pressure change (DC). It's indoor conditions messing up our senses”
https://www.audiosciencereview.com/...view-room-eq-setup.26397/page-21#post-1524620
“To faithfully reproduce great acoustic recordings, a flattish frequency response of perceived-direct sound is just one of the goals. More importantly, to me, the monitoring room and sound system need to convey moving patterns of sound latent in the recording, especially between 40 and 200 Hz. This is where to hear the soul of a concert hall or church, in case it has been recorded.
“Collapsing discrete channels to a single sub channel should therefore be a last resort, e.g. if the reproduction room/placement is difficult and/or to accommodate multiple listeners.
“Taking advantage of discrete channel reproduction at low frequency has even spread outside acoustic recordings. Top pop/rock productions now also make use of such perceptual excitement, which will remain a secret to “collapsers””