BACCH4Mac Pro Edition: a report

Dialectic · Apr 17, 2018

Many thanks to Dr. Toole for joining the forum and providing some context on the development of crosstalk cancellation. In my ignorance, I had not known of Cooper-Bauck.

Perhaps I need to hear a state-of-the-art multichannel system before I make more statements about the relative merits of BACCH and multichannel.

Nevertheless, one of the wonderful things about BACCH is that I can have an enveloping soundfield, using only two speakers, with nearly all stereo recordings. Rear speakers would be physically impossible to set up in our space in Manhattan.

EDIT: For the record, I still think BACCH represents the current pinnacle in music reproduction. I'm listening to BACCH-dSP-dHP (the headphone version) now, and it's incredible.

Edgar Choueiri · Apr 17, 2018

Floyd Toole said:
I think I have a slight bias because in the early phases of his work, Dr. Choueiri promoted it as if he had discovered binaural sound and crosstalk cancellation. There was a "gee whiz" quality to the promotions. That was clearly not the case. Since then, he and his aides have obviously gone further, and I have no problems with this. The concepts are valid, and with today's computing power, many things are possible - for a single listener willing to listen in a sweet spot or through headphones.

...

Not long after I joined Harman I negotiated a licensing arrangement to employ the Cooper-Bauck "transaural" crosstalk cancelling method into a product called VMAX - Virtual Multi-Axis..... Sadly it was not to be. Management of the period could not find a way to incorporate "software" as a product in our offerings - licensing being an obvious possibility. It drifted into nothingness.
....

The key difference between what I have described and what I think you are hearing through the BACCH system is that VMAX simulated loudspeakers at the correct stereo locations. Of course we listened to the "naked" crosstalk canceller, and what we heard parallels some of the descriptions I have read in this thread. When there are no phantom loudspeakers to provide directional anchors, almost anything is possible, and much of it is far from what the artists and recording engineers intended - the "soundstage" is very fluid, and envelopment can be profound. In this mode it is really a sound-effects generator and opinions of like or dislike will predictably vary. Please correct me if I am wrong.

Cheers, Floyd

I am glad to read the lively, well-informed and insightful discussions on BACCH and crosstalk cancellation in this forum. Also, I would like to thank you, Dr. Toole, for sharing your memories of getting Harman products licensed to use the "Cooper-Bauck crosstalk cancelling method" back in the mid-nineties. Indeed crosstalk cancellation (XTC) in audio goes back to the seminal work of Cooper and Bauck in 1961, and binaural audio further back by decades, while the idea of crosstalk cancellation for visual 3D perception, which is a perfect analog to the audio XTC, goes back even further to the work of Wheatstone who invented the stereoscope in 1838.

It would be preposterous for anyone since Wheatstone (for photo/video) and Cooper and Bauck (for audio) to claim they invented crosstalk cancellation (or binaural audio). My research team and I go to great pains to describe this history in our XTC-related papers and presentations. For an excellent account of that history I recommend the first chapter of the recently published book “Immersive Sound: The Art and Science of Binaural and Multi-Channel Audio” from Focal Press (ISBN-13: 978-1138900004). That chapter was written by Prof. Braxton Boren of American University (who was previously a post-doctoral researcher at my lab for two years).

Even in interviews when I am not asked about the history of XTC I mention that XTC has existed for a long time and that our work focused on finding a solution to the central problem of the inherent coloration of XTC filters (even the permanent FAQ page on Theoretica's website states "the technique of crosstalk cancellation (XTC) has been known for some time"). This tonal coloration problem is described in detail in Chapter 5 of the same book referenced above (which is an extended and updated version of the paper Mr. Borduin gave a link to) and which incidentally has a discussion and bibliography of the many papers on XTC since Cooper and Bauck (Note: I get no proceeds or benefits from the sales of this book). Basically, as I detailed in that chapter, XTC filters can be shown analytically (and through audition) to cause coloration to the sound (i.e. tonal distortion) that is problematic by Hi-Fi standards. BACCH filters are a particular case of optimized XTC filters that have no tonal distortion. In practice BACCH filters are produced from a 2-point HRTF measurement done in-situ using the individual two-speakers-one-listener set-up and involve a patented method for producing a non-causal finite impulse XTC filter that has no tonal distortion.
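
A rough illustration of the coloration problem described above. This is not the BACCH method and not taken from the chapter's text; it assumes an idealized free-field model in which each speaker reaches the far ear with relative gain g and extra delay tau. The function name and the default values (g = 0.985, tau = 68 microseconds) are illustrative assumptions. Inverting that symmetric 2x2 system exactly gives a filter whose worst-case gain varies strongly with frequency:

```python
import cmath
import math


def perfect_xtc_gain_db(freq_hz, g=0.985, tau_s=68e-6):
    """Worst-case gain (dB) of the exact inverse filter in an assumed free-field model.

    Assumed plant: C = [[1, x], [x, 1]] with x = g * exp(-j*2*pi*f*tau),
    where g and tau_s are hypothetical far-ear gain and delay values.
    """
    x = g * cmath.exp(-2j * math.pi * freq_hz * tau_s)
    det = 1 - x * x
    h_same = 1 / det    # filter from each input channel to its own speaker
    h_cross = -x / det  # filter from each input channel to the opposite speaker
    # For a symmetric matrix [[a, b], [b, a]] the singular values are |a+b| and |a-b|.
    peak = max(abs(h_same + h_cross), abs(h_same - h_cross))
    return 20 * math.log10(peak)


if __name__ == "__main__":
    for f in (0.0, 1000.0, 3676.0, 7353.0):
        print(f"{f:7.0f} Hz: {perfect_xtc_gain_db(f):6.1f} dB")
```

In this toy model the gain peaks wherever the leakage arrives in phase or in antiphase with the direct sound (multiples of 1/(2*tau)) and dips in between, which is one way to see why an unoptimized inverse filter can sound colored.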

I am sincerely sorry that some well-intended people, who reported on the internet about hearing BACCH, found our contribution too technical (or perhaps not sensational enough) to either comprehend or explain, and implied through their “gee whiz” reporting that we had invented XTC (or binaural audio). Unfortunately, this not only gives some the wrong impression that we claim to have invented a technique whose existence precedes the work of my team by decades, but also denigrates the true value of our contributions and those of other workers in the field. Echoing your wise dictum "meanwhile there are still live concerts" I can add in the above context "meanwhile there are still journals" (and serious publications to which the serious inquirer can turn to find out the truth on both the history and the technical contributions).

The combination of these tonal-distortion-free optimized XTC filters, advanced and robust head tracking algorithms that extend the sweet spot over the wide view range of a webcam or infrared sensor, and powerful 64-bit audio processing with modern multi-core CPUs (among many other advanced algorithms for high-precision impulse response (IR) deconvolution, IR alignment and interpolation and other methods driven by research in AR/VR) have allowed XTC-based 3D audio to enter the realm of not only AR/VR (where the requirements are exacting) but also of high-quality HI-FI. My research team and I would be delighted to host you for a visit to our lab in Princeton NJ for a set of demos that would give you an update on the state of the art of a field to whose commercialization you and your co-workers at Harman significantly contributed 23 years ago. (I should add that XTC, which is very well developed now, is no longer a focus of research at my research lab, where most of the work is on 3D soundfield navigation using arrays of higher order ambisonic (HOA) mics; HRTF synthesis from head scans; and acoustic isolation between adjacent sound fields.)

Please allow me respectfully to correct your guess ("simulated loudspeakers at the correct stereo locations") at what people are reporting (on this forum and elsewhere) about what they are hearing with BACCH. I believe your guess is perhaps a bit biased by your recollections of the circa 1995 Harman demos using the Cooper-Bauck XTC method to use "binaural cues simulating loudspeakers at specific locations" whose focus was on "deliver[ing] phantom 5 channel home theater from two loudspeakers".

While we have developed a commercial 3D mixer that uses individualized HRTFs for binaural virtualization (through either a pair of speakers or headphones) of sources (including virtual surround systems) in 3D space and for navigating 3D sound fields, the use of BACCH filters that is of most interest to audiophiles is not the virtualization of speakers (creation of phantom speakers) but rather the removal of the limitations, due to crosstalk, on the levels of ILD and ITD cues that exist in virtually all stereo recordings and that can be transmitted to the ears of the listener by a pair of loudspeakers. Properly executed XTC allows the listener to receive these cues and better perceive the 3D location of the sources associated with them instead of the locations of the speakers. For instance, the frequency-averaged ILD of each of the two loudspeakers in a regular stereo triangle does not exceed about 5 dB; therefore (speaking first of acoustic recordings) any source during recording that presents more than 5 dB of ILD to the stereo microphones (which would be the case of any source to the extreme right or left of the stage at angles much larger than 30 degrees) would be necessarily perceived by the listener during playback as locked into the speaker if XTC is not applied. A BACCH filter can provide an XTC level as high as 20 dB, which would allow the ILD of sources, even those with extremely high ILD (e.g. someone whispering in your right ear), to be reproduced correctly during playback.
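
The ceiling argument above can be sketched numerically. This is a deliberately crude model and an assumption throughout: it sums powers at each ear, ignores phase and frequency dependence, and treats the 5 dB and 20 dB figures as a single broadband ear-to-ear separation. The function name is hypothetical.

```python
import math


def delivered_ild_db(recorded_ild_db, separation_db):
    """ILD reaching the ears under an assumed incoherent power-sum model.

    recorded_ild_db: level of the left channel relative to the right channel.
    separation_db:   how far each speaker's leakage into the opposite ear sits
                     below its direct sound (about 5 dB for plain stereo and
                     up to 20 dB with XTC, per the figures quoted in the post).
    """
    p_left = 10 ** (recorded_ild_db / 10)  # left-channel power, right-channel power = 1
    p_right = 1.0
    leak = 10 ** (-separation_db / 10)     # fraction of power reaching the opposite ear
    left_ear = p_left + leak * p_right
    right_ear = p_right + leak * p_left
    return 10 * math.log10(left_ear / right_ear)


if __name__ == "__main__":
    for recorded in (3.0, 10.0, 15.0):
        plain = delivered_ild_db(recorded, 5.0)
        xtc = delivered_ild_db(recorded, 20.0)
        print(f"recorded {recorded:4.1f} dB -> stereo {plain:4.1f} dB, with XTC {xtc:4.1f} dB")
```

Whatever ILD the recording carries, the delivered value in this model never exceeds the separation figure, so raising the separation from about 5 dB to about 20 dB raises the ceiling accordingly.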

Therefore what people are reporting when they listen to an acoustic recording played through a BACCH filter has nothing to do with phantom speakers or the location of the actual speakers, but everything to do with the more correct reproduction of the sources whose spatial cues were captured in the recording. Whether the cues are only ITD (as would be the case with a recording done with spaced omni mics), or ILD (as would be the case with ORTF mics) or ILD, ITD and spectral cues (as would be the case with a dummy head mic), all acoustic stereo recordings are based on one or a combination of these cues, and all of these cues would be severely limited by the crosstalk. The BACCH filter simply raises the ceiling on the delivery of these cues to the listener during playback, and the result is not a phantom speaker but rather a literally disappearing speaker. In fact, the locations of the speakers become, to a first order (i.e. neglecting the effects of reflections), completely immaterial to the 3D sound image. One of the demos that we often use to illustrate this is to create a BACCH filter for a certain 2-speaker-1-listener geometry, listen to the 3D image through that filter, then radically change the speaker geometry (even to a very asymmetric one with different distances to the listener), create the BACCH filter for that geometry, and show that the resulting 3D image has not changed a bit.

Another very important and felicitous result of raising the ceiling on the levels of ILD and ITD cues that can be delivered, thanks to high-level XTC, is the much more convincing reproduction of late reflections and reverb captured in the recordings. Such reflections (if captured in the stereo recording) can now be reproduced at the ears of the listener with much more correct levels of de-correlation, and they immensely enhance the sense of spatial realism and envelopment compared to when they are perceived to come from the speakers.
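
The de-correlation point can be illustrated with a toy calculation. Everything here is an assumption for illustration: the two channels carry fully uncorrelated reverb, the leakage into the opposite ear is a plain attenuated copy with no delay (reasonable only at low frequencies), and the function names are hypothetical.

```python
import math
import random


def predicted_ear_correlation(separation_db):
    """Correlation between ear signals when uncorrelated L/R leak across with gain a."""
    a = 10 ** (-separation_db / 20)  # amplitude of the leakage relative to direct sound
    return 2 * a / (1 + a * a)


def simulated_ear_correlation(separation_db, n=20000, seed=0):
    """Monte Carlo check of the formula above using Gaussian noise as stand-in reverb."""
    rng = random.Random(seed)
    a = 10 ** (-separation_db / 20)
    sum_lr = sum_ll = sum_rr = 0.0
    for _ in range(n):
        left, right = rng.gauss(0, 1), rng.gauss(0, 1)
        ear_l = left + a * right   # left ear: direct left plus leaked right
        ear_r = right + a * left   # right ear: direct right plus leaked left
        sum_lr += ear_l * ear_r
        sum_ll += ear_l * ear_l
        sum_rr += ear_r * ear_r
    return sum_lr / math.sqrt(sum_ll * sum_rr)


if __name__ == "__main__":
    for sep in (5.0, 20.0):
        print(sep, predicted_ear_correlation(sep), simulated_ear_correlation(sep))
```

In this simplified picture, reverb that was decorrelated in the recording arrives at the ears strongly correlated with about 5 dB of separation and largely decorrelated with about 20 dB.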

The above arguments should not be controversial in the context of acoustic recordings but admittedly they do not perfectly apply to a recording mixed from artificially panned (often mono) stems (as would be the case of most recordings of pop music). There the image is artificial in the first place, and through a BACCH filter it would be artificially projected from a limited space between the two speakers to a 3D region of space the extent of which depends on the artificial ILD and ITD cues introduced by the panning. A purist might well object to this 3D projection as not one intended by the mixing engineer (who had no way of hearing it in 3D since the mixing was not done through XTC). It has been our experience that the vast majority of such purists are purist only theoretically. When in practice they are asked to compare the artificial 3D image through BACCH to the artificial image without it, they almost invariably prefer the former artificial image to the latter. For that you can take the words of the many listeners who have reported their impressions after days of listening, or even better come visit and listen.

Apologies for the lengthy reply and, again, thank you for adding your thoughts to this discussion.

Edgar Choueiri

oivavoi · Apr 17, 2018

Thanks for contributing here, prof. @Edgar Choueiri ! It has been very enlightening for me to read the informed discussion here. This thread has made me very eager to listen to BACCH for myself. It does look like a real breakthrough in audio. I have notified one of the dealers in my country who is typically interested in the cutting edge of audio technology, and I hope he might be interested in it.

svart-hvitt · Apr 17, 2018

Thanks, @Edgar Choueiri for contributing to my understanding of important concepts.

Crosstalk is an intriguing concept. In recent years, vinyl playback has made a comeback and become fashionable. What describes vinyl playback is primarily (?) lots of crosstalk. In my wisdom, I concluded that lots of people prefer crosstalk to higher separation of channels. I even speculated that lots of crosstalk was so much preferred that it may also explain the growth of mono (speakers). There are some intriguing reports out there (from Bob Ohlsson) that experienced mastering engineers preferred mono to stereo when listening blind.

Then comes BACCH, which - as far as I understand - takes the different, separate channel information of the source (file) to the next level. It seems you cannot use the high crosstalk of vinyl as an explanation of that format’s popularity after all. On the contrary, BACCH shows us that vinyl’s popularity is in spite of high crosstalk, not because of it!

Do you follow my reasoning or have I misunderstood something?

One last question: Would BACCH work if you used a vinyl player as a source (via an AD converter)?

Purité Audio · Apr 17, 2018

Vinyl users enjoy/suffer a whole raft of other types of distortion, not just intermodulation.
Keith

Cosmik · Apr 17, 2018

svart-hvitt said:
What describes vinyl playback is primarily (?) lots of crosstalk. In my wisdom, I concluded that lots of people prefer crosstalk to higher separation of channels.

If your speakers are ropey (typical two-way audiophile speakers would typically have break-up, beaming, phase shifts, lumpy dispersion, no time alignment, wheezing port, etc.), then high stereo separation would probably produce weird, shifting imaging. Deliberate crosstalk/mono would make things a lot easier on the ears.

svart-hvitt · Apr 17, 2018

Cosmik said:
If your speakers are ropey (typical two-way audiophile speakers would typically have break-up, beaming, phase shifts, lumpy dispersion, no time alignment, wheezing port, etc.), then high stereo separation would probably produce weird, shifting imaging. Deliberate crosstalk/mono would make things a lot easier on the ears.

Yes, most people have poor setups, which means a poor format may be what such setups «need».

But there are audiophiles with high quality setups too, who like the vinyl «thing»; where the «thing» may have to do with high cross talk.

So I basically wondered:

=> Do we have data from blind tests that show people preferring low crosstalk to high crosstalk, stereo to mono?

@Floyd Toole preferred to do speaker tests on one speaker (i.e. mono) only. Maybe he has a view on crosstalk preferences?

oivavoi · Apr 17, 2018

svart-hvitt said:
Yes, most people have poor setups, which means a poor format may be what such setups «needs».

But there are audiophiles with high quality setups too, who like the vinyl «thing»; where the «thing» may have to do with high cross talk.

So I basically wondered:

=> Do we have data from blind tests that show people preferring low crosstalk to high crosstalk, stereo to mono?

@Floyd Toole preferred to do speaker tests on one speaker (i.e. mono) only. Maybe he has a view on crosstalk preferences?

This would lead to a lot of angry reactions if I posted it on any other forum, but I feel very very convinced that the vinyl revival has to do with other factors than sound. People like the physical feel of the medium, the ritual, the exclusivity, etc. But would people actually prefer the sound of vinyl under blind conditions? I have always doubted it. Some time ago I stumbled across this study, which seems to confirm my suspicions: http://eprints.hud.ac.uk/id/eprint/27345/1/UnwinsAnalogue.pdf

svart-hvitt · Apr 17, 2018

oivavoi said:
This would lead to a lot of angry reactions if I posted it on any other forum, but I feel very very convinced that the vinyl revival has to do with other factors than sound. People like the physical feel of the medium, the ritual, the exclusivity, etc. But would people actually prefer the sound of vinyl under blind conditions? I have always doubted it. Some time ago I stumbled across this study, which seems to confirm my suspicions: http://eprints.hud.ac.uk/id/eprint/27345/1/UnwinsAnalogue.pdf

I think you are right but chose to write my questions without implying that vinyl lovers are exclusively emotion-driven.

I think we both know vinyl lovers who are very systematic in their approach to «better» sound.

Let’s try and keep the question related to crosstalk (preferences) and not pro-contra vinyl.

Floyd Toole · Apr 17, 2018

Thank you Edgar for the elaborate commentary on BACCH. It is as I had suspected, and, indeed, expected. Removing the crosstalk indeed greatly expands the directional and spatial illusions. When we demonstrated that with VMAx many years ago, it was such a contrast to spatially and directionally deprived stereo that listeners were amazed. The only issue was that it was not the art as it was created, so from that perspective it is really an "effect". Some effects can be attractive, at least some of the time.

You describe this as: "Therefore what people are reporting about when they listen to an acoustic recording played through a BACCH filter has nothing to do with phantom speakers or location of the actual speakers, but everything to do with the more correct reproduction of the sources whose spatial cues were captured in the recording." Because we are not talking about an encode-decode process, one cannot say that the reproduction is "more correct". It is what it is, and what it is may be very attractive to many listeners, but it cannot be "more correct". Stereo mixing for the bulk of music is multitrack, using isolation booths, or at the very least voices and instruments individually miked. These components are amplitude panned to various locations across the soundstage as the mixer/sound designer chooses. Spatial effects are often electronically generated - artificial - having nothing to do with real acoustical spaces. Classical recordings are a very mixed bag of methods, ranging from a coincident Blumlein pair to a pair of widely spaced omnis, to several mics placed over and within the orchestra, and others farther out in the hall. Again, these are combined in the sound design. All of these are created to sound as desired by the mixer/producer/mastering engineer while listening to a pair of loudspeakers in a small room. So, because the mix did not anticipate crosstalk-cancelled reproduction, what one hears through such a system is not "more correct" but instead the result of spatial post processing of a particular kind.

The fact that stereo is so directionally and spatially deprived means that it is not difficult to generate more entertaining illusions, and as we both have noted, several such efforts have come and gone over the years. Some involved adding more loudspeakers and others were grossly simplified attempts at crosstalk cancellation using the original stereo pair. All of them were found to be attractive to some listeners. As I note in the 3rd edition of my book, evidence suggests that spatial impressions can be comparable with timbral accuracy in overall subjective ratings of sound quality - they go together.

Ideally we want an encode-decode system, so that results are predictable from the creative artists through to the listeners. When VMAx created phantom loudspeakers at +/- 30 deg. it was acknowledging the reality of the mixing situation, but incorporating the advantage of not being able to localize the loudspeakers from which the sound originates. The sense of distance was dramatic, especially in some classical tracks. It was, as I said, probably the best stereo I have ever heard. To me, and many others, it was generally preferable to the "naked" crosstalk cancellation process which was often a bit exaggerated. Obviously that is a matter of personal taste and, inevitably, greatly dependent on the recording.

Also obvious is the fact that simple crosstalk-cancelled reproduction is necessary for binaural recordings, whether dummy-head or synthesized. When these were played the result was much more rewarding than the equivalent headphone reproduction because the dominant sound images were convincingly externalized and in front of the listener. Head tracking solves that problem, as you have commented, which is why it is used in Binaural Room Scanning.

What you and your team have created is good, but it will be better with material created deliberately for playback through it. I wish you luck with Ambisonics. As I note in both of my books, it theoretically should work best in a reflection free space. When we set up a precise listening experiment in the NRCC anechoic chamber the result was in-head localization. Some room reflections were necessary to externalize the images. It is a mathematical concept that attempts to reconstruct a sound field at a point. The presence of a head and head shadowing at the reconstruction point is a problem. Some sound from all loudspeakers is necessary at that reconstruction point. It ignores binaural hearing. I have not experienced the latest high-order versions - perhaps their signals can be processed to be more suitable to human listeners.

As you say, listeners are attracted to the expanded soundstage and sense of space that crosstalk cancellation yields, so there obviously is a market for such a product. For those who cannot accommodate a quality multichannel system, it serves a purpose. My personal preference remains in the multichannel/immersive domain, using tasteful upmixing to expand the sense of space from stereo and superb loudspeakers to reduce the tendency to localize them. But that is me, and I am not of the current generation.

Floyd Toole · Apr 17, 2018

svart-hvitt said:
@Floyd Toole preferred to do speaker tests on one speaker (i.e. mono) only. Maybe he has a view on crosstalk preferences?

The choice of monophonic listening tests was the result of elaborate tests showing that listeners were much fussier about sound quality without the complications of stereo imaging and space. Loudspeakers that won mono tests have always won stereo tests in tests run to satisfy those who think that stereo itself presents special challenges. It does, but mainly in the requirement that L and R be identical. This is explained in both of my books.

As for interchannel crosstalk, it seems that anything in excess of about 25-30 dB is adequate to satisfy human listeners. LPs barely qualify, and then only if the cartridge and its internal transducers are properly aligned. It was interesting to find, when looking into this in the '70s, that some cartridges needed to be tilted to reveal their greatest channel separation.

Edgar Choueiri · Apr 17, 2018

Floyd Toole said:
Thank you Edgar for the elaborate commentary on BACCH. It is as I had suspected, and, indeed, expected. Removing the crosstalk indeed greatly expands the directional and spatial illusions. When we demonstrated that with VMAx many years ago, it was such a contrast to spatially and directionally deprived stereo that listeners were amazed. The only issue was that it was not the art as it was created, so from that perspective it is really an "effect". Some effects can be attractive, at least some of the time.

You describe this as: "Therefore what people are reporting about when they listen to an acoustic recording played through a BACCH filter has nothing to do with phantom speakers or location of the actual speakers, but everything to do with the more correct reproduction of the sources whose spatial cues were captured in the recording." Because we are not talking about an encode-decode process, one cannot say that the reproduction is "more correct". It is what it is, and what it is may be very attractive to many listeners, but it cannot be "more correct". Stereo mixing for the bulk of music is multitrack, using isolation booths, or at the very least voices and instruments individually miked. These components are amplitude panned to various locations across the soundstage as the mixer/sound designer chooses. Spatial effects are often electronically generated - artificial - having nothing to do with real acoustical spaces. Classical recordings are a very mixed bag of methods, ranging from a coincident Blumlein pair to a pair of widely spaced omnis, to several mics placed over and within the orchestra, and others farther out in the hall. Again, these are combined in the sound design. All of these are created to sound as desired by the mixer/producer/mastering engineer while listening to a pair of loudspeakers in a small room. So, because the mix did not anticipate crosstalk-cancelled reproduction, what one hears through such a system is not "more correct" but instead the result of spatial post processing of a particular kind.

The fact that stereo is so directionally and spatially deprived means that it is not difficult to generate more entertaining illusions, and as we both have noted, several such efforts have come and gone over the years. Some involved adding more loudspeakers and others were grossly simplified attempts at crosstalk cancellation using the original stereo pair. All of them were found to be attractive to some listeners. As I note in the 3rd edition of my book, evidence suggests that spatial impressions can be comparable with timbral accuracy in overall subjective ratings of sound quality - they go together.

Ideally we want an encode-decode system, so that results are predictable from the creative artists through to the listeners. When VMAx created phantom loudspeakers at +/- 30 deg. it was acknowledging the reality of the mixing situation, but incorporating the advantage of not being able to localize the loudspeakers from which the sound originates. The sense of distance was dramatic, especially in some classical tracks. It was, as I said, probably the best stereo I have ever heard. To me, and many others, it was generally preferable to the "naked" crosstalk cancellation process which was often a bit exaggerated. Obviously that is a matter of personal taste and, inevitably, greatly dependent on the recording.

Also obvious is the fact that simple crosstalk cancelled reproduction is necessary for binaural, dummy-head or synthesized, recordings. When these were played the result was much more rewarding than the equivalent headphone reproduction because the dominant sound images were convincingly externalized and in front of the listener. Head tracking solves that problem, as you have commented, which is why it is used in Binaural Room Scanning.

What you and your team have created is good, but it will be better with material created deliberately for playback through it. I wish you luck with Ambisonics. As I note in both of my books, it theoretically should work best in a reflection-free space. When we set up a precise listening experiment in the NRCC anechoic chamber the result was in-head localization. Some room reflections were necessary to externalize the images. Ambisonics is a mathematical concept that attempts to reconstruct a sound field at a point. The presence of a head and head shadowing at the reconstruction point is a problem. Some sound from all loudspeakers is necessary at that reconstruction point. It ignores binaural hearing. I have not experienced the latest high-order versions - perhaps their signals can be processed to be more suitable to human listeners.

As you say, listeners are attracted to the expanded soundstage and sense of space that crosstalk cancellation yields, so there obviously is a market for such a product. For those who cannot accommodate a quality multichannel system, it serves a purpose. My personal preference remains in the multichannel/immersive domain, using tasteful upmixing to expand the sense of space from stereo and superb loudspeakers to reduce the tendency to localize them. But that is me, and I am not of the current generation.

Thank you Floyd for responding to my posting.

I very much stand by the claim of “more correct” (in the context of acoustic recordings played back through XTC filters that can provide high levels of XTC) for the simple reason that the claim can be easily demonstrated both with measurements and listening. Here is a simple experiment that I can do for you if you ever come to visit: Choose any stereo mic technique (for the sake of this illustration let us choose ORTF) and record someone a couple of meters away speaking and walking from dead center along an arc to the left, to about 90 degrees from the listener. From the recorded stereo signal of the source located at 90 degrees you can easily calculate the ILD, which is about 10 dB. Now, if you play this from a pair of speakers in the equilateral stereo triangle configuration (in a room where the reflected sound does not dominate) with the speakers at about 2 m from the listener, without XTC the recorded voice of that person will quickly lock into the left speaker after the source has travelled past 30 degrees. If you measure the ILD at the listener position using a dummy head, you would of course get only about 3 to 5 dB, which is the ILD of the left speaker. Now repeat the playback with a good XTC filter (or a sound barrier between the speakers) and you will hear the voice walking well past the 30 degree limit of non-XTC playback. This is because the ILD is no longer limited to the 3 dB of that speaker and can now approach the real ILD of 10 dB. This can also be ascertained by a simple ILD measurement using the dummy head, which will definitely show you that the ILD is much closer to the correct 10 dB, hence the conclusion “more correct”. If the recording was made binaurally the ILD will be very close to the real target of 10 dB. The same experiment can be repeated with spaced omnis, where you can show that the ITD is reproduced more correctly.
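The arithmetic behind this experiment can be sketched with a deliberately crude model. The assumptions here are not from the post: each ear receives its same-side speaker at full power plus the opposite speaker attenuated by the head shadow, powers are simply summed (no phase, no frequency dependence), and the function and parameter names are hypothetical. The 10 dB recorded ILD and a 4 dB speaker ILD (inside the 3 to 5 dB range quoted above) are taken from the text.

```python
import math


def db_to_power(db):
    """Convert a level in dB to a linear power ratio."""
    return 10.0 ** (db / 10.0)


def ild_at_ears(recorded_ild_db, speaker_ild_db, xtc_db=0.0):
    """ILD (dB) at the listener's ears for a source recorded toward the left.

    Toy model (an assumption, not a measurement): the far-side speaker
    leaks into each ear attenuated by the head shadow (speaker_ild_db)
    plus any extra crosstalk cancellation (xtc_db); powers are summed.
    """
    p_left = db_to_power(recorded_ild_db)  # left channel power
    p_right = 1.0                          # right channel power (reference)
    leak = db_to_power(-(speaker_ild_db + xtc_db))
    ear_left = p_left + leak * p_right
    ear_right = p_right + leak * p_left
    return 10.0 * math.log10(ear_left / ear_right)


no_xtc = ild_at_ears(10.0, 4.0)
with_xtc = ild_at_ears(10.0, 4.0, xtc_db=20.0)
print(f"ILD at ears without XTC:    {no_xtc:.1f} dB")
print(f"ILD at ears with 20 dB XTC: {with_xtc:.1f} dB")
```

In this model the ILD at the ears can never exceed the speaker's own ILD without cancellation, however large the recorded ILD is, while 20 dB of added cancellation brings it close to the recorded 10 dB. A real measurement would be frequency dependent, so the numbers only indicate the trend.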

Some people on this forum have experienced this enhancement of correctness by making their own recordings using the BACCH-BM in-ear binaural mic then playing it back through a high-XTC level BACCH filter and can reproduce someone whispering in their ear. Anyone with the BACCH-dSP software can do a similar recording with any pair of stereo mics and while the whisper may not be reproduced as close to the ear (compared to a binaural recording) it will be far closer to the 3D location of the original source than when the BACCH filter is bypassed (where instead of close to the ear, it will be locked in the speaker). The XTC reproduction is simply more correct.

Of course, other non-idealities you mentioned, such as mixing in spot mic signals and hall mics, may well degrade this spatial fidelity, but as long as the signals from such mics do not dominate or mess up the ILD and/or ITD cues captured by the main stereo mics, the above arguments hold solidly. (A vast majority of good acoustic recordings are made by engineers who know how to properly delay spot mics based on their distance from the main stereo pickup mics and mix them without messing up the stereo signals, and these recordings are simply reproduced more correctly if you use a good XTC filter.)

I am quite familiar with the capability of multi-channel playback, which can be quite pleasing and enveloping (I have set up in the past many such systems, admittedly without the level of expertise that you have in multi-speaker-multi-channel), and I am confident that you would draw the same conclusion if you agree to objectively compare it to the latest generation of XTC technology. I can assure you that no one working in AR/VR would consider regular ITU multi-channel playback suitable for spatial realism (no one in those fields is using it). By contrast, true 3D audio methods such as XTC, ambisonics or wavefield synthesis are now widely studied, developed and adopted in these fields. It is my belief and hope, as an amateur orchestra recording engineer and audiophile, that home hi-fi reproduction can also benefit from the advances in true 3D audio made in the past 5 years. Of these three 3D audio approaches, optimized XTC is best suited for the spatial reproduction of existing stereo recordings.

[Incidentally, I share your reservations about the practicality and efficacy of ambisonic reproduction through real speakers. In that context I must clarify that we (and many other researchers in the field of 3D audio) use ambisonic only as a means to code a 3D soundfield, which is then numerically mixed down to binaural through HRTF convolutions from an array of *virtual* speakers and NOT real speakers. For instance, we use an array of 20 virtual speakers in a perfect dodecahedron setup, thus bypassing the issues and imperfections of real speakers in real rooms, to say nothing of the practical difficulties of setting up an array of 20 real speakers in a sphere centered on a real listener. I doubt that Michael Gerzon had suspected that his method would find more success as a 3D soundfield coding tool in AR/VR rather than a real playback system in Hi-Fi.]
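As a rough illustration of the virtual-speaker idea, the sketch below places 20 virtual speakers at the vertices of a regular dodecahedron and computes first-order gains for one source direction. The post does not say which ambisonic order or decoder is used, so the first-order, sampling-style decoder here (gain = (1 + 3 d·u) / N) is an assumption chosen for brevity, the function names are hypothetical, and the HRTF convolution that would turn each virtual speaker feed into a binaural signal is omitted.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio


def dodecahedron_directions():
    """Unit vectors toward the 20 vertices of a regular dodecahedron."""
    pts = []
    for a in (-1.0, 1.0):
        for b in (-1.0, 1.0):
            for c in (-1.0, 1.0):
                pts.append((a, b, c))            # 8 cube vertices
            pts.append((0.0, a / PHI, b * PHI))  # 12 remaining vertices
            pts.append((a / PHI, b * PHI, 0.0))
            pts.append((a * PHI, 0.0, b / PHI))
    norm = math.sqrt(3.0)  # every vertex above lies at distance sqrt(3)
    return [(x / norm, y / norm, z / norm) for x, y, z in pts]


def virtual_speaker_gains(direction, speakers):
    """First-order gains g_i = (1 + 3 d.u_i) / N for a unit direction d."""
    n = len(speakers)
    return [(1.0 + 3.0 * sum(d * u for d, u in zip(direction, spk))) / n
            for spk in speakers]


speakers = dodecahedron_directions()
azimuth = math.radians(40.0)
source = (math.cos(azimuth), math.sin(azimuth), 0.0)
gains = virtual_speaker_gains(source, speakers)

# Gain-weighted sum of the speaker directions; for this regular layout it
# points back along the encoded source direction.
recovered = [sum(g * spk[k] for g, spk in zip(gains, speakers))
             for k in range(3)]
print(len(speakers), round(sum(gains), 6), [round(v, 6) for v in recovered])
```

Because the layout is regular, the gains sum to one and their weighted direction matches the source, which is part of why such a symmetric virtual array is convenient. Some gains are negative at first order, which is expected for this kind of decoder.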

svart-hvitt · Apr 17, 2018

@Edgar Choueiri , how would BACCH work with TV, say daily news, and film? Is it intended more for euphonic pleasure when playing back music or are advantages generally positive across the board including TV and intelligibility of speech?

Is it one-person oriented only, or should cinemas apply BACCH?

dallasjustice · Apr 17, 2018

How does crosstalk cancellation deal with low frequencies? It seems like mono low frequency amplitude would be reduced if crosstalk cancellation were applied. Is it only applied above a certain frequency? Can the filtered result be confirmed with a loopback sweep? Does anyone have in room L+R and L,R REW .mdat loopbacks they would care to share on this thread?

Is there any scientific listener preference data for ambio? I’ve just read about a lot of anecdotal data points from users and shows. Maybe it would be difficult to do a rigorous study of listener preference since ambio seems to require low directivity loudspeakers and normal 2CH playback requires excellent controlled directivity off axis for best results.

Michael.

Floyd Toole · Apr 17, 2018

Edgar Choueiri said:
I very much stand by the claim of “more correct” (in the context of acoustic recordings)

And that is precisely the problem I described - only a vanishingly small number of recordings are made with stereo mics alone - the vast majority of music, I would guess virtually all "popular" and jazz recordings, are simple multi-mic, multi-track, sampled, synthetically enhanced and pan-potted creations. Most non-classical music is "mono left, mono right and double-mono" pan-potted phantom images, including the featured artist, with some relatively uncorrelated stuff added for spatial interest. In my extensive listening to TIDAL it is painful to hear the abundant amateurish mixes, however good the music and the performances. A stereo pair of mics is even a problem with a symphony orchestra because of the inverse-square law - some instruments are too far away, so "accent" mics are employed, with artificial spatial enhancements to allow them to blend with the main mic pickup.

I go to about a dozen live classical performances a year - that is my indulgence in pure audio. It is a useful reference. However, I also enjoy a lot of new pop and jazz, and if one looks there are some truly excellent and creative studio creations, with different artists or artificial sounds placed in different acoustical spaces. Sadly, far too many are best described as subtle variations on mono. The tune, lyrics and rhythm survive but more directional and spatial imagination would be nice. It is heresy to say, perhaps, but I have been in some live performances (e.g. Sydney Opera House) after which I have commented sarcastically that I would have preferred a recording. So, there can be problems in "paradise" too.

I think I can end by quoting myself from the earlier post: "Ideally we want an encode-decode system, so that results are predictable from the creative artists through to the listeners." This necessarily includes neutral loudspeakers throughout, and that too is a huge problem. Binaurally post processing existing recordings mixed and mastered for loudspeaker reproduction is not that. Neither is upmixing stereo to a multichannel system. Both can enhance a basic stereo playback, which is why both of us have found our ways to do it. We need a different approach, but from where I sit, I cannot imagine the global audio industry changing from its pathetically obsolete two-channel habits. One can only hope.

Good luck

Dialectic · Apr 18, 2018

dallasjustice said:
How does crosstalk cancellation deal with low frequencies? It seems like mono low frequency amplitude would be reduced if crosstalk cancellation were applied. Is it only applied above a certain frequency? Can the filtered result be confirmed with a loopback sweep? Does anyone have in room L+R and L,R REW .mdat loopbacks they would care to share on this thread?

BACCH-dSP provides several options for low-frequency cutoff. The default is 94 Hz. I have not experimented with the other options and, obviously, have been very pleased using the 94 Hz option, despite my difficulties with some uneven bass (now mostly resolved).
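For readers wondering why a crosstalk canceller might treat low frequencies specially, here is a textbook-style sketch of an ideal, unregularized canceller for a symmetric two-speaker setup. It is not a description of how BACCH-dSP is implemented. The values G and TAU (the relative level and extra delay of the path to the far ear) and both function names are assumptions chosen for illustration.

```python
import cmath
import math

G = 0.985    # assumed far-ear/near-ear amplitude ratio
TAU = 68e-6  # assumed extra travel time to the far ear, in seconds


def xtc_speaker_gains_db(freq_hz, g=G, tau=TAU):
    """Return (mono_db, side_db): the level change an ideal canceller applies
    at the speakers to in-phase (L = R) and out-of-phase (L = -R) content."""
    chi = g * cmath.exp(-2j * math.pi * freq_hz * tau)  # crosstalk path
    mono = 1.0 / (1.0 + chi)  # inverse of the [1, 1] eigenvalue
    side = 1.0 / (1.0 - chi)  # inverse of the [1, -1] eigenvalue
    return 20.0 * math.log10(abs(mono)), 20.0 * math.log10(abs(side))


def ears_from_inputs(left, right, freq_hz, g=G, tau=TAU):
    """Apply the ideal canceller, then the acoustic paths; return ear signals."""
    chi = g * cmath.exp(-2j * math.pi * freq_hz * tau)
    det = 1.0 - chi * chi
    spk_l = (left - chi * right) / det  # inverse of [[1, chi], [chi, 1]]
    spk_r = (right - chi * left) / det
    return spk_l + chi * spk_r, spk_r + chi * spk_l


for f in (50.0, 500.0, 5000.0):
    mono_db, side_db = xtc_speaker_gains_db(f)
    print(f"{f:6.0f} Hz: mono {mono_db:+.1f} dB, side {side_db:+.1f} dB")
```

In this idealized model the speaker feed for mono bass is about 6 dB lower, yet the signal arriving at each ear is unchanged, because the two speakers add at the ears. Out-of-phase content, by contrast, needs a very large boost at low frequencies. That is one plausible reason to limit cancellation there, though the thread does not say whether this is what the cutoff option does.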

I can run the requested measurements via REW, but I'm not sure it can be done via loopback. Because BACCH compensates for speaker directivity and, to an extent, room reflections, my measurements would differ from those that a listener with a nearly ideal room (like you @dallasjustice) would get.

I think the focus on bass reproduction is slightly misplaced. Yes, bass reproduction is enormously important in a conventional system. While bass remains important with BACCH, it fades in relative importance because BACCH's reproduction of the space in which the recording was made is so realistic. As I said above, I'm always stunned immediately after I switch BACCH off because I'm always unpleasantly reminded what my room sounds like.

Dialectic · Apr 18, 2018

svart-hvitt said:
@Edgar Choueiri , how would BACCH work with TV, say daily news, and film? Is it intended more for euphonic pleasure when playing back music or are advantages generally positive across the board including TV and intelligibility of speech?

Is it one-person oriented only, or should cinemas apply BACCH?

Here's a response from a less august member of the forum.

BACCH is for music. It's not surround sound, though the effect is more convincing than that of any surround system I've experienced. (And why would one use BACCH to watch the news?)

BACCH is not a good candidate for cinemas because the effect is maximized in one listening position. That listening position usually is tracked via webcam, although one can set the listening position to be in the sweet spot (or anywhere else) between the two speakers. I strongly prefer webcam mode.

RayDunzl · Apr 18, 2018

svart-hvitt said:
how would BACCH work with TV, say daily news, and film?

Much TV talk is mono. Checked several news channels.

What does BACCH do for that?

Well, the commercials are often in Stereo... to varying degrees...

oivavoi · Apr 18, 2018

Further question to you BACCH users: How does the headphone thing work? I've been curious about trying out the Smyth Realiser (which apparently externalizes the acoustic scene and places it outside the head when listening to headphones). But if BACCH can accomplish the same thing in one big package it's one more argument for trying it out...

Dialectic · Apr 18, 2018

oivavoi said:
Further question to you BACCH users: How does the headphone thing work? I've been curious about trying out the Smyth Realiser (which apparently externalizes the acoustic scene and places it outside the head when listening to headphones). But if BACCH can accomplish the same thing in one big package it's one more argument for trying it out...

Yes, the headphone feature, BACCH-dHP, can do essentially what the Smyth Realiser does. One calibrates the headphone feature with one's speakers, which become spatial anchors for the sound.

The first few times I used it, I couldn't believe my speakers weren't blasting. I like the effect on most material.
