
AES Paper Digest: Differences among Several High Sampling Digital Recording Formats

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,676
Likes
241,082
Location
Seattle Area
This is another formal listening test, presented at an Audio Engineering Society convention, on differences between high resolution formats: http://www.aes.org/e-lib/browse.cfm?elib=13185

Differences of Hearing Impressions Among Several High Sampling Digital Recording Formats
Toshiyuki Nishiguchi, Kimio Hamasaki
NHK Science & Technical Research Laboratories, Tokyo, 157-8510, Japan


Presented at the 118th Convention
2005 May 28–31 Barcelona, Spain

The Abstract tells the whole story:

"To study the difference of hearing impression of recorded sound among several high sampling digital recording formats, we conducted subjective evaluation tests. Perceptual discrimination was evaluated among the following digital recording formats: 24 bit/48 kHz, 24 bit /192 kHz and DSD.

The sound reproduction system for the subjective evaluation tests was carefully designed in order to reproduce the highest quality of sound on each digital recording format. Listening panels were selected from students of a university of music, recording engineers, and musicians. Sound stimuli for the evaluation were originally recorded to have exactly the same quality of analog signal which was fed to different A/D conversion systems. The results of subjective evaluation using pair test method showed that the sound quality of the auditory frequency band in this experimental system might not depend on the sampling format."

Here is the setup:

upload_2016-3-16_17-3-16.png


In other words, this is a parallel capture of a live session into three formats: 24/48, 24/192 and 1x DSD.

Nice confirmation of the ADC/DAC bandwidth:

upload_2016-3-16_17-5-28.png


And the results? Another buzzkill:

"The six subjects completed listening tests consisting of
24 pairs on three different stimuli in which they
evaluated each pair. The results showed no significant
difference. This means that the experiments could not
identify any subject or any stimulus that might show a
significant difference among the digital recording
formats."


Note: the test lacks statistical analysis, and the paper doesn't have sufficient detail to perform one. In the vast majority of test cases, however, the percentage of correct answers was below 50, so their conclusion is likely correct.
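
For anyone wondering what such an analysis could look like if per-listener scores were available, here is a minimal sketch of a one-sided binomial test against guessing. The function name, the 24-trial count per listener and the scores are assumptions for illustration only; they are not data from the paper.

```python
from math import comb


def chance_p_value(trials: int, correct: int, p_chance: float = 0.5) -> float:
    """One-sided binomial p-value: P(X >= correct) if the listener is only guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical scores for one listener over 24 pairs (not from the paper).
    for correct in (11, 12, 16, 17, 20):
        print(f"{correct}/24 correct: p = {chance_p_value(24, correct):.4f}")
```

Under these assumptions a listener would need at least 17 of 24 correct before the result drops below the usual 0.05 threshold, so scores hovering around or below half are nowhere near evidence of an audible difference.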
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,069
Location
Zg, Cro
@amirm This paper is done in an impressive way, thank you for sharing it! Comparing its results to this paper, do you think they correlate or conflict with each other?

https://www.academia.edu/441305/Sampling_Rate_Discrimination_44.1_KHz_Vs._88.2_KHz

Edit: If I understood correctly the difference in the "Sampling Rate Discrimination.." paper was observed only with complex (orchestral) music so it would be interesting to know what kind of music material was used in the "Differences among.." paper.
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
This experiment was also published as a convention paper after that meta-review: http://www.aes.org/tmpFiles/elib/20180427/18228.pdf

This is one of the areas where I've grudgingly been forced to change my convictions. It does seem that there may in fact be a difference between cd and hi-rez that can be audible for certain people on certain equipment, even though I have been highly skeptical of that claim.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
This experiment was also published as a convention paper after that meta-review: http://www.aes.org/tmpFiles/elib/20180427/18228.pdf

This is one of the areas where I've grudgingly been forced to change my convictions. It does seem that there may in fact be a difference between cd and hi-rez that can be audible for certain people on certain equipment, even though I have been highly skeptical of that claim.
Don't give up too easily...

The paper doesn't give much detail, but says they used Matlab's resample function with an "FIR low pass filter".
https://uk.mathworks.com/help/signal/ref/resample.html

How do we know that filter was as good as it should be? It should be a SINC function, and indeed someone has developed their own SINC resampling function
https://uk.mathworks.com/matlabcentral/fileexchange/12268-sinc-resample

...which presumably suggests that the standard function isn't up to scratch when it comes to ultimate accuracy.
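
To make the "sinc" point concrete, here is a minimal sketch of a windowed-sinc low-pass design in plain Python. It is not Matlab's resample function and not the filter used in the paper; the function names, the Blackman window, the tap counts and the 0.25 cutoff (the new Nyquist when halving the sample rate, e.g. 88.2 kHz to 44.1 kHz) are all assumptions chosen for illustration.

```python
import math
from typing import List


def windowed_sinc_lowpass(num_taps: int, cutoff: float) -> List[float]:
    """Blackman-windowed sinc low-pass; cutoff is a fraction of the sample rate."""
    if num_taps < 3 or num_taps % 2 == 0:
        raise ValueError("num_taps must be odd and at least 3")
    if not 0.0 < cutoff < 0.5:
        raise ValueError("cutoff must be between 0 and 0.5")
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal (infinitely long) low-pass impulse response, truncated here.
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        phase = 2 * math.pi * n / (num_taps - 1)
        window = 0.42 - 0.5 * math.cos(phase) + 0.08 * math.cos(2 * phase)
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def gain_db(taps: List[float], freq: float) -> float:
    """Magnitude response in dB at freq (a fraction of the sample rate)."""
    re = sum(t * math.cos(2 * math.pi * freq * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(2 * math.pi * freq * n) for n, t in enumerate(taps))
    return 20 * math.log10(max(math.hypot(re, im), 1e-20))


if __name__ == "__main__":
    for num_taps in (17, 255):
        taps = windowed_sinc_lowpass(num_taps, cutoff=0.25)
        print(num_taps, "taps:",
              f"{gain_db(taps, 0.20):.2f} dB at 0.20,",
              f"{gain_db(taps, 0.30):.2f} dB at 0.30")
```

The length of the truncated sinc matters a great deal: the short filter lets far more energy through above the cutoff than the long one, which is the kind of detail a paper would need to state.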
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
You mean that the audible difference may in fact be due to the downsampling method, and that better downsampling would work better and not be audible? It's a possibility, of course.

EDIT: but if they use standard downsampling and this downsampling is audible, then their results would still have significance for the real world. If they use some exotic downsampling on the other hand, not so much.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
If they use some exotic downsampling on the other hand, not so much.
Not exotic, just 'standard'. Resampling using "a 16th order low pass filter" (or whatever they used if it wasn't sinc) is not dissimilar to the sorts of shenanigans that boutique manufacturers and amateurs play about with after their DACs, but in this case, because of the sample rate shift the results could be more audible. A modified frequency response, for example would be a dead giveaway, or if aliasing was at any level above exemplary.

(And the high res track they used should be examined by Amir for spurious strong ultrasonic content e.g. the video monitor interference he sometimes sees, in case it is causing audible IMD in the playback equipment, not to mention the aforementioned aliasing if the resampling is not as good as it could be).
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
Not exotic, just 'standard'. Resampling using "a 16th order low pass filter" (or whatever they used if it wasn't sinc) is not dissimilar to the sorts of shenanigans that boutique manufacturers and amateurs play about with after their DACs, but in this case, because of the sample rate shift the results could be more audible. A modified frequency response, for example would be a dead giveaway, or if aliasing was at any level above exemplary.

That seems like a valid point. What they should do if they wanted to take the paper further, I guess, is to display the content of the files, so others can try to determine whether there are in fact any such differences (beyond the resolution of the files) which can account for the differences heard.
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,069
Location
Zg, Cro

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
From what I read here I see no reason to doubt their method of downsampling.

https://se.mathworks.com/help/dsp/examples/designing-low-pass-fir-filters.html
I don't know who you are talking about, Krunok. Matlab obviously provides various facilities for filtering; and knowledgeable individual users know how to implement other types of filter (specifically the sinc function - windowed and truncated to what length?) even if Matlab doesn't.

The question is whether the experimenters knew what they were doing when it came to downsampling the audio. And as they use the term "low pass filter" without automatically using the 'sinc' word, it makes you wonder...
 

Krunok

Major Contributor
Joined
Mar 25, 2018
Messages
4,600
Likes
3,069
Location
Zg, Cro
I don't know who you are talking about, Krunok. Matlab obviously provides various facilities for filtering; and knowledgeable individual users know how to implement other types of filter (specifically the sinc function - windowed and truncated to what length?) even if Matlab doesn't.

The question is whether the experimenters knew what they were doing when it came to downsampling the audio. And as they use the term "low pass filter" without automatically using the 'sinc' word, it makes you wonder...

It is quite easy to make a good FIR low pass filter and I see no reason to believe there was anything wrong with it. Do you have any specific reason to doubt their filter, or do you have doubts because you think it's ok to doubt everything? Because if that is so, I can doubt if it is really you posting this or maybe somebody stole your forum credentials and now is driving everybody mad with various doubts..? :D
 

Wombat

Master Contributor
Joined
Nov 5, 2017
Messages
6,722
Likes
6,464
Location
Australia
It is quite easy to make a good FIR low pass filter and I see no reason to believe there was anything wrong with it. Do you have any specific reason to doubt their filter, or do you have doubts because you think it's ok to doubt everything? Because if that is so, I can doubt if it is really you posting this or maybe somebody stole your forum credentials and now is driving everybody mad with various doubts..? :D

Sounds like a good imitation of posts that are becoming more commonplace. :p
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
It is quite easy to make a good FIR low pass filter...
Is it?!! Well the controversy about MQA, for example, is all about the finer details of low pass filtering. Check out this article:
http://archimago.blogspot.co.uk/2018/02/musingsmeasurements-on-blurring-and-why.html

As you can see, "filter" covers a multitude of sins, and if the debate over high res versus non-high res is about the very low level details, then you can't just use the nearest filter you have to hand, which may pass levels of aliasing that are higher than the lowest levels in the signal or affect the frequency response to any significant degree. The experimenters need to be very careful about that, and give all the details.

In Linux there are resampling filters that go from total rubbish to probably quite good. You can substitute different ones, and the best ones use considerable processing power. You would clearly be a user who didn't care about that and would just go with the default (which would be pretty rubbish!).
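
As a small illustration of why leaked content matters, here is a sketch of where a tone ends up if it survives the low-pass filter and is then sampled at a lower rate. The function name and the example frequencies are assumptions for illustration only.

```python
def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a tone appears after sampling at sample_rate_hz."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    folded = tone_hz % sample_rate_hz
    # Anything above Nyquist folds back into the 0..Nyquist band.
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    # Hypothetical ultrasonic tones that an inadequate filter lets through.
    for tone in (25_000.0, 30_000.0, 40_000.0):
        print(f"{tone:.0f} Hz -> {alias_frequency(tone, 44_100.0):.0f} Hz at 44.1 kHz")
```

So ultrasonic content that is not attenuated enough before the rate change lands squarely in the audible band, which is why the stopband performance of the filter has to be stated.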
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
I do think it's appropriate to ask questions about scientific papers like this. It's not like these guys are trying to find out how billiard balls interact with each other. It's about human perception, a highly fickle subject, and it's a truism in research methodology that there are tons of potentially confounding factors in experimental designs (just do a Google search for "replication crisis in social psychology", and you will see what I mean).

In this case, reasons for doubt are for example that there are some a priori reasons for doubting the finding. Humans can't consciously hear or detect ultrasonic frequencies, so the claim that higher sample rates are important for that reason rests on the assumption that we detect such frequencies in an unconscious manner. This may be true, of course, as perception is largely subconscious, with consciousness just being the icing on the cake. But still, I remain skeptical. I see a stronger argument for hi-rez with regards to dynamic range, if one has a system and a room which is up for it. In that case, the case for hi-rez rests on mechanisms and physics which are well understood and not controversial.
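
The dynamic range point can be put in numbers with the textbook formula for an ideal quantizer driven by a full-scale sine (roughly 6.02 dB per bit plus 1.76 dB). This is a minimal sketch under that idealised assumption; the function name is my own, and real converters fall well short of the 24-bit figure.

```python
import math


def ideal_quantization_snr_db(bits: int) -> float:
    """SNR of an ideal N-bit quantizer for a full-scale sine wave, in dB."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    # 20*log10(2**bits) is about 6.02 dB per bit; 10*log10(1.5) is about 1.76 dB.
    return 20 * math.log10(2**bits) + 10 * math.log10(1.5)


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: {ideal_quantization_snr_db(bits):.2f} dB")
```

That gives about 98 dB for 16-bit and about 146 dB for 24-bit, before dither and analog noise are taken into account.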

Furthermore, there are several other blind tests which show that listeners often struggle to reliably separate 320 kbps from 16-bit, even. Here's a recent study that will be presented at an AES convention soon:
HiyhyWU.png

5yc5l.png


Which makes the claim that 16-bit can so easily be detected from hi-rez by the listeners in this test open to the objection that something may have gone wrong in the experimental design. But it could also be that they have been able to detect a real, objective and audible difference, and that it's those other blind tests which have failed to do so.
 

Jakob1863

Addicted to Fun and Learning
Joined
Jul 21, 2016
Messages
573
Likes
155
Location
Germany
This raises interesting questions.... :)

The authors were using a Matlab toolbox function (i.e. the resample function), which can have some problems with insufficient anti-imaging/anti-aliasing filters, but provides the ability to use carefully customised lowpass filters, which the authors did (a high-order, ?16th?, Butterworth lowpass filter).
I'd assume that they used another Matlab toolbox function to develop the lowpass, but there is no further information about the actual process.

That these questions are discussed illustrates how difficult it is to rule out every imaginable variable in the experimental process; up to now I've seen these arguments/recommendations about the proper way to provide the same music samples at different sample rates:

-) two ADCs (same model) set to the respective sample rates while converting the signal, to avoid any impact of a (perhaps) flawed resampling software algorithm
-) using high resolution material with a resampling software algorithm, to avoid any impact of a (perhaps) flawed implementation in the hardware

The second argument is/was even maintained, as specifications/measurements are usually provided for the ADCs, while that luxury often isn't available for software.
If the source code is available that surely might help, but it isn't always obtainable.

But again, it depends on the research question that is under examination. Is it to find out about possible flaws in today's production methods, or is it to explore the abilities of human listeners (or even both)?

It is getting increasingly difficult to get the needed information by measurements, as the traditional black box approach (assuming an LTI system works inside, provided that it is used within its limits) might fail in the case of complex software implementations that may or may not be processing signals in a nonlinear manner.
And getting all this needed information for published experiments is even more difficult, although sometimes additional web resources are available.
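
One way to probe the black box assumption is a superposition check: feed the system a weighted sum of two inputs and compare the result with the weighted sum of the individual outputs. The sketch below uses two made-up systems (a plain gain stage and a hard clipper) purely as stand-ins; it tests linearity only, not time invariance, and passing it for one pair of inputs proves nothing in general.

```python
from typing import Callable, List, Sequence

System = Callable[[Sequence[float]], List[float]]


def obeys_superposition(system: System, x1: Sequence[float], x2: Sequence[float],
                        a: float = 0.5, b: float = 2.0, tol: float = 1e-9) -> bool:
    """True if system(a*x1 + b*x2) matches a*system(x1) + b*system(x2)."""
    combined_input = [a * u + b * v for u, v in zip(x1, x2)]
    output_of_sum = system(combined_input)
    y1, y2 = system(x1), system(x2)
    sum_of_outputs = [a * u + b * v for u, v in zip(y1, y2)]
    return all(abs(p - q) <= tol for p, q in zip(output_of_sum, sum_of_outputs))


def gain_stage(x: Sequence[float]) -> List[float]:
    """Hypothetical linear system: a fixed gain of 0.8."""
    return [0.8 * s for s in x]


def hard_clipper(x: Sequence[float]) -> List[float]:
    """Hypothetical nonlinear system: clips everything beyond +/-1.0."""
    return [max(-1.0, min(1.0, s)) for s in x]


if __name__ == "__main__":
    x1 = [0.1, 0.5, -0.9]
    x2 = [0.3, -0.7, 0.6]
    print("gain stage linear?", obeys_superposition(gain_stage, x1, x2))
    print("hard clipper linear?", obeys_superposition(hard_clipper, x1, x2))
```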

Btw, due to these difficulties the ITU emphasizes the need for perceptual evaluation ........


Edit: it's an "LTI system"
 

Wombat

Master Contributor
Joined
Nov 5, 2017
Messages
6,722
Likes
6,464
Location
Australia
I do think it's appropriate to ask questions about scientific papers like this. It's not like these guys are trying to find out how billiard balls interact with each other. It's about human perception, a highly fickle subject, and it's a truism in research methodology that there are tons of potentially confounding factors in experimental designs (just do a Google search for "replication crisis in social psychology", and you will see what I mean).

In this case, reasons for doubt are for example that there are some a priori reasons for doubting the finding. Humans can't consciously hear or detect ultrasonic frequencies, so the claim that higher sample rates are important for that reason rests on the assumption that we detect such frequencies in an unconscious manner. This may be true, of course, as perception is largely subconscious, with consciousness just being the icing on the cake. But still, I remain skeptical. I see a stronger argument for hi-rez with regards to dynamic range, if one has a system and a room which is up for it. In that case, the case for hi-rez rests on mechanisms and physics which are well understood and not controversial.

Furthermore, there are several other blind tests which show that listeners struggle to reliably separate 320 kbps from 16-bit, even. Here's a recent study that will be presented at an AES convention soon:
HiyhyWU.png

5yc5l.png


Which makes the claim that 16-bit can so easily be detected from hi-rez by the listeners in this test open to the objection that something may have gone wrong in the experimental design. But it could also be that they have been able to detect a real, objective and audible difference, and that it's those other blind tests which have failed to do so.


Casting doubt is easy. Backing up doubts in a credible way is more difficult.
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
Casting doubt is easy. Backing up doubts in a credible way is more difficult.

But I just backed up my "doubt" by two very specific points, didn't I? Let's repeat:
1) That humans are not able to consciously detect ultrasonic frequencies
2) The existence of other studies/blind tests which show difficulty distinguishing 320 kbps from 16-bit, and 16-bit from hi-rez
 

Wombat

Master Contributor
Joined
Nov 5, 2017
Messages
6,722
Likes
6,464
Location
Australia
But I just backed up my "doubt" by two very specific points, didn't I? Let's repeat:
1) That humans are not able to consciously detect ultrasonic frequencies
2) The existence of other studies/blind tests which show difficulty distinguishing 320 kbps from 16-bit, and 16-bit from hi-rez

Forgot to add 'General comment'. Sorry. o_O
 

oivavoi

Major Contributor
Forum Donor
Joined
Jan 12, 2017
Messages
1,721
Likes
1,940
Location
Oslo, Norway
Forgot to add 'General comment'. Sorry. o_O

General comment: When someone cites a comment, and adds a comment of their own, it is easy for the first commenter to perceive the second comment as related to his own comment, even though it is prefixed by the words "general comment". Just a general comment, of course.
 