
What would it mean to use Bayesian statistics in listening tests?

Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,524
Likes
37,057
http://nautil.us/issue/74/networks/the-flawed-reasoning-behind-the-replication-crisis

I could point to other articles, but this is one I happened to see today.

I've long thought that, except at the very beginning of testing some area of knowledge, 5% significance (roughly two sigma) leads to too many false positives. In managing manufacturing quality, 3 sigma seems to give much better control over a process; 2 sigma actually worsened quality when applied to manufacturing in the early 20th century. And some sciences move past 5% fairly soon. Physics is very far past it, requiring 5 sigma results before considering something like the Higgs boson to be found. So I've thought audio testing should use 3 sigma myself.
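
For concreteness, here is a minimal sketch (Python, assuming scipy is available) of how those sigma levels map onto tail probabilities. Note that the 5% convention corresponds to roughly 2 sigma two-sided, while particle physics quotes the one-sided 5 sigma tail:

[code]
# Tail probabilities for the sigma thresholds discussed above.
from scipy.stats import norm

for sigma in (2, 3, 5):
    one_sided = norm.sf(sigma)  # sf = 1 - cdf (upper tail)
    two_sided = 2 * one_sided
    print(f"{sigma} sigma: one-sided p = {one_sided:.2e}, "
          f"two-sided p = {two_sided:.2e}")

# 2 sigma: two-sided p ~ 4.6e-02 (roughly the 5% convention)
# 3 sigma: two-sided p ~ 2.7e-03
# 5 sigma: one-sided p ~ 2.9e-07 (the particle-physics standard)
[/code]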

None of that deals with things as a Bayesian statistician would. So I wonder if anyone has any thoughts on or knowledge of the idea. I'm no expert, and I understand only in crude terms how to properly apply Bayesian statistics.

I also found this book interesting on cause and effect.
https://www.amazon.com/Book-Why-Sci...k+of+why&qid=1564814480&s=digital-text&sr=1-1

Maybe this is of some interest to @svart-hvitt . It isn't epistemology, but a look at how the way science is done can go wrong and lead to false explanations.
 

boXem

Major Contributor
Audio Company
Joined
Jun 19, 2019
Messages
2,014
Likes
4,853
Location
Europe
Blumlein 88 said: (opening post quoted in full)
Thanks for the links; they were an interesting read.
In reaction to what you wrote: modern manufacturing uses 6 sigma for QA.
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,524
Likes
37,057
Thanks for the links; they were an interesting read.
In reaction to what you wrote: modern manufacturing uses 6 sigma for QA.
Yes, you are right. In the 1920s, some manufacturing plants tried 2 sigma and quality got worse, becoming almost chaotic. Switching to 3 sigma improved results, and the process of manufacturing complex goods became stable. Over time, even better quality was achieved once better control was obtained.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Very interesting, and by just dropping "gut feel" into the calculation, all kinds of questions could be resolved scientifically:
Maybe we’re not so dogmatic as to rule out “The Thinker” hypothesis altogether, but a prior probability of 1 in 1,000, somewhere between the chance of being dealt a full house and four-of-a-kind in a poker hand, could be around the right order of magnitude...
...Putting this claim into Bayes’ theorem with our prior probability assignment would yield:
(12p) * (1/1,000) vs. (p) * (999/1,000)
The resulting probability is then quoted to three decimal places!

Don't get me wrong: I'm more inclined towards this than towards relying on statistics whose required assumptions probably weren't met in the first place.
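
For reference, the arithmetic behind the quoted expression works out as follows (a sketch in Python; the common factor p cancels out of Bayes' theorem, which is why no point value for it is ever needed):

[code]
# Posterior for "The Thinker" given a 1-in-1,000 prior and data that is
# 12 times more likely under that hypothesis (the factor p cancels).
prior_h1 = 1 / 1000
prior_h0 = 999 / 1000
likelihood_ratio = 12

posterior_h1 = (likelihood_ratio * prior_h1) / (
    likelihood_ratio * prior_h1 + prior_h0)
print(f"{posterior_h1:.3f}")  # 0.012 -- the three-decimal figure in question
[/code]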
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,524
Likes
37,057
Cosmik said: (post quoted in full above)
I agree; the three decimal places caught my eye as well.

The House TV episode where House and Cuddy are on a plane from Malaysia (?) and everyone starts getting sick is a nice dramatization of such thinking. Among other things, House concludes a woman is pregnant. He is, as always, very sure; and, as is common, in the end quite wrong.

The simplest example of how Bayesian thinking helps is, as in the linked article, medical testing. But if one then tries to determine how to set the priors, you end up in a big blank space. The book I linked to is interesting because it is about an emerging method for applying Bayesian statistics properly, both to plot a path forward and to know that you've done it correctly. In the medical test example, how do you know all the prior statistics needed to apply it? How should you proceed if you don't have that information, or it isn't available?
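
For anyone who hasn't seen it, the standard worked example looks like this (all numbers hypothetical, just to show the mechanics):

[code]
# Screening for a rare condition: even a good test mostly returns
# false positives when the prior (prevalence) is low.
prevalence = 0.001          # P(condition): 1 in 1,000
sensitivity = 0.99          # P(test positive | condition)
false_positive_rate = 0.05  # P(test positive | no condition)

p_positive = (sensitivity * prevalence
              + false_positive_rate * (1 - prevalence))
p_condition = sensitivity * prevalence / p_positive
print(f"P(condition | positive) = {p_condition:.3f}")  # ~0.019
[/code]

The hard part, as said, is where that 0.001 comes from when no reliable prevalence data exist.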

I can give examples of how Bayesian thinking is probably something like what we informally do in our decision making, but again, without some additional rigor you can't get far. You do end up with "gut feel". Gut feel can work, and it can definitely blind you or lead you down wrong paths. Done right, Bayesian thinking is more like how we normally make decisions.
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
@Blumlein 88 , are you out of your mind? Asking me a question of epistemological interest and expecting a quick and satisfying answer in return?

;)

I tried to keep it short, but couldn’t...

So I will give my 2 cents first and leave an addendum at the bottom.

2 CENTS
Putting the world into a model is dangerous. Islam advises not to depict the wonder of God at all, for good reason!

It takes a master to use models of the world. The trustworthy master must be driven by skill, benevolence, integrity and honesty.

As soon as incentives start to interfere, you can often say goodbye to benevolence, integrity and honesty.

In other words: incentives are the root cause of model problems, and they bias what comes out.

And remember: in many cases where humans are involved, humans may also alter the studied phenomenon because of feedback effects, which complicates the picture beyond any standard model.

- - - - - - - ADDENDUM - - - - - - -

Read this on the use, misuse and abuse of statistics. It’s the American Statistical Association warning against its own profession.

https://www.amstat.org/asa/files/pdfs/P-ValueStatement.pdf

Long version:

https://www.tandfonline.com/doi/pdf/10.1080/00031305.2016.1154108

An earlier critique, a classic, by esteemed David A Freedman:

http://psychology.okstate.edu/faculty/jgrice/psyc5314/Freedman_1991A.pdf

Excerpt (my bolding):

«At present, regression models are used to make causal argument in a wide variety of social applications, and it is perhaps time to evaluate the results.

A crude four-point scale may be useful:
1. Regression usually works, although it is (like anything else) imperfect and may sometimes go wrong.
2. Regression sometimes works in the hands of skillful practitioners, but it isn’t suitable for routine use.
3. Regression might work, but it hasn’t yet.
4. Regression can’t work.

Textbooks, courtroom testimony, and newspaper interviews seem to put regression into category 1. Category 4 seems too pessimistic. My own view is bracketed by categories 2 and 3, although good examples are quite hard to find».

Finance, where arguably the smartest people on the planet work because that’s where the money is, has gotten it wrong for decades. Beta as a representation of risk was defined in the 1960s, but high beta stocks have always had lower returns than low beta stocks; i.e. the exact opposite of what the model said.

Wesley C Mitchell, a famous economist of his time, pointed out 100 years ago that market prices didn't fit a standard statistical model, but his life's work suffered a lethal attack when Tjalling Koopmans - a mathematician turned economist at a time when economists started seeking the prestige of the natural sciences - wrote «Measurement without theory» as a response to Mitchell's work. Mandelbrot made the same argument as Mitchell more than half a century later, but still nobody takes notice (well, read Akerlof (2019) for a more nuanced perspective on economists criticizing their own caste).

In recent years, the president of The American Finance Association wrote about a «factor zoo» in finance, where statistics were applied to detect factors in market prices that, common sense says, cannot all be there:

https://faculty.chicagobooth.edu/john.cochrane/research/papers/discount_rates_jf.pdf

To his credit, Eugene Fama, recipient of the so-called Nobel prize in economics for his work on factors, has said that he never uses the term «statistically significant» when expressing himself in journal articles.

This is what happens when fools get tools:

«We were seeing things that were 25-standard deviation moves, several days in a row».
David Viniar, CFO of Goldman Sachs during the Great financial crisis

Bertrand Russell, whom Wikipedia describes as a philosopher, logician, mathematician, historian, writer, essayist, social critic, political activist, and Nobel laureate, said:

«Science is what you know; philosophy is what you don't know».

I guess that's why philosophers are hated in modern times: a philosopher will focus on what's not in the model, which can be costly to, say, Goldman Sachs if the firm were to pay the true cost of its operations.
 

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,437
Likes
4,686
The simplest example of how Bayesian thinking helps is, as in the linked article, medical testing. But if one then tries to determine how to set the priors, you end up in a big blank space.

I am a bit irritated by the constant reuse of the prevalence/false-positive/false-negative example in contexts that seem to imply that modern medicine is only now beginning to accept that type of reasoning. Here is a picture of page 8 of my copy of volume 1 of the 11th edition (1987) of Harrison's Principles of Internal Medicine, basically the 3000+ page bible med students had to eat and sleep with for a couple of years (on top of other things to do ;) ). That very small chapter already offered a decent, dispassionate summary of what various blogs/books/articles present as paradigm shifts today...

[Image: page 8 of Harrison's Principles of Internal Medicine, 11th edition (1987), volume 1]


Generally, in that context, I'd say that the initial prior(s) would be defined by the manufacturer's supplied data for the test (on the test side) and existing epidemiological data (on the prevalence side). The manufacturer's data would be the result of a process that started in a chemical/immunological/physical journal and went through lab and then clinical testing. At that stage, "frequentist" stats would dominate, as they would later on the QC side of test production. Then the field performance would be assessed (as in "does the test performance degrade after 3 months in storage, or at high temperature, or with real-world interactions"), the initial conditions would be reassessed (did the prevalence of the disease change now that we have a test), and that usually leads to a re-evaluation of the priors. In fact, a lot of predictive models rely on live prior updating, sometimes in real time (biosensors), but that's a long story...
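
As a sketch of what that live updating looks like in the simplest conjugate case (the starting prior and the batch data below are made up):

[code]
# Beta-Binomial updating of a test's field false-positive rate:
# the posterior after each batch becomes the prior for the next.
alpha, beta_ = 2.0, 38.0  # prior belief: FP rate around 2/(2+38) = 5%

batches = [(3, 97), (1, 99), (6, 94)]  # (false positives, true negatives)
for fp, tn in batches:
    alpha += fp
    beta_ += tn
    print(f"posterior mean FP rate: {alpha / (alpha + beta_):.3f}")
[/code]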

So, yes, frequentists and Bayesians will keep pushing for their fields, creating - for the audience's benefit - an opposition that doesn't really exist in practice.

As far as the replicability crisis in science is concerned: yes, statistics can be misused, we all know that; yes, the wrong approach can be chosen either intentionally (bias) or accidentally. But the main problems in the life sciences and semi-soft fields (not physics/maths/etc.), imho, are the pressure to publish for tenured scientists, which leads to a ton of publications showing a result (any result is ok ;) ), either through p-hacking or just by running enough tests or combining enough numbers to come up with something significant, and industry-induced bias through funding.

Not sure what Bayesian reasoning can bring to the audio testing table tbh.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
So, yes, frequentists and Bayesians will keep pushing for their fields, creating - for the audience's benefit - an opposition that doesn't really exist in practice.

I agree that not much of such opposition exists in areas of practice where the consequences of significantly suboptimal decisions may be significantly detrimental: e.g. losing an election, lots of money, a battle, a war, or a life.

In other areas - and this includes home audio reproduction - the consequences are not as drastic. Thus the bulk of the buying public finds less accurate modes of reasoning, including non-probabilistic reasoning, acceptable.
As far as the replicability crisis in science is concerned: yes, statistics can be misused, we all know that; yes, the wrong approach can be chosen either intentionally (bias) or accidentally. But the main problems in the life sciences and semi-soft fields (not physics/maths/etc.), imho, are the pressure to publish for tenured scientists, which leads to a ton of publications showing a result (any result is ok ;) ), either through p-hacking or just by running enough tests or combining enough numbers to come up with something significant, and industry-induced bias through funding.

There is a comparable pressure to sell for audio gear manufacturers, with a comparable urge to publish results by running enough tests or combining enough numbers and testimonials to come up with something that a buyer would deem significant enough to overcome the buyer's natural fear of industry-induced bias through funding.
Not sure what Bayesian reasoning can bring to the audio testing table tbh.

IMHO, any awareness of alternative modes of reasoning is beneficial. So many audiophiles are stuck in non-probabilistic true-or-false land! Then there are those who have elevated their thinking to the frequentist mode, yet are not aware of higher modes, such as the Bayesian.

And Bayesian is not the highest mode either: I usually use second-order probabilistic reasoning, not because I believe it is always the best to use, but because it usually provides the desired tradeoff between deep-enough and fast-enough thinking.
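
One way to read "second-order" here, purely as an illustration (this is not necessarily Sergei's actual method): instead of carrying a single number for a probability, carry a distribution over that probability.

[code]
# First order: "P(listener prefers A) = 0.67".
# Second order: a distribution expressing uncertainty about that 0.67.
from scipy.stats import beta

belief = beta(a=8, b=4)  # hypothetical state of knowledge
print(f"mean P(prefers A) = {belief.mean():.2f}")  # 0.67
print(f"90% credible interval: {belief.ppf([0.05, 0.95]).round(2)}")
[/code]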
 

PierreV

Major Contributor
Forum Donor
Joined
Nov 6, 2018
Messages
1,437
Likes
4,686
IMHO, any awareness of alternative modes of reasoning is beneficial. So many audiophiles are stuck in non-probabilistic true-or-false land! Then there are those who have elevated their thinking to the frequentist mode, yet are not aware of higher modes, such as the Bayesian.

100% agree. Stumbling on that concept when I was still a relatively young lad was one of the "oooooh" moments of my life.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA

My Amazon records show that I bought Judea's two classic books on the subject in 2008:
https://www.amazon.com/gp/product/0521773628
https://www.amazon.com/gp/product/1558604790
Reading them affected my reasoning quite a bit.
Thanks for making me aware of the popular edition!

Events in Judea's family are of note too:
https://en.wikipedia.org/wiki/Daniel_Pearl
https://en.wikipedia.org/wiki/Mariane_Pearl

Some conspiracy theorists say that Judea revealing higher modes of thinking to the world, which heretofore were supposedly a part of the world's elite secret knowledge, was the result of his desire to rid the world of the lower modes of thinking.

The theory goes that some highly-placed people didn't like that. Thus the brutal murder of Judea's son: to send a message to Judea and others. However, such an interpretation might not be overly credible, in the context of the very framework Judea revealed.

Public foolishness is generally beneficial to the elites: it is easier to rule a populace prone to systematic errors in judgement. However, in the 21st century it also became detrimental in many ways.

The elites want quicker advancement toward long-sought practical immortality and other technological miracles, which, it appears, will be developed faster if the intellectual potential of mankind is utilized more fully.

In any case, Judea Pearl's teachings are something very much worth being aware of.
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
THE GRAMMAR OF STATISTICS

@Sergei aired the notion that the benefit of Bayesian statistics - or other modes of statistics - is that it broadens one's perspective and thus increases one's skepticism. I share that opinion wholeheartedly.

One related thing about math and statistics that I never understood is some people's need to draw a distinction between math and statistics on one side and language and prose on the other. My point is, it quickly becomes obvious whether a person follows the logic of math and statistics in his writing or speech.

Since we're talking about statistics: it reflects a lack of understanding - or, best case, sloppiness - if you talk about a study which «shows that a is bigger than b». The use of the past tense, «the study showed that a was bigger than b», reflects what statistics is about: a careful, yet incomplete, examination of the world.

Take a look at how people, even scientists, use past or present tense. You'd be surprised how often the present tense is used by professionals - maybe to give the impression that their study has greater general relevance than it really has - instead of the correct past tense, which would reveal the study's limits.

So there’s statistics (and math) in common language too. Awareness is key.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
@Sergei aired the notion that the benefit of Bayesian statistics - or other modes of statistics - is that it broadens one's perspective and thus increases one's skepticism.

I guess a more accurate description would be that using higher modes of reasoning increases the number of factors and possible outcomes considered. In some cases this could be interpreted as increasing skepticism; in others, as increasing optimism.

Also, "broadening one's perspective" has positive connotations, yet "focusing one's attention" has positive connotations too. And once again, using higher modes of reasoning is sometimes interpreted as "broadening", and sometimes as "focusing".

I mentioned it before, yet let me repeat it so that there is no ambiguity: using higher modes of reasoning must be balanced in order to be beneficial. The higher the mode, the higher the data-gathering and computational loads, and the higher the required tolerance of uncertainty.

Lower modes of reasoning may quickly yield excellent results when applied under the right circumstances: your computer likely does that billions of times every second. Higher modes of reasoning may result in analysis paralysis: too much thinking when a practical situation is relatively simple.
 

Sergei

Senior Member
Forum Donor
Joined
Nov 20, 2018
Messages
361
Likes
272
Location
Palo Alto, CA, USA
So there’s statistics (and math) in common language too.

I agree. To me, the key indicators of a person using higher modes of reasoning are meticulous attention to the particulars of a situation and tolerance of the uncertainties of a conclusion.

An unfortunate phenomenon occurs when people using lower modes of reasoning try to interpret conclusions reached by people via higher modes of reasoning, e.g. some audiophiles wrapping their heads around conclusions outlined in Mr. Toole's seminal Sound Reproduction book.

We are very fortunate to have Mr. Toole as a contributor to this forum, so that he can employ the forum's "feedback loop", ensuring that what he wrote is properly understood.

I'm also grateful to have Amir as an owner/moderator, especially because of his meticulousness. Even so, at times I feel that more tolerance for interpreting his conclusions as having higher levels of uncertainty could be beneficial.
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
CONTEXT AND HOLISM?

In my school days, I remember Bayesian statistics as some sort of puzzle where you got all the bits to combine into a coherent picture. The problem with those school lessons was that you never needed to question where the data came from; they were taken as given.

In real life, Bayesian thinking can be more subtle, though sometimes it is plainly visible. For example, when somebody says:

«Life expectancy is 80 years. I am 79 and expect to die next year. I have cancelled my yearly birthday party and booked my funeral».

(Nobody would actually say this, because people's intuition about life expectancy is better than their intuition in less familiar situations.)
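
To make the conditioning error concrete, here is a toy calculation with a made-up survival curve (the numbers are illustrative only; under this curve, life expectancy at birth is about 79):

[code]
# Life expectancy at birth conditions on nothing; the 79-year-old
# should condition on having survived to 79.
import numpy as np

ages = np.arange(0, 111)
survival = np.exp(-(2e-5 / 0.1) * (np.exp(0.1 * ages) - 1))  # S(age)

e_birth = survival.sum()                              # ~ expected age at death
e_given_79 = 79 + survival[80:].sum() / survival[79]  # conditional version
print(f"expected age at death, at birth:  {e_birth:.0f}")     # ~79
print(f"expected age at death, if now 79: {e_given_79:.0f}")  # ~87
[/code]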

What Bayesian thinking adds, which I think is important from a philosophical perspective, is context. Theories may be useless in certain contexts, and data from one context may not generalize.

Additionally, this avenue of thinking opens up our ability to question why people don't use the same tools and theories across fields while at the same time being big proponents of a «general theory». You can't have a general theory if you use different tools and theories depending on the situation, can you? So one has to take a stand: am I a believer in the grand general theory, or am I rather an «it depends» kind of person?
 

DonH56

Master Contributor
Technical Expert
Forum Donor
Joined
Mar 15, 2016
Messages
7,835
Likes
16,498
Location
Monument, CO
I recently had to deal with Bayesian stats to develop a test for a system using forward error correction. The system can handle a certain number of errors, with up to a certain number in a row, to provide a corrected bit error rate better than 1e-15 from a raw error rate of 1e-9. I learned a lot, made my brain hurt, wrote a bunch of test scripts, then mostly forgot it.

The relevance that popped into my little pea brain is the ability to ignore single errors, whereas several in a row are more detectable, as are repeated single errors if there are too many of them. It reminds me of the time I spent long ago setting up all sorts of audio tests, and just how hard it was to find suitable source material to highlight the difference(s) I wanted to test.
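
For what it's worth, the frequentist half of that calculation is just a binomial tail. A sketch with made-up block parameters (n and t are hypothetical, not the actual system), and note the i.i.d. assumption ignores exactly the burst errors that make the real problem hard:

[code]
# Residual error rate for a block code that corrects up to t random
# bit errors per n-bit block, at a given raw bit error rate.
from scipy.stats import binom

raw_ber = 1e-9
n, t = 4096, 2  # hypothetical block length and correction capability

p_block_fail = binom.sf(t, n, raw_ber)  # P(more than t errors per block)
print(f"P(uncorrectable block) ~ {p_block_fail:.1e}")  # ~1.1e-17
[/code]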
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
Sergei said: (post on Judea Pearl's books quoted in full above)

@Sergei ,

Did Judea write a handful of articles that sum up what's in his books? Sometimes a book is no more than an ornate representation of what the author had to say (and sometimes the book is still the best medium).
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,524
Likes
37,057
@Sergei ,

Did Judea write a handful of articles that sum up what's in his books? Sometimes a book is no more than an ornate representation of what the author had to say (and sometimes the book is still the best medium).
You should at least read the popular book I listed; you may then wish to get the others. I think Sergei will agree.
 

tr1ple6

Active Member
Joined
Mar 19, 2016
Messages
253
Likes
275
Blumlein 88 said: (opening post quoted in full)
I'm a huge proponent of Bayes' theorem, as you can see from my profile pic. Bayes' theorem is self-correcting: say you start out with too many false positives; that gained knowledge means you update your prior probabilities for the next round of tests.

Here is a good read on how to properly apply Bayes' theorem: https://www.amazon.com/Proving-Hist.../dp/1616145595/?ie=UTF8&tag=richardcarrier-20
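
That loop is easy to show in miniature (a sketch; the likelihood ratios below are hypothetical stand-ins for real listening-test data):

[code]
# The posterior after each round of tests becomes the prior for the next.
def update(prior, likelihood_ratio):
    """Bayes' theorem in odds form: posterior odds = prior odds * LR."""
    odds = (prior / (1 - prior)) * likelihood_ratio
    return odds / (1 + odds)

belief = 0.5  # start agnostic about "there is an audible difference"
for lr in (3.0, 0.4, 2.5):  # one likelihood ratio per round of tests
    belief = update(belief, lr)
    print(f"P(audible difference) = {belief:.2f}")  # 0.75, 0.55, 0.75
[/code]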
 

svart-hvitt

Major Contributor
Joined
Aug 31, 2017
Messages
2,375
Likes
1,253
Just a thought here concerning the robustness of some competently managed listening tests.

Say you read about a survey where 60 percent preferred a to b, and 40 percent preferred b to a.

What you didn't know before you checked the footnotes of the article is that there were 100 participants in the survey, of which 30 preferred a to b and 20 preferred b to a; 50 participants were kept out of the survey statistics.

Before you decide to call the author of the article, you see that there are two possible explanations for why the 50 participants were removed from the survey:

1) 50 participants heard a difference between a and b but couldn't make up their minds about which was better.
2) 50 participants didn’t hear a difference between a and b.

Is it ok to say that 60 percent preferred a to b, or is the real number 30 percent?
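
Working the numbers (a quick sketch, assuming a recent scipy for binomtest; the significance check on the 30-vs-20 split goes beyond the question asked, just for context):

[code]
from scipy.stats import binomtest

preferred_a, preferred_b, excluded = 30, 20, 50

print(f"among deciders: {preferred_a / (preferred_a + preferred_b):.0%}")  # 60%
print(f"among everyone: {preferred_a / 100:.0%}")                          # 30%

# And is 30 vs. 20 even a reliable split among the 50 deciders?
p = binomtest(preferred_a, preferred_a + preferred_b, 0.5).pvalue
print(f"two-sided binomial p-value: {p:.2f}")  # ~0.20
[/code]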
 