
The Effects of MP3 Compression on Perceived Emotional Characteristics in Musical Instruments

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
This is amazing! It's probably not even pseudoscience.

What are the criteria for publication by the AES?
 

amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,658
Likes
240,908
Location
Seattle Area
Glanced at the paper and it takes the prize for the most bizarre paper published in the Journal. In this day and age no one cares about low-bit-rate MP3, as it is not used anywhere (AAC and its enhanced versions are what is used). I suspect there are also language translation issues with respect to the words/emotions they say they tested.
 
OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,759
Likes
37,612
I do have to ask: why did they do this research? Nothing other than maybe cell phones uses such low bit rates.
 

Keith_W

Major Contributor
Joined
Jun 26, 2016
Messages
2,658
Likes
6,059
Location
Melbourne, Australia
What are you guys complaining about? Are you complaining about the question, or their scientific method?

I have absolutely no problem with them asking the question. I think it's a valid question - you listen to music for its emotional content, so you want to know if the file you are delivered is effective in reproducing the content, or not.

My beef is the small sample size and the lack of published p-values. The 95% CIs do seem to be pretty consistent, but it would be nice to know whether the results reached statistical significance. The other beef is with some basic grammar errors which should not appear in a published article in an academic journal, e.g. "To see which emotional categories were strengthen or weakened by MP3 compression" on p. 863.
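For what it's worth, here is the sort of minimal analysis I would have expected to see, sketched in Python with completely made-up ratings (the numbers are illustrative, not from the paper):

    # Hypothetical 1-5 emotion ratings for the same excerpt, uncompressed vs
    # heavily compressed MP3. Fabricated data, purely to show the analysis.
    from scipy import stats

    uncompressed = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
    compressed   = [3, 3, 4, 2, 4, 3, 3, 4, 2, 3]

    # Paired t-test, since the same listeners rated both versions
    t, p = stats.ttest_rel(uncompressed, compressed)
    print(f"t = {t:.2f}, p = {p:.4f}")  # the p-value the paper never reports

Even this crude test would tell the reader more than a bare plot of overlapping confidence intervals.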

If you are going to criticize articles published in the JAES, you should perhaps start with their sloppy statistics. I am not a subscriber to the JAES, but the few articles I have read from that publication make me wonder how scientific they really are. To me, it seems as if they commonly conduct studies and then draw conclusions without the rigorous evidence and analysis to back them up.

Mind you, you commonly find this type of study in some medical publications. The authors know that it will be the abstract and conclusion that get read and reported. Very few will actually go through the paper to identify flaws in the methodology and come up with a more nuanced interpretation of what the data actually shows. As a result, it is common for conclusions to make all sorts of claims which are not necessarily supported by the data. Better journals have peer review processes that prevent this from happening, but the same rigour does not always apply at smaller journals.

Asking the question is one thing; designing an experiment to answer the question is another. Failing to provide rigorous statistical analysis of your results effectively makes the paper meaningless. For me, I would say: "Interesting question. Pity about the experiment and analysis."
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Are you complaining about the question, or their scientific method?
Everything, on so many levels!

We could discuss whether the scientific method can answer questions like "Is there a God?" or "How do I know that everyone else is conscious?". But we could also imagine a smaller example:

I wish to know, scientifically, which is the most beautiful guitar studio effect. In order to do so, I will harness the scientific method most rigorously. I record samples of 58 renowned guitar players playing a variety of compositions. I put these on the internet and invite people to listen to them randomly, passed through one of 98 randomly-selected guitar effects. 1.6 million people respond to my request, and a clear winner appears - statistically significant! Science has answered the question, and allows me to build and market "The world's most beautiful guitar effect pedal". Not only that, but by rearranging my results I may also have in my possession the information to establish "The world's most beautiful chord change", "The world's most beautiful note" and "The world's most beautiful tempo".

Using similar methods, it is only a short step to finding the most beautiful musical composition, the funniest film, the best colour, and the luckiest number.

But... if I ran the test in 1998, I might get one answer, and if I ran it in 2016 I would get another. I haven't demonstrated that, if I increased the number of effects to 102, I wouldn't get a different answer. Or with a different selection of guitarists and tunes. Or that if people listened at different volumes, or used different types of headphones, or did the test under laboratory conditions, or in winter rather than summer, or were offered monetary payment for participation in the experiment, or we demanded that listeners had musical qualifications, or... etc. etc.

Such an experiment may follow the scientific method to the letter, but it is still rubbish. The problem is fundamental: asking people what is beautiful/emotional/heroic/calming/meaningful/humorous/spiritual/stimulating etc. is not science. All of those things are completely fluid and dependent on the context of the subject's entire life. Simple 'fashion' can give you your "statistical significance" and then a completely opposite answer a few months later. To think that we can pin down human taste/culture/emotion objectively is to think that we can understand and predict consciousness.
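To make the sample-size point concrete, here is a toy simulation (hypothetical numbers, nothing to do with any real pedal) showing how 1.6 million votes turn a practically meaningless 'fashion' edge into resounding statistical significance:

    # Toy simulation: 1.6 million votes between two effects, with a trivially
    # small true preference (50.2% vs 49.8%) standing in for mere fashion.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    prefers_a = rng.random(1_600_000) < 0.502

    # Two-sided binomial test against a 50/50 null hypothesis
    result = stats.binomtest(int(prefers_a.sum()), n=prefers_a.size, p=0.5)
    print(result.pvalue)  # vanishingly small, yet the 'effect' is a 0.2% edge

The arithmetic is impeccable; the conclusion is still rubbish.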
 
Last edited:

Keith_W

Major Contributor
Joined
Jun 26, 2016
Messages
2,658
Likes
6,059
Location
Melbourne, Australia
Cosmik, it sounds as if you are unaware of all the studies on aesthetics. If you open a cosmetic surgery journal, you will find all sorts of studies about what constitutes beauty, based on rating the aesthetic preferences of the study population. I even once saw a study on nipple position in breast reconstructions: the author trawled through magazines featuring nude models, noted the position of the nipple, and generated a statistical model of where best to place it.

We also routinely do studies on subjective phenomena using standardized tools. The entire field of academic psychiatry is based on the study of subjective phenomena - like emotions/calming/spiritual/stimulating which you have just casually dismissed. In my own discipline, we use a "quality of life" scale that measures the impact of our treatment on the subjective well-being of the study population. After all, you want to know if making a dying patient live longer actually worsens their quality of life.

Then, there is academic psychology. Do you know that people study what kind of environments, what kind of music, what smells, and which types of colour are perceived as more soothing/exciting/disturbing? The next time you walk into a department store, you should realize that everything in that store - the lighting, the colours, the music, the uniforms, and in fact the whole environment - has been designed with the help of psychologists to put you in a state of mind conducive to shopping. And yes, I have seen the studies.

I will say again - the question is valid. Maybe it doesn't seem valid to some narrow-minded engineering folk who have not read much outside their own academic disciplines (if they are academic at all), but there are entire fields of academia devoted to studying these things.

What is not valid is the actual method they used. The small sample size, lack of controls, absence of a power calculation, and missing statistical analysis mean that the actual strength of the data is questionable at best. They then go on to write a conclusion as if they have proved their point. They haven't.
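A power calculation takes one line, which makes its absence all the more glaring. A minimal sketch (the effect size here is my guess; the paper offers nothing to base one on):

    # How many listeners per group are needed to detect a medium effect
    # (Cohen's d = 0.5) at alpha = 0.05 with 80% power?
    from statsmodels.stats.power import TTestIndPower

    n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
    print(round(n))  # ~64 per group, for an independent-samples design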

Incidentally, questions like "Is there a God?" cannot be directly answered by science (you cannot prove a negative), but some "effects" of God can be indirectly studied. There have been a number of studies in medical journals on the power of intercessory prayer in intensive care units, where the patient is unconscious. The conclusion: the primary and secondary endpoints for the study and control groups showed no statistically significant difference, although admittedly the statistical power of most of these studies was questionable.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Cosmik, it sounds as if you are unaware of all the studies on aesthetics.
Just because people study these things doesn't make it science, or meaningful.

Human behaviour is not fixed, and not objective - as political pollsters and economists have found out recently. No doubt many people have profited in earlier times by taking the advice of economists, but this is not proof that economists are right or scientific. Economists may merely state the obvious or dress up their own intuitions and prejudices as science. They may be lucky for decades at a time. Until they aren't.

There was a programme this week about finding "The world's most similar twins who are strangers". They used several 'scientific' methods to do it:
  • computer analysis of 2D photographs
  • computer analysis of 3D scans
  • public assessment of side-by-side photographs
  • public assessment of the people in the flesh
Guess what? Each method gave a different result. No doubt the computerised algorithms could be tweaked and 'weighted' to give the same answer as the humans, which would work - until it didn't for the next batch.

A simple observation as to why the AES-published experiment is meaningless:
For many people these days, the sound of MP3 artefacts is a direct reminder of their youth, and all the emotional baggage that that carries with it. If you are going to ask them about something as nebulous and ephemeral as the "emotional" content of sound, then your results are going to be influenced by this. Ditto the first record they bought, the theme tune of the latest blockbuster movie and so on. All of these things are beyond the control of the experiment.
 

Phelonious Ponk

Addicted to Fun and Learning
Joined
Feb 26, 2016
Messages
859
Likes
216
I wish to know, scientifically, which is the most beautiful guitar studio effect. In order to do so, I will harness the scientific method most rigorously. I record samples of 58 renowned guitar players playing a variety of compositions. I put these on the internet and invite people to listen to them randomly, passed through one of 98 randomly-selected guitar effects. 1.6 million people respond to my request, and a clear winner appears - statistically significant! Science has answered the question, and allows me to build and market "The world's most beautiful guitar effect pedal". Not only that, but by rearranging my results I may also have in my possession the information to establish "The world's most beautiful chord change", "The world's most beautiful note" and "The world's most beautiful tempo".

Too many unknowns. Namely, what guitar and what amp. But if those answers are a Telecaster and a Vibrolux, the answer to the first question is a Fulltone OCD. :) And the world's most beautiful chord change is G major with an added D an octave above open D, to D with a B added in the bass, to C major 7, to G... otherwise known as the opening of Crazy Love. Just saying...

Tim
 

Keith_W

Major Contributor
Joined
Jun 26, 2016
Messages
2,658
Likes
6,059
Location
Melbourne, Australia
Human behaviour is not fixed, and not objective - as political pollsters and economists have found out recently. No doubt many people have profited in earlier times by taking the advice of economists, but this is not proof that economists are right or scientific. Economists may merely state the obvious or dress up their own intuitions and prejudices as science.

Just like AES members, you mean. They stuck their heads in the sand for years and insisted that digital was perfect, until they found that it wasn't. To this day there are some flat-earthers who cling to dogma and insist that bits is bits.

I will point out that many so-called objectivists seem to be stuck in the 1980s and have ignored most of the findings of psychoacoustics. I read recently that one reason people think vinyl sounds better might be that it has more crosstalk than digital. Sure enough, if you install a VST plugin and increase the crosstalk, you will find that the music takes on a more relaxing sound. How do you think people found that out? By dissecting cochleas and measuring speakers?
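You can mock the effect up yourself in a few lines; a rough sketch of what such a plugin does (the -20 dB bleed figure is just my guess at a vinyl-like level):

    # Crude stereo crosstalk: bleed each channel into the other at a fixed
    # level. left/right are numpy arrays of samples at the same rate.
    import numpy as np

    def add_crosstalk(left, right, bleed_db=-20.0):
        g = 10 ** (bleed_db / 20)  # convert dB to linear gain
        return left + g * right, right + g * left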

Then there is the work done by Earl Geddes. This has been extensively debated in audio forums, but in case you missed it: the traditional way of quantifying THD is meaningless, because even large amounts of low-order harmonic distortion are less damaging than small amounts of high-order harmonic distortion. You don't learn this from dissecting cochleas either. You get it from asking people what their preferences are.
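As a toy illustration of the idea (emphatically NOT Geddes' actual metric, just a sketch of why equal-weight THD misleads):

    # Plain THD sums harmonic power with equal weight; a perception-minded
    # metric might weight each harmonic's power by, say, the fourth power of
    # its order (an arbitrary choice, purely for illustration).
    def plain_thd(harmonics):
        # harmonics[k] = amplitude of harmonic of order k + 2 (fundamental = 1)
        return sum(a * a for a in harmonics) ** 0.5

    def order_weighted(harmonics):
        return sum((k + 2) ** 4 * a * a for k, a in enumerate(harmonics)) ** 0.5

    low_order  = [0.05, 0.02]                 # strong but benign 2nd/3rd
    high_order = [0, 0, 0, 0, 0.005, 0.005]   # faint but nasty 6th/7th

    print(plain_thd(low_order) > plain_thd(high_order))            # True
    print(order_weighted(low_order) > order_weighted(high_order))  # False

Plain THD calls the low-order case nearly eight times worse; the order-weighted score flips the ranking, which is closer to what listeners report.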

If this is "Audio SCIENCE Review" it seems as if a disturbingly large number of members here aren't scientific at all. Instead, anything that conflicts with their preconcieved notions is dismissed out of hand. I have met quite a few engineering types who cling on to their precious and perfect world of measurements and computer simulations and forget that we live in the real world - where people from other scientific disciplines may be asking questions you never considered, and use different methods to what you are used to.

For many people these days, the sound of MP3 artefacts is a direct reminder of their youth, and all the emotional baggage that that carries with it. If you are going to ask them about something as nebulous and ephemeral as the "emotional" content of sound, then your results are going to be influenced by this. Ditto the first record they bought, the theme tune of the latest blockbuster movie and so on. All of these things are beyond the control of the experiment.

Yes, that is a valid criticism of the study.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,250
Likes
17,191
Location
Riverview FL
I'll take "science" over "Religion"...

I try to stay on the rails here. Occasionally I'll get confoosed.

I don't have a strong affinity for so-called Scientific Results built on aggregating, quantifying, and statistically verifying tests that query for "Opinion" or other "Personal Experiences" or "Personal Preferences".

YMMV.

Mine apparently does.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Just like AES members, you mean. They stuck their heads in the sand for years and insisted that digital was perfect, until they found that it wasn't. To this day there are some flat-earthers who cling to dogma and insist that bits is bits.

I will point out that many so-called objectivists seem to be stuck in the 1980s and have ignored most of the findings of psychoacoustics. I read recently that one reason people think vinyl sounds better might be that it has more crosstalk than digital. Sure enough, if you install a VST plugin and increase the crosstalk, you will find that the music takes on a more relaxing sound. How do you think people found that out? By dissecting cochleas and measuring speakers?

Then there is the work done by Earl Geddes. This has been extensively debated in audio forums, but in case you missed it: the traditional way of quantifying THD is meaningless, because even large amounts of low-order harmonic distortion are less damaging than small amounts of high-order harmonic distortion. You don't learn this from dissecting cochleas either. You get it from asking people what their preferences are.

If this is "Audio SCIENCE Review", it seems as if a disturbingly large number of members here aren't scientific at all. Instead, anything that conflicts with their preconceived notions is dismissed out of hand. I have met quite a few engineering types who cling to their precious and perfect world of measurements and computer simulations and forget that we live in the real world - where people from other scientific disciplines may be asking questions you never considered, using methods different from those you are used to.
I really have no objection to anyone deciding that a VST plugin sounds nicer to them than the raw signal. Where I become sceptical is when they say they have demonstrated it scientifically, with the implication that it is generally applicable to everyone and all systems (just as the AES experimenters above seem to think their findings will lead to a better compression system).

I would say that unless you can argue convincingly that what you have found is definitely not influenced by novelty, fashion, familiarity and many other human factors beyond the control of the experiment, then it is not an objective, scientific finding. This could mean that many audio questions cannot be answered by science. I don't have a problem with that.

My view is extremely simple: yes, bits are bits (sorry!), and the system should reproduce the recording as accurately as possible. I do not mind if this is never proved scientifically.

Clearly, if a lossy compression scheme has had to be used, then someone, somewhere, decided on a system, made some measurements using a selection of music, and presumably did some sort of listening trial to fine-tune it. I don't dispute that this would be useful, but I don't think the results are real 'science' (I have no problem calling it an 'Engineering Report'). There are too many 'aesthetic' variables in play. Repeat the tests at a different volume, or with my own Spotify playlist compared to yours, and we might get different results, and so on. Maybe a new style of music comes along that always triggers audible artefacts with the existing compression scheme, and it has to be changed. And there is always the possibility of the inverse, i.e. that the compression scheme actually influences the development of music. Such are the pitfalls of dealing with human taste.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,250
Likes
17,191
Location
Riverview FL
So I'm listening to a CD, and imagining it as alternatively compressed.

I look at my choices:

[attached image: upload_2016-12-11_22-44-9.png]


"None of the above" is all I get.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,250
Likes
17,191
Location
Riverview FL
Now I'm listening to an unobtainable recording downloaded from YouTube as MP4: 43 MB for 46:37 of music.
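Back-of-envelope bitrate, assuming 43 decimal megabytes:

    # Average bitrate of a 43 MB file holding 46:37 of audio
    size_bits = 43e6 * 8
    seconds = 46 * 60 + 37             # 2797 s
    print(size_bits / seconds / 1000)  # ~123 kbps average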

It cuts off at 12 kHz, and has continuous 16.8 Hz and 33.4 Hz tones at about 65 dB.

It's almost certainly a vinyl rip, although I haven't noticed any incriminating evidence (I'm not scrutinizing it). The recording is from 1976.

Once again I can't pick an adjective from the list to describe this compressed version. It doesn't give me any reason to hate vinyl either, so that conversation doesn't resonate with me.

[attached image: upload_2016-12-11_23-41-10.png]
 
Last edited:

dallasjustice

Major Contributor
Joined
Feb 28, 2016
Messages
1,270
Likes
907
Location
Dallas, Texas
Peer review is broken, IMO. Here's a study where fewer than a third of nine major errors were detected during peer review (and training the peer reviewers had little effect):
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2586872/
 

Speedskater

Major Contributor
Joined
Mar 5, 2016
Messages
1,644
Likes
1,366
Location
Cleveland, Ohio USA
But the study asks the question:
When the compression quality is reduced to the point that differences are readily noticeable, what emotions are felt?
We are only two weeks from 2017; in this day and age there is no need for such low-quality compression.
 

OP
Blumlein 88

Grand Contributor
Forum Donor
Joined
Feb 23, 2016
Messages
20,759
Likes
37,612
Was the peer review study peer reviewed?


This recursion reminds me of two things. The unfunny one is the question of who is watching the watchers. The funny one is the opening credits to Monty Python's "Holy Grail": a different case of trying to get to the bottom of who was responsible for something, and whether they were really responsible or not.

 
Joined
Jan 3, 2017
Messages
29
Likes
8
@Blumlein 88: The paper you first linked here may have some flaws in methodology, and confident statistical inference from simple word associations with such a limited number of participants is perhaps questionable, but their initial question is interesting for other reasons. Do envelope (and other) distortions from different playback media have a quantifiable emotional result? Since the subjects of this study might not have sufficient learned skills or descriptive vocabulary regarding audio qualities, reducing responses to (presumably ubiquitous) emotionally linked words may be a shorthand way of establishing some sort of baseline physiological response. Further study would be needed here.

@Others: The question is valid. Science should not be restricted to what is immediately perceivable as useful, so the choice of low-resolution MP3s is not irrelevant, particularly if the purpose was to demonstrate a perceptible difference in response to the stimuli. Although the paper's bibliographical citations are extensive, the lack of descriptive detail regarding experimental controls, regard for unintentional variable influence, etc. makes me question the underlying rigor. If the underlying questions were sociological or psychological in nature, why were no sociologists or psychologists invited to help structure the experimental design? This may be evidence of a sort of technological arrogance/schism that is possibly commonplace in the realm of Computer Science, I'm afraid.

@Cosmik: Do you have a problem with the "soft sciences", such as sociology, criminology, psychology, etc. as such, or merely with the findings presented in the paper? I understand the uneasiness many people have with those fields, due to the unintuitive nature of the statistics required and the difficulty of producing factual absolutes from their studies. That does not invalidate their efforts, however; it merely points out the relative immaturity of those fields and, particularly, of their audiences.

@RayDunzl: Apparently the peer review paper was not peer reviewed, based on the questionable caveats listed at the end of the paper!
 