
The Effects of MP3 Compression on Perceived Emotional Characteristics in Musical Instruments

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
@Cosmik: Do you have a problem with "Soft Sciences", such as sociology, criminology, psychology, etc. as such, or merely with the presented findings in the paper? I understand the uneasiness many people have with those fields, due to the un-intuitive nature of the mathematics of statistics required, and the difficulty in providing factual absolutes from their studies. This does not invalidate their efforts, however, it merely points out the relative immaturity of their fields, and particularly, their audiences.
Science and art do not mix. Science can be used to measure physiological responses in the right circumstances, but asking people for their emotional responses to little snippets of 'art' is just silly. If science can genuinely measure the 'emotion' of musical snippets (it can't), then science can be used to compose emotional music. Have you ever heard computer-generated music, or computer-generated stories, or computer-generated jokes?

Could this be the team's next project?
"Jokes often evoke a sensation of humour in humans. We decided to play recordings of several jokes over a stereo system. We wanted to see whether various DSP algorithms would affect the humorousness rating, with the ultimate aim of designing the world's funniest stereo system. Participants were asked to rate each joke on a scale of one to ten, where one indicated a slight tickle, and ten indicated a full strength belly laugh. Surprisingly there was little correlation between perceived humour coefficient and algorithm, although a speeded up chipmunk voice gave a slight uptick for a few minutes and then mysteriously went down again. We found a strong relationship whereby repeated listening to a joke resulted in a downgrading of the humour rating. The reasons are unclear. Further research is needed."

As I mentioned above, 'the sound of MP3' itself might evoke emotional responses in people of a certain age who associate that sound with their youth. If the research showed that MP3 apparently boosted the emotional response to sounds, would that then be an indication that MP3 is 'better' than no compression?

More such nonsense:
http://metro.co.uk/2016/07/14/this-...rld-according-to-scientific-research-6007019/
 
Joined
Jan 3, 2017
Messages
29
Likes
8
Science and art do not mix.
Unless run through a Neve console :D Seriously though, art and science do mix, but usually it is true that the results of scientific inquiry lead to products or techniques which are then applied to art or art-making. One could argue that art that is made in antipathy to science is in fact "mixing" with it, at least in a philosophical sense.

Can science measure the emotional effect of any stimulus? We are possibly on the brink of answering a small subset of these types of questions (for example, with fMRI as an evolving tool to this end), so I would answer: "possibly, soon".

I have heard computer-generated music, and although some of it was interesting, much of it did not engage me as much as other music types with direct human output. However, this is mainly a problem of proper artistic implementation, as a computer is merely another tool in the rack awaiting artfully inspired integration.

<< computer-generated jokes?

Could this be the team's next project?
"Jokes often evoke a sensation of humour in humans. We decided to play recordings of several jokes over a stereo system. We wanted to see whether various DSP algorithms would affect the humorousness rating, with the ultimate aim of designing the world's funniest stereo system. Participants were asked to rate each joke on a scale of one to ten, where one indicated a slight tickle, and ten indicated a full strength belly laugh. Surprisingly there was little correlation between perceived humour coefficient and algorithm, although a speeded up chipmunk voice gave a slight uptick for a few minutes and then mysteriously went down again. We found a strong relationship whereby repeated listening to a joke resulted in a downgrading of the humour rating. The reasons are unclear. Further research is needed.">>

I think you just made one;)
 

watchnerd

Grand Contributor
Joined
Dec 8, 2016
Messages
12,449
Likes
10,408
Location
Seattle Area, USA
You can download the open-access AES paper here:

http://www.aes.org/e-lib/browse.cfm?elib=18523

They used 32 kbps, 56 kbps and 112 kbps. Wow!

My anger rating would be pretty high at 32 kbps I think.

Based on those bit rates, I thought this paper was going to be 5-10 years old. I was surprised it's a 2016 paper.

I don't see why any findings at those bit rates, for MP3 (already an old codec), are very relevant to pushing the envelope of audio in 2017.
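To put those bit rates in perspective, here is a rough sketch of the compression ratios involved. It assumes the reference is CD-style PCM (16-bit, 44.1 kHz, stereo), which is my assumption and not something stated in the paper; the function and constant names are made up for illustration.

```python
# Assumption: the uncompressed reference is 16-bit, 44.1 kHz, 2-channel PCM (CD audio).
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps


def compression_ratio(mp3_kbps: float, reference_kbps: float = CD_BITRATE_KBPS) -> float:
    """Return how many times smaller the MP3 stream is than the reference stream."""
    return reference_kbps / mp3_kbps


# The three bit rates mentioned in the post above.
for rate in (32, 56, 112):
    print(f"{rate} kbps -> about {compression_ratio(rate):.1f}:1")
```

If the test stimuli were mono rather than stereo, halve the reference rate and the ratios halve with it.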
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Seriously though, art and science do mix, but usually it is true that the results of scientific inquiry lead to products or techniques which are then applied to art or art-making. One could argue that art that is made in antipathy to science is in fact "mixing" with it, at least in a philosophical sense.

Can science measure the emotional effect of any stimulus? We are possibly on the brink of answering a small subset of these types of questions (for example, with fMRI as an evolving tool to this end), so I would answer: "possibly, soon".

I have heard computer-generated music, and although some of it was interesting, much of it did not engage me as much as other music types with direct human output. However, this is mainly a problem of proper artistic implementation, as a computer is merely another tool in the rack awaiting artfully inspired integration.
Well perhaps another angle is that genuine science should be (or aspire to be) timeless and universal. As soon as we bring 'culture' into the equation then we give up on that. Sure, it may really be that a muted trumpet brings out certain emotions in everyone because it resembles the sound of a crying baby - or whatever. But it is also the case that what 'scientists' assume is an emotional response to the sound is really a cultural response to a very famous song from a popular movie that, for several generations of westerners, links the sound with humour, for example. Even if a person hasn't seen the movie, they have learned what the sound means from other people's responses to the sound, or its use in other pieces of music. It isn't a universal response, but a simplistic hypothesis and experiment would "suggest" that it was.

Here's an example where something that was assumed to be far more innate than music may turn out to be just as culturally based:
http://www.sciencemag.org/news/2016...ncluding-fear-may-not-be-universal-we-thought
“The implications here are really big,” he says. “It strongly suggests that at least these facial behaviors are not pancultural, but are instead culturally specific.”...

...Based on his research, Russell champions an idea he calls “minimal universality.” In it, the finite number of ways that facial muscles can move creates a basic template of expressions that are then filtered through culture to gain meaning. If this is indeed the case, such cultural diversity in facial expressions will prove challenging to emerging technologies that aspire to decode and react to human emotion, he says, such as emotion recognition software being designed to recognize when people are lying or plotting violence....

...Despite agreeing broadly with the study’s conclusions, Fridlund doubts it will sway hardliners convinced that emotions bubble forth from a common fount. Ekman’s school of thought, for example, arose in the post–World War II era when people were seeking ideas that reinforced our common humanity, Fridlund says. “I think it will not change people’s minds. People have very deep reasons for adhering to either universality or cultural diversity.”...

The second quoted paragraph is the one that has practical implications: that people are planning all kinds of machines based on the 'science' (naive assumptions dressed up as science) of facial cues. Goodness knows where that will lead, but it could resemble Minority Report!

I accept that the only way to refine certain technologies such as lossy compression is to ask people what they are experiencing. The basic technology may be based on genuine science, but it needs fine tuning by listening to 'representative' music. The results, however, shouldn't be labelled as science, but are more akin to a focus group exercise. A publisher may use focus groups to ascertain the most appropriate typefaces for different genres of books using a rigorous 'scientific method', but they wouldn't pretend that the results are science.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,201
Likes
16,983
Location
Riverview FL
In this day and age no one cares about low bit rate MP3 as it is not used anywhere (AAC and enhanced versions is what is used).

upload_2017-1-4_6-7-35.png

https://www.shoutcast.com/Search
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
Something that makes the same point I was making above. Human response to MP3 artefacts can't be separated from "cultural" phenomena.
https://news.slashdot.org/story/09/03/11/153205/young-people-prefer-sizzle-sounds-of-mp3-format

"Jonathan Berger, a professor of music at Stanford, tests his incoming students each year by having them listen to a variety of recordings which use different formats from MP3 to ones of much higher quality, and he reports that each year the preference for music in MP3 format rises. Berger says that young people seemed to prefer 'sizzle sounds' that MP3s bring to music because it is a sound they are familiar with. 'The music examples included both orchestral, jazz and rock music. When I first did this I was expecting to hear preferences for uncompressed audio and expecting to see MP3 (at 128, 160 and 192 bit rates) well below other methods (including a proprietary wavelet-based approach and AAC),' writes Berger. 'To my surprise, in the rock examples the MP3 at 128 was preferred. I repeated the experiment over 6 years and found the preference for MP3 — particularly in music with high energy (cymbal crashes, brass hits, etc) rising over time.' Dale Dougherty writes that the context of the music changes our perception of the sound, particularly when it's so obviously and immediately shared by others. 'All that sizzle is a cultural artifact and a tie that binds us. It's mostly invisible to us but it is something future generations looking back might find curious because these preferences won't be obvious to them.'"
 
Joined
Jan 3, 2017
Messages
29
Likes
8
Well perhaps another angle is that genuine science should be (or aspire to be) timeless and universal. As soon as we bring 'culture' into the equation then we give up on that. Sure, it may really be that a muted trumpet brings out certain emotions in everyone because it resembles the sound of a crying baby - or whatever. But it is also the case that what 'scientists' assume is an emotional response to the sound is really a cultural response to a very famous song from a popular movie that, for several generations of westerners, links the sound with humour, for example. Even if a person hasn't seen the movie, they have learned what the sound means from other people's responses to the sound, or its use in other pieces of music. It isn't a universal response, but a simplistic hypothesis and experiment would "suggest" that it was.

So, if the result of the study was to give us more insight into what was cultural vs. what was physiological, wouldn't that have clear implications for both science and art? What the facial expression study you linked to may also have exposed was that the Soft Sciences have a long way to go regarding result replication, validation, and proper (timeless) interpretation of the data.

What's not commonly known is that since the Reagan era, funding for Soft Sciences has been dramatically reduced. One possible repercussion is a reduced ability to apply scientific understanding to issues of culturally-based behavior and diversity, which may have led us to national policies that inadvertently encouraged some of the very worst (and most expensive) issues our society now faces: economic depression, terrorism, violence, crime, and reductions in support for art-making. Ironically, we used Hard Sciences (including technology) to make better weapons, rather than improve the attitudes (and understanding) of those who would pull the trigger.

<<The second quoted paragraph is the one that has practical implications: that people are planning all kinds of machines based on the 'science' (naive assumptions dressed up as science) of facial cues. Goodness knows where that will lead, but it could resemble Minority Report!>>
Exactly my point in the paragraph above: Better bullets do not equal more righteous reasons for pulling the trigger! Soft Science would be best poised to give us better understanding of what was righteous to begin with.

<<'All that sizzle is a cultural artifact and a tie that binds us. It's mostly invisible to us but it is something future generations looking back might find curious because these preferences won't be obvious to them.'>>
That conclusion may be supported by the data, but it also may be a result of one (possibly) overlooked variable not included in J. Berger's viewpoint: that increased distortions, more stimulating timbres, and increased loudness may be favored by younger members of any culture- which if true would suggest a physiological basis for these preferences.

The overarching point that I'm making is that because the human auditory system is incredibly complex, only a mix of scientific views can hope to unravel the variables: physiological responses (understanding revealed by Hard Science) vs. cultural responses (nuances uncovered via Soft Science).
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
That conclusion may be supported by the data...
I would just home in on that statement as the nub of it. I think that there are things that "data" cannot reveal e.g. Does marriage make you live longer? You are never going to get a 'scientific' answer to that one. The only way to isolate human nature from nurture is through some extremely long term and unethical procedures!
 
Joined
Jan 3, 2017
Messages
29
Likes
8
You are never going to get a 'scientific' answer to that one. The only way to isolate human nature from nurture is through some extremely long term and unethical procedures!
So, if by 'scientific' you mean one which does not utilize the science of statistical mathematics, I agree. Central to the study of quantum physics is the use of statistical methodology, that which describes elementary particles and their behavior not in terms of exactly one thing or another, but the probability of their attributes. It is through this understanding of probabilities that greater power was given to the technology of electronics, and many, many other scientific inquiries. The fields of Hard Science have clearly benefited from the use of statistical mathematics, and these math techniques have become the primary tools utilized by Soft Science.

All science is improved/vetted by replication (a form of statistical verification), and therefore requires seemingly long term efforts to condense simpler truths from nearly infinite variables. That one can imagine unethical procedures being necessary in order to achieve understanding about the question of nurture vs. nature says more about our current social dilemma- that we don't yet know enough about ourselves (personally and collectively) to definitively answer the question of what is or is not ethical. Soft Sciences, I argue, are necessary tools for examining these issues and their complexities, yet are currently culturally de-emphasized.

@watchnerd: Pushing the envelope of audio also requires that we understand how we react to all audio, not just the more favored compression techniques. The greater/more impressive technologies are developed in response to these fundamental questions.
 

Cosmik

Major Contributor
Joined
Apr 24, 2016
Messages
3,075
Likes
2,180
Location
UK
So, if by 'scientific' you mean one which does not utilize the science of statistical mathematics, I agree. Central to the study of quantum physics is the use of statistical methodology, that which describes elementary particles and their behavior not in terms of exactly one thing or another, but the probability of their attributes. It is through this understanding of probabilities that greater power was given to the technology of electronics, and many, many other scientific inquiries. The fields of Hard Science have clearly benefited from the use of statistical mathematics, and these math techniques have become the primary tools utilized by Soft Science.

All science is improved/vetted by replication (a form of statistical verification), and therefore requires seemingly long term efforts to condense simpler truths from nearly infinite variables. That one can imagine unethical procedures being necessary in order to achieve understanding about the question of nurture vs. nature says more about our current social dilemma- that we don't yet know enough about ourselves (personally and collectively) to definitively answer the question of what is or is not ethical. Soft Sciences, I argue, are necessary tools for examining these issues and their complexities, yet are currently culturally de-emphasized.
Ultimately, it boils down to the thorny subject of consciousness. Science can make simple measurements of stuff like hitting your knee with a hammer and measuring the reaction, but it is a delusion to believe that it can pin down and predict what is going on in the conscious mind. The conscious mind is even aware of the results of any 'science', fed back to it via the media, and will alter its behaviour accordingly - as just one example of why science can never separate nature from nurture unless it creates an impossibly complex and unethical Truman Show type scenario.
 
Joined
Jan 3, 2017
Messages
29
Likes
8
<<Ultimately, it boils down to the thorny subject of consciousness. Science can make simple measurements of stuff like hitting your knee with a hammer and measuring the reaction, but it is a delusion to believe that it can pin down and predict what is going on in the conscious mind. The conscious mind is even aware of the results of any 'science', fed back to it via the media, and will alter its behaviour accordingly - as just one example of why science can never separate nature from nurture unless it creates an impossibly complex and unethical Truman Show type scenario.>>

Ah, yes, the thorny problem of consciousness. It could be argued that the conversation we're having right now is a result of the evolution and expansion of consciousness. Are society and all its inventions (including language and therefore science) locked forever in a feedback loop whereby the results of a generation's discoveries further our understanding of reality but evade the basic questions that consciousness requires answers to, such as why we are here? Quite possibly.
We as a species have suffered much hardship in our history, and a good deal of that suffering initially came from the environment (the moon is a harsh mistress). One of the ways we developed was to adopt a strategy of nurture; to support that end, language and all other societal inventions are, at their basic level, intended to improve our lot in life and our ability to survive as well as possible.
We may be at a crossroads in our evolution, whereby the largest remaining source of our suffering is ourselves, in which case where will our consciousness take us, and what tools will be required to take us there? I remain hopeful (illogically so, perhaps) that art and science both can be utilized as tools for our future existence.

I had forgotten about the Truman Show movie, thanks for reminding me about it. As an ironic example from that source: the final way in which Truman escapes his trap (which his consciousness required) is by way of a boat which his wardens unwittingly provided. If by analogy we compare his wardens to collective consciousness, Truman to an individual's consciousness, and the boat to science (the ability to accomplish things in physical reality); it would seem that his captors were doomed to failure in detaining him for the very reason that his (singular) consciousness was finally freed as a direct result of the wardens' perfection of this oppressive illusion, to the ultimate (and ironic) benefit of collective consciousness!
 

fas42

Major Contributor
Joined
Mar 21, 2016
Messages
2,818
Likes
191
Location
Australia
Bill, how familiar are you with the Auditory Scene Analysis research field?
 
Joined
Jan 3, 2017
Messages
29
Likes
8
Hello Frank, I had not heard of Bregman's theory until you brought it up (just Googled some basics about it). I'm still rather naive regarding many psychoacoustic discoveries, but being an autodidact I've either read about or personally experienced evidence that I believe supports the basic theoretical tenets of Auditory Scene Analysis, such as the Cocktail Party Effect, and the effect and interaction of different elements of sound on their correlate properties such as JND, temporal or harmonic fusion, etc. It's a very interesting theory (or set of hypotheses); what do you think of it?
 

fas42

Major Contributor
Joined
Mar 21, 2016
Messages
2,818
Likes
191
Location
Australia
Cheers, Bill - yes, John Kenny introduced me to the subject, and I find it provides explanations of many behaviours of systems I've dealt with over the years - so, I'm rather partial to it. There's a thread in this forum titled exactly that, with lots of discussion and links to the vigorous research still taking place - check it out!
 
Joined
Jan 3, 2017
Messages
29
Likes
8
The following link may be of interest to those wishing to see a more recent and in-depth survey article regarding studies of topics related to Auditory Scene Analysis:
http://journal.frontiersin.org/article/10.3389/fnins.2014.00293/full
The phenomena being studied are of relevance to this discussion topic, and many other subjects on this forum, such as ABX testing, dynamic compression mechanisms of the cochlea, perception of differences in sample rates, etc.
 
Joined
Jan 3, 2017
Messages
29
Likes
8
Thanks Frank, (your post came in just as I was typing the link above). I'll check out that other forum topic immediately.
 
Joined
Jan 3, 2017
Messages
29
Likes
8
Just a quick post to point out an online primer on psychoacoustics with relevant data regarding JND of loudness being approximately 1 dB, under conditions of mid-range frequency, mid-level SPL, and "normal" noise shape:
http://acousticslab.org/psychoacoustics/PMFiles/Module04.htm
Relevant to this thread- If JND for Loudness is measured at 1 dB for these "normal" conditions, wouldn't it be prudent to calibrate our systems (whenever possible) to 0.1 dB? After all, for critical conditions in other areas of engineering a 10:1 design factor ("safety factor") is not uncommon.
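As a rough illustration of what those two tolerances mean in linear terms, the short sketch below converts a level difference in dB to amplitude and power ratios using the standard 20·log10 and 10·log10 definitions. The function names are hypothetical, and the 10:1 factor is simply the one suggested above, not an established calibration standard.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude (voltage/pressure) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear power ratio."""
    return 10 ** (db / 10)


# 1 dB is the approximate loudness JND quoted above; 0.1 dB is the proposed 10:1 target.
for step_db in (1.0, 0.1):
    amp_pct = (db_to_amplitude_ratio(step_db) - 1) * 100
    pwr_pct = (db_to_power_ratio(step_db) - 1) * 100
    print(f"{step_db} dB -> amplitude +{amp_pct:.2f}%, power +{pwr_pct:.2f}%")
```

So a 1 dB step is roughly a 12% change in amplitude, while 0.1 dB is a little over 1%, which gives a sense of how tight a 0.1 dB channel or level match would need to be.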

See the text from approximately "Effects of Duration on Loudness" onward. Time-based effects (envelope) may have an effect on perception of Loudness and, by implication, many other perception metrics: Pitch, overall Frequency Response/Timbre, and possibly others. This shouldn't actually be too surprising, given that the initial stimuli to these responses have two primary variables: Energy and Time.

At risk of redundancy, I'll C.C. this post to the ASA thread as well, as it relates to perception in general.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,267
Likes
4,759
Location
My kitchen or my listening room.
But the study asks the question:
When the compression quality is reduced to the point that differences are readily noticeable, what emotions are felt?
We are only two weeks from 2017, in this day & age there is no need for this low compression quality.

This is really a very good point: there is no reason to be listening to 56 kb/s MP3s except in very, very limited circumstances.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,267
Likes
4,759
Location
My kitchen or my listening room.
Just like AES members, you mean. They stuck their heads in the sand for years and insisted that digital was perfect. Until they found that it wasn't. Till today there are some flat earthers who cling to dogma and insist that bits is bits.

I will point out that many so-called objectivists seem to be stuck in the 1980's and have ignored most of the findings of psychoacoustics. I read recently that one reason why people think vinyl sounds better might be because it has more crosstalk than digital. Sure enough, if you install a VST plugin and increase the crosstalk you will find the music to take on a more relaxing sound. How do you think people found that out? By dissecting cochleas and measuring speakers?

Well, you'd be wrong on some very basic things, then. No, nobody's learned about why interaural crosstalk is useful by dissecting cochleas, which are a monophonic organ to begin with. On the other hand, measuring speaker/room/head/ear canal resonances, using purely digital equipment, it's trivial to show why the right kind of crosstalk, especially in headphones, is important to a natural listening experience. This is backed up by mathematical examination of the interaction of wavefronts with one's head.
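For readers who want to try the channel-bleed experiment the quoted post describes, a minimal sketch of electrical crosstalk follows: each output channel is its own input plus an attenuated copy of the other. This is a frequency-independent simplification of my own, not a model of any particular VST plugin, cartridge, or of the interaural crosstalk at the head; the function name and the -25 dB figure are purely illustrative.

```python
def add_crosstalk(left, right, crosstalk_db):
    """Mix an attenuated copy of each channel into the other.

    crosstalk_db is the bleed level relative to the source channel
    (negative values mean the bleed is quieter than the source).
    """
    gain = 10 ** (crosstalk_db / 20)  # dB to linear amplitude
    out_left = [l + gain * r for l, r in zip(left, right)]
    out_right = [r + gain * l for l, r in zip(left, right)]
    return out_left, out_right


# A signal present only in the left channel, with an illustrative -25 dB bleed.
left_in = [1.0, 0.5, -0.5]
right_in = [0.0, 0.0, 0.0]
left_out, right_out = add_crosstalk(left_in, right_in, crosstalk_db=-25.0)
print(right_out)  # the right channel now carries a quiet copy of the left
```

Real crosstalk is usually frequency dependent, so this flat bleed is only a starting point for listening tests.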

As to your comments about digital, a proper digital recording in a standard listening room does exceed the well-established sensitivities of the human auditory system, a statement that can be very clearly understood by relating to the physics of the situation. I'll leave that for another article if you would like to engage in a reasonable fashion.

Oh, and distortion as a function of level is also very important to why some people prefer vinyl. You can find a brief discussion of this on an ancient blog at audioskeptic.blogspot.com if you like, or by going to www.aes.org/sections/pnw, looking in the "powerpoint" section or the "meeting recaps" section for discussions of loudness vs. intensity vs. signal bandwidth.

Then there is the work done by Earl Geddes. This has been extensively debated in audio forums. But in case you missed it, the traditional way of quantifying THD is meaningless, because even large amounts of lower order harmonic distortion are less damaging than even small amplitude high order harmonic distortion. You don't learn this from dissecting cochleas either. You get this from asking people what their preferences are.
Again, that knowledge comes precisely from understanding the known characteristics of the human cochlea. You don't have to ask people, it is blatantly, trivially obvious from first principles, using well-established psychoacoustic science from the time of Harvey Fletcher, that high-order distortion is going to be much more audible, and most likely much more disturbing, than low-order effects, until you get to IM issues on sparse signal spectra.

This is well known and obvious from the mathematics of cochlear function.
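To make the point about harmonic order concrete, here is a small sketch comparing plain THD with an order-weighted figure. The n²/4 weighting is used only as an illustrative assumption to show how a metric can penalize high-order products; it is not the Geddes metric and not a cochlear model, and the harmonic amplitudes are invented for the example.

```python
import math


def thd(harmonics: dict[int, float]) -> float:
    """Plain THD: root-sum-square of harmonic amplitudes relative to a fundamental of 1.0."""
    return math.sqrt(sum(a * a for a in harmonics.values()))


def weighted_thd(harmonics: dict[int, float]) -> float:
    """Order-weighted THD: each harmonic n is scaled by n^2 / 4 (illustrative assumption)."""
    return math.sqrt(sum((n * n / 4 * a) ** 2 for n, a in harmonics.items()))


# Hypothetical devices: keys are harmonic orders, values are relative amplitudes.
low_order = {2: 0.01}    # 1% second harmonic only
high_order = {9: 0.001}  # 0.1% ninth harmonic only

for name, h in (("low-order", low_order), ("high-order", high_order)):
    print(f"{name}: THD = {thd(h):.4f}, weighted = {weighted_thd(h):.4f}")
```

By plain THD the low-order device looks ten times worse, yet the weighted figure ranks the high-order device as the more objectionable one, which is the direction of the argument above.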
If this is "Audio SCIENCE Review" it seems as if a disturbingly large number of members here aren't scientific at all. Instead, anything that conflicts with their preconceived notions is dismissed out of hand. I have met quite a few engineering types who cling on to their precious and perfect world of measurements and computer simulations and forget that we live in the real world - where people from other scientific disciplines may be asking questions you never considered, and use different methods to what you are used to.

Engineering is all about the real world, real acoustics, the real performance of human beings in the system, and lots of other things. While I have certainly met a number of people who reject the testable, falsifiable principles behind psychoacoustics and cochlear modelling in favor of "signal to noise ratio", the leaders in the audio industry, who are by and large always using proper digital methods and equipment, are well aware of the signal processing that happens on the human cochlea, and to a surprising extent how the brain uses the information coming down the auditory nerve.

And none of this is new. I refer you to a must-have book, "The ASA Edition of Speech and Hearing in Communication" by Harvey Fletcher, edited by Jont B. Allen. It covers work primarily from the 1930s that explains, for instance, how your comment about distortion order comes about. It also, with some modern mathematics, explains why that center speaker in the book is so important to, for instance, depth perception. I will warn you, it is a tough, tough read with huge quantities of information on most every page.

Oh, and your comment about 'dissecting cochleas'. That's ridiculous, the cochlea is an ***ACTIVE*** organ. You won't figure out much of that from something that's dead. Among other things, the Organ of Corti is likely to collapse the second you start your attempt. Then you have nothing much to learn.
 

j_j

Major Contributor
Audio Luminary
Technical Expert
Joined
Oct 10, 2017
Messages
2,267
Likes
4,759
Location
My kitchen or my listening room.
The concept of the research is valid.

That's as far as I'll go. I would want to see a whole lot more data.
 