Blind Listening Test 2: Neumann KH 80 vs JBL 305p MkII vs Edifier R1280T vs RCF Arya Pro5

MatthewS · Mar 28, 2023

Shortly after completing the first blind listening test, @Inverse_Laplace and I started thinking about all the ways we’d like to improve the rigor and explore other questions. Written summary follows, but here is a video if you prefer that medium:

Speakers (calculated preference score in parentheses):

Test Tracks:

Fast Car – Tracy Chapman
Bird on a Wire – Jennifer Warnes
I Can See Clearly Now – Holly Cole
Hunter – Björk
Die Parade der Zinnsoldaten – Leon Jessel (Dallas Wind Symphony)

Unless noted below, we used the same equipment, controls, and procedures as last time; review that post for details.

Motorized turntable: 1.75s switch time between any two speakers
ITU-R BS.1770 loudness instead of C-weighting
Significantly larger listening room
5 powered bookshelf/monitors (preference ratings from 2.1 to 6.2)
Room measurements of each speaker at multiple listening positions

By far the most significant improvement was the motorized turntable. We were able to rotate to any speaker in 1.75 seconds and keep the tweeter in the same location for each speaker. The control board also randomized the speakers for each track automatically and was controllable remotely from an iPad.
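The per-track randomization can be illustrated with a short sketch. This is not the control board's actual firmware; the blind labels, the function name `presentation_orders`, and the seed are assumptions made for illustration only.

```python
import random

# Hypothetical blind labels for the five speakers on the turntable
SPEAKERS = ["A", "B", "C", "D", "E"]
TRACKS = [
    "Fast Car",
    "Bird on a Wire",
    "I Can See Clearly Now",
    "Hunter",
    "Die Parade der Zinnsoldaten",
]

def presentation_orders(speakers, tracks, seed=None):
    """Return {track: shuffled copy of speakers}, one independent shuffle per track."""
    rng = random.Random(seed)
    orders = {}
    for track in tracks:
        order = list(speakers)   # copy so the original list is untouched
        rng.shuffle(order)
        orders[track] = order
    return orders

orders = presentation_orders(SPEAKERS, TRACKS, seed=2023)
for track, order in orders.items():
    print(f"{track}: {' -> '.join(order)}")
```

Shuffling independently for each track means a listener cannot carry a guess about "speaker 3" from one song to the next.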

We only had time to conduct the listening test with a small number of people and ended up having to toss out data on three individuals. The test was underpowered. We did not achieve statistical significance (p-value < .05). That said, here are the results we collected:

Spinorama of speakers:

In-room response plotted against estimated:

Our biggest takeaways were:

Recruit a larger cohort
Schedule on a weekend
Well controlled experiments are hard

Some personal thoughts:

Once you get into well-behaving studio monitors, it becomes extremely difficult to tease apart the differences. It takes a lot of listening and tracks that excite small issues in each speaker. A preference score of 4 vs 6 appears to be a significant difference but depending on the nature of the flaws it can be extremely challenging to hear the difference. It is easy to hear that the speakers sound different but picking out the better speaker gets very difficult.

Running a well-controlled experiment is extremely difficult. We had to measure groups on different days and getting the level matching and all the bugs worked out was a challenge. We learned a lot and will apply it to our next set of tests.

Comments from the individual that ran the statistical analysis:
A repeated measures analysis of variance (ANOVA) found no significant difference in sound ratings for the 5 different speaker types, F(4, 16) = 1.68, p = .205, partial eta-squared = .295.
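As a sanity check on the reported numbers, partial eta-squared can be recovered from the F statistic and its degrees of freedom. The sketch below assumes a one-way repeated measures design; the function name is made up for illustration. The result lands close to the reported .295, with the small gap presumably due to F being rounded to two decimals.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Effect size recovered from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# Values reported above: F(4, 16) = 1.68
eta_p2 = partial_eta_squared(1.68, 4, 16)
print(f"partial eta-squared = {eta_p2:.3f}")

# For a one-way repeated measures design:
# df_effect = k - 1 and df_error = (k - 1) * (n - 1)
k = 4 + 1                 # speaker conditions implied by df_effect = 4
n = 16 // (k - 1) + 1     # listeners implied by df_error = 16
print(f"{k} speakers, {n} listeners")
```

The degrees of freedom imply five listeners in the final analysis, which agrees with the t(4) reported for the paired comparisons.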

Paired samples t-tests were then run to compare the average sound ratings between each possible pair of speakers. For the most part, speakers showed no significant differences in sound ratings, ps > .12. However, there was a significant difference between sound ratings for the JBL versus EdifierEQ speakers, t(4) = 3.88, p = .018, such that participants reported significantly better sound ratings for the JBL speaker (M = 6.18, SE = 0.31) over the EdifierEQ speaker (M = 5.64, SE = 0.40).
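For anyone who wants to redo the pairwise comparisons from the attached raw data, here is a minimal standard-library sketch of a paired samples t-test. The p-value is obtained by numerically integrating the Student t density, which is a shortcut chosen for this example rather than the analyst's method, and the ratings in the second half are invented, not taken from the thread.

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired samples t statistic and degrees of freedom for equal-length lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def t_pdf(x, df):
    """Student t probability density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return 1 - 2 * area

# Check against the reported statistic t(4) = 3.88
p_reported = two_sided_p(3.88, 4)
print(f"p = {p_reported:.3f}")

# Hypothetical ratings for five listeners (NOT the thread's data)
jbl = [6.5, 5.8, 6.9, 5.4, 6.3]
edifier_eq = [6.0, 5.1, 6.5, 5.0, 5.6]
t, df = paired_t(jbl, edifier_eq)
print(f"t({df}) = {t:.2f}, p = {two_sided_p(t, df):.3f}")
```

Feeding in t = 3.88 with 4 degrees of freedom reproduces the reported p of about .018.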

An interesting observation: for one group of listeners, we had to level match the speakers again, and in our haste we used pink noise instead of the actual material. Pink noise carries equal energy in every octave, which isn’t necessarily representative of the musical selections. With the music tracks, the Neumann KH 80 was a full 3 dB lower (ITU-R BS.1770) than most of the other speakers (we measured after the test, and we could clearly hear differences in the volume of each speaker). We threw out this data for our analysis, but the speaker with the lowest level was universally given awful ratings by each listener.
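The arithmetic behind level matching is simple once each speaker has a loudness reading taken with the program material. The sketch below is an illustration only: the speaker names, the readings, and the helper `trim_gains_db` are hypothetical, and it assumes the readings are already in a BS.1770-style loudness unit.

```python
def trim_gains_db(measured, target=None):
    """Gain (dB) to apply to each speaker so all reach the same loudness.

    If no target is given, match everything to the quietest speaker so that
    every trim is a cut and no speaker is driven harder than it was measured.
    """
    if target is None:
        target = min(measured.values())
    return {name: round(target - level, 2) for name, level in measured.items()}

# Hypothetical loudness readings taken with the music tracks (not measured data)
music_loudness = {"Speaker A": -23.0, "Speaker B": -23.2, "Speaker C": -26.0}

trims = trim_gains_db(music_loudness)
print(trims)
```

A speaker that reads 3 dB low with music, as in the hypothetical "Speaker C", would go uncorrected if the trims were derived from a different signal that happened to read equal.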

We are looking to conduct another test with a larger group, possibly this spring.

EDIT:

REW In-Room Measurements
Attached raw data of listener preference for anyone that wants to look at it.

thewas · Mar 28, 2023

Now this is a real comparison which promotes the word science and gives some great material for thoughts and discussion.
Thank you very much for it!

ernestcarl · Mar 28, 2023

Several times I did simple A/B comparisons between my KH120 and JBL LSR305. It was really a tough call as to which speaker was preferable for various select music tracks. The midrange scoop of the JBLs and more sparkly highs (which at other times I found annoying) made it sound quite appealing and spacious. With more critical listening tests along with tone generator tests I could tell that the LSR definitely had more “issues” (resonances and distortion) and was less accurate. The KH120, though, does have a slight hump at its lowest frequencies, which sounds slightly “hyped” to me too. After comparing several cheaper alternatives through the years, I’ve always felt that I could swap my KH120s for several less expensive speakers quite happily and adapt to their less neutral sound.

badspeakerdesigner · Mar 28, 2023

Great work!

amirm · Mar 28, 2023

Absolutely a great effort. No hobbyist had tried to go after this testing as formally as you have. Can't compliment you enough.

Promoting to home page.

JktHifi · Mar 28, 2023

Just checked the JBL priced at $149 on Amazon. Lots of price differences right now between manufacturers.
Whoever adjusts prices first will be the winner in this economic recession. EU inflation vs US disinflation

daftcombo · Mar 28, 2023

Great work!
I'd like to see some Genelec speakers in the next test. ;-)

Koeitje · Mar 28, 2023

What kind of SPL levels was this done at? I've had an Edifier speaker (1850 something) that completely fell apart in the bass at decent listening volumes compared to the JBL 305 (1st gen) and Adam T5V. I didn't do a blind test, but it was super obvious that it lacked the output to fill a small room.

Sokel · Mar 28, 2023

Great effort!
A good eye-opener is the measured in-room response, which is nearly identical below 500 Hz despite the differing anechoic responses.
The room dictates its own response, as it should, and that's something we have to look at again and again.

Thank you!

Gio · Mar 28, 2023

A big big thank you! Bravo !

voodooless · Mar 28, 2023

MatthewS said:
We only had time to conduct the listening test with a small number of people and ended up having to toss out ~~data on~~ three individuals.

I read it like this and thought: that's probably an interesting story

Great work otherwise

That is a really nice setup you created! Clearly, some good old TLC went into creating this.

It would be interesting to test the following:

- High-pass all speakers at say 80~100 Hz, and see how much the low end determines the result
- EQ them all to the same response (as best as possible), and see how that influenced the preference scores
- All exhibit a big bump at 300~400 Hz (SBIR probably), it may be good to EQ that one out for all

Blumlein 88 · Mar 28, 2023

You may know this, but Toole suggested using pink noise from 500 to 2000 Hz with a 12 dB/octave roll-off above and below that for level matching speakers. My tests, much less rigorous than yours, indicate that this works pretty well, though I'm not convinced it is the perfect solution. I've personally experimented with a decade, 250-2500 Hz, using pink noise with roll-offs of 12 dB/octave at each end of that. Maybe worth looking into.
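To see what a 500 to 2000 Hz band with 12 dB/octave slopes looks like, here is a small sketch of the filter's magnitude response. It assumes second-order Butterworth high-pass and low-pass sections, which is one common way to get 12 dB/octave; the post does not say which filter alignment is meant, and the function name is made up. The curve describes only the band-limiting filter, to be applied on top of the pink noise itself.

```python
import math

def band_gain_db(f_hz, f_low=500.0, f_high=2000.0, order=2):
    """Gain in dB of a Butterworth high-pass at f_low cascaded with a low-pass at f_high.

    order=2 gives 12 dB/octave slopes well outside the passband.
    """
    hp = 1.0 / (1.0 + (f_low / f_hz) ** (2 * order))    # squared magnitude, high-pass
    lp = 1.0 / (1.0 + (f_hz / f_high) ** (2 * order))   # squared magnitude, low-pass
    return 10.0 * math.log10(hp * lp)

for f in (62.5, 125, 250, 500, 1000, 2000, 4000, 8000):
    print(f"{f:>7} Hz: {band_gain_db(f):6.1f} dB")
```

Passing `f_low=250.0, f_high=2500.0` gives the decade-wide variant mentioned above.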

Sticking with 500 Hz keeps you above the Schroeder frequency in most rooms, while 250 Hz may get you close to it depending upon the room size.
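The Schroeder frequency is commonly approximated as 2000 · sqrt(RT60 / V), with RT60 in seconds and room volume V in cubic metres. A quick sketch, using invented room figures rather than anything measured in this test, shows why 250 Hz can get close to it in a small room:

```python
import math

def schroeder_frequency(rt60_s, volume_m3):
    """Approximate Schroeder frequency in Hz: 2000 * sqrt(RT60 / V)."""
    return 2000.0 * math.sqrt(rt60_s / volume_m3)

# Hypothetical rooms: (name, RT60 in seconds, volume in cubic metres)
rooms = [("small room", 0.4, 30.0), ("larger room", 0.4, 60.0)]
for name, rt60, vol in rooms:
    print(f"{name}: {schroeder_frequency(rt60, vol):.0f} Hz")
```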

roog · Mar 28, 2023

I love the ingenuity. I wondered if it might be better to blindfold the listener and use, say, keypad scoring rather than screen the speakers?

hoverdonkey · Mar 28, 2023

Brilliant.

Maybe the hi-fi review magazines will adopt this method.

sweetchaos · Mar 28, 2023

Great effort.

I’d like to see for next time:
1. Add Genelec speaker if you can
2. High pass each speaker (say at 80 Hz) to eliminate the variable bass output. Perhaps run this as trial #2.
3. More test subjects to make data more statistically significant.
4. Test subjects should be sitting at the same ear height, to eliminate this variable

Nicely done!

computer-audiophile · Mar 28, 2023

This is a very interesting listening test! Thank you very much!

I feel confirmed in my observation that the JBL 305p MkII monitor sounds very good, even if it is perhaps not quite as neutral as the Neumann KH120. For the home user, it is imo probably the better buy, it also consumes less energy having Class-D amps and standby function. At the time when I bought my KH120, the JBL was not yet available.

I described my earlier listening experience with this in an older post:

Rank studio monitors/active speakers you have heard

Here also 310/750 with dsp (calibrated with Neumann MA 1 software and mic). Much better indeed. Sounds much clearer. I still like Geithain more, allthough no dsp corrections in my room. more natural and lively.. less ‘boring’..

www.audiosciencereview.com

computer-audiophile · Mar 28, 2023

sweetchaos said:
2. High pass each speaker (say at 80 Hz) to eliminate the variable bass output. Perhaps run this as trial #2.

Why should you do that? It's about the complete loudspeaker. A pleasant bass reproduction is part of the sound impression if you want to use the speakers alone, without a sub. (This is how I do it)

voodooless · Mar 28, 2023

computer-audiophile said:
Why should you do that? It's about the complete loudspeaker. A good bass reproduction is part of it if you want to use the speakers alone, without a sub. (This is how I do it)

Because you want to know why one speaker is preferred over the other. One trivial property is bass response. So in eliminating the difference there, you'll learn more about the possible other properties that influence the preference.

uwotm8 · Mar 28, 2023

Wow, that's a great test! However, there's one thing I'd definitely improve.

MatthewS said:
Fast Car – Tracy Chapman

Bird on a Wire – Jennifer Warnes

I Can See Clearly Now – Holly Cole

Hunter – Björk

Die Parade der Zinnsoldaten – Leon Jessel (Dallas Wind Symphony)

That requires some full-spectrum, fully loaded music, and metal is the way

Even such copyright-free generic is okay.
Jokes aside, in all my comparisons such tracks are the best way to show speaker flaws, especially MF-HF coloration if there's any.
Yes, a trained listener will detect it on almost any genre, but the key point IMO is that "good" music tends to sound pleasing even on trashy desktop speakers, and if played on, say, a hi-fi setup with a midrange droop or elevated HF ("showroom sound"), it will still sound pleasing (not right, but you'll be OK with it). At the same time, try Slayer or Dream Theater or whatever. Those who tried know what I'm talking about

computer-audiophile · Mar 28, 2023

voodooless said:
Because you want to know why one speaker is preferred over the other. One trivial property is bass response. So in eliminating the difference there, you'll learn more about the possible other properties that influence the preference.

I know what you mean. And it is indeed the case that the JBL produces better bass, in my impression. I see it as an advantage, as I tried to describe in my older post. For example, piano sounds much more realistic, with more 'body'. After this test, I like my little JBLs even more.
