Not really. This is a modern take on a purist version of science, mostly down to a mistaken reading of Karl Popper's work. Whilst I'm a big fan of Popper, he was, like many philosophers of science, mostly thinking about physics, and even there the picture is more nuanced. Science is much wider than falsifiable theory. In the end your theory needs to be falsifiable, and tested, but there is a long road before you get there. Data collection is required, and is part of the scientific process. Experiment and publication of experimental results is absolutely science. And you must collect data without preconceptions about what the resultant theory might be, otherwise you taint the data.
Real science takes an unknown amount of time. Amir is in data collection, and at the same time using established science to create evaluations, all the while noting discrepancies with existing science. The fact that there isn't a new thesis right this very moment does not make this any less science. Indeed, the fact that there isn't any such thesis is what makes it proper science, and not a pseudo-science exercise.

This is a really important point. The last few decades of modern science as performed in universities have been tainted by this. There is a corrosive drive to publish novel, exciting results, and above all to justify your next round of research funding by appearing productive. This has led to the reproducibility crisis. It is clear that science would be well served by less desire for yet another novel thesis, and much more dispassionate data collection and curation. That, and experiments that don't just test a new theory, but test existing theories. The lack of testing of existing theories is the elephant in the room for a huge section of modern science.

Peer review is not supposed to be just getting a few mates to sign off on your latest paper for publication. It is supposed to be testing of these results to ensure that they are reproducible. This is sadly very rare. Almost no journal will publish such "null" results, so there is no incentive to ever perform such a test. That has led to a morass of published work that turns out to be unreproducible if there is ever a need to rely on it. Testing someone else's work to verify reproducibility is exactly science. Indeed, it is now clear it is more important than just coming up with an initial new theory. Ideas are cheap. Truth is priceless.
Right now Amir is absolutely doing science. Ironically, the theory in the gun-sights at the moment is the Olive score. Nobody has ever attempted to verify the Olive score, and IMHO that significantly diminishes its scientific value. The fact that the input data used to generate the score is not freely available to researchers makes it even less defensible by modern standards, to the point where many journals would refuse to publish the paper now. Not to diminish the work; it was done at a different time, to different standards. But by a modern standard of science, ASR is more defensibly science than the Olive score. Fully disclosed methodology and measured data from experiment make for robust science.
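For anyone who hasn't seen it, the score in question is a four-term linear regression over metrics computed from CTA-2034 ("spinorama") measurements. Here's a sketch in Python, hedged because I'm quoting the coefficients from memory of Olive's AES papers rather than having them in front of me; check them against the original before relying on this:

```python
# Hedged sketch of the Olive preference model (coefficients from memory
# of Olive's AES 2004 "Part II" paper -- verify against the original).
def predicted_preference(nbd_on: float, nbd_pir: float,
                         lfx: float, sm_pir: float) -> float:
    """Predicted listener preference rating from four metrics computed
    on CTA-2034 'spinorama' curves:
      nbd_on  -- narrow-band deviation of the on-axis response
      nbd_pir -- narrow-band deviation of the predicted in-room response
      lfx     -- log10 of the low-frequency extension in Hz
      sm_pir  -- smoothness of the predicted in-room response
    """
    return 12.69 - 2.49 * nbd_on - 2.99 * nbd_pir - 4.31 * lfx + 2.32 * sm_pir
```

The point being: the model itself is fully public, but the listening test data it was fitted to is not, which is what blocks independent verification.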
https://www.smbc-comics.com/comic/theory
OK, sure, to the extent that doing science involves measuring and collecting data, Amir is "doing science," and I think my comment reflected this. I certainly didn't mean to dismiss the importance of this work, which is downright heroic.
In your comment you are conflating two issues: the reproducibility crisis in science, and a more fundamental question of what properly constitutes 'science.' It's possible that there is an issue with how science is (mis)conceived that is adding to the reproducibility crisis, but that is itself a thesis that would need support.
(Note for anyone reading who hasn't heard of the reproducibility crisis in science: it comes out of attempts to replicate the results of significant published research, many of which failed to do so. It's scandalous because it has led to false beliefs about important aspects of reality, for example in medicine.)
The reproducibility crisis is a complicated problem that involves incentives, social behavior, statistics, and I'm sure much more. But the core of the issue, that many scientific results are not reproducible, still presupposes that a paper has a central thesis, because that thesis is the thing that fails to be 'reproduced.'
The problems with the incentive structure in science (where a premium on novel results leads to selection bias in which papers get published) strike me as vexing: it's an emergent problem in which complicated social processes cause unintended effects, it's multi-faceted, no individual or group is responsible for creating or fixing it, and the incentive structures probably reflect deeply held human cognitive biases. It's unclear to me how broadening what is acceptable as a 'publishable' scientific result would affect this crisis. Null results are accepted as valid scientific findings; it's just that they are less exciting, and this critical aspect of the 'scientific process' is being shortchanged.
You mention the issue of not coming up with a thesis before collecting data, and this sure does seem to be a necessary part of science. The idea is that a thesis would emerge from collected data.
But there are also issues with crafting a thesis after data has been collected, because you can use statistics to show what looks like a meaningful result but is actually coincidental. There is some movement in science to 'pre-register' the thesis you will be testing, to try and mitigate these types of biases. Since a pre-registered study can be judged on its design rather than its outcome, this also helps with the lack of incentive to publish null results.
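A toy illustration of why post-hoc thesis hunting is risky: below, 100 invented 'hypotheses' (random predictors) are correlated against a pure-noise outcome, and roughly 5 of them come out 'significant' at p < 0.05 anyway. Nothing here is real data; it's just simulated numbers showing the multiple-comparisons effect that pre-registration is meant to guard against.

```python
# Simulated demonstration of the multiple-comparisons trap: correlate
# many random, unrelated predictors against a pure-noise outcome and
# "significant" results appear at roughly the false-positive rate alpha.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_samples, n_hypotheses, alpha = 50, 100, 0.05

outcome = rng.normal(size=n_samples)  # the "result" -- pure noise
false_positives = 0
for _ in range(n_hypotheses):
    predictor = rng.normal(size=n_samples)  # an unrelated candidate variable
    _, p = pearsonr(predictor, outcome)
    if p < alpha:
        false_positives += 1  # looks meaningful, is pure chance

print(f"{false_positives}/{n_hypotheses} spurious 'significant' correlations")
# Expect around 5 of 100: scan enough post-hoc hypotheses and some always pass.
```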
But as regards my comment on ASR and science, I was getting at the idea that, call it the final step of science, there is a convention of presenting not merely collected data but some kind of 'thesis', 'result', 'conclusion', or 'main idea' that comes out of the research.
As far as I can tell Amir is not attempting to present such a thesis at this point, but is instead doing careful measurement and collecting a dataset. Such a dataset could support scientific investigation along multiple directions.
I could see some kind of social science being done:
'How messaging in the hi-fi audio industry obscures and confuses consumers' ability to choose high-performing audio equipment: a qualitative study'
Or
'How published results of measured audio performance affect the satisfaction levels of individuals who own the gear tested'
Or
'Quantifying the economic loss resulting from misinformation presented in the audiophile press'
At this point I'm not seeing how the speaker measurements collected so far either confirm or refute the 'Olive Score', because there is no corresponding preference testing being done. If the original data set were available, you could examine it to see if the statistical correlation really holds for the data. Or you could propose new data to collect from the original speaker set, measure the speakers again, and then, if you also had listening test data, perhaps extend and strengthen the Olive Score approach.
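To make that concrete, here is a hypothetical sketch of what such a verification would look like: correlate the model's predicted preference scores against blind listening test ratings for the same speakers. The arrays below are invented placeholder numbers, not real measurements.

```python
# Hypothetical verification sketch: all numbers below are made-up
# placeholders, standing in for Olive-model predictions and blind
# listening-test ratings for the same set of speakers.
import numpy as np
from scipy.stats import pearsonr

predicted = np.array([6.2, 4.8, 7.1, 5.5, 3.9])  # model scores (made up)
observed = np.array([6.5, 4.1, 6.8, 5.9, 4.2])   # listener ratings (made up)

r, p = pearsonr(predicted, observed)
print(f"r = {r:.2f}, p = {p:.3f}")
# If the model generalizes, r should stay high on data it was never fit to.
```

The hard part, of course, is not this arithmetic but producing the `observed` column: controlled, blind preference testing at scale, which is exactly what the original Harman work had and independent researchers don't.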