I have not seen any of the speakers in the study identified by brand or model. People keep reading the 0.86 correlation coefficient as if it meant that 86% of the predictions were accurate and 14% were inaccurate, but it really doesn't mean that. It's a statistical statement of how closely the set of measured scores tracks the predicted ones.
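To make that concrete, here is a rough sketch in Python with numbers I made up (they are NOT from the study, and the helper names `pearson_r` and `fraction_within` are just mine). It shows that the correlation coefficient and the "share of predictions that were close" are two different things: you can have a perfect correlation with every prediction off, or a poor correlation with most predictions dead on. For what it's worth, squaring 0.86 gives about 0.74, which is the share of the variance in listener scores accounted for by a linear fit to the predicted scores, and that isn't a count of "accurate" speakers either.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fraction_within(xs, ys, tol):
    """Share of pairs whose two scores differ by at most tol points."""
    hits = sum(1 for x, y in zip(xs, ys) if abs(x - y) <= tol)
    return hits / len(xs)

# Hypothetical predicted scores, invented for illustration only.
predicted = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

# Case A: listeners rate every speaker exactly 2 points above the prediction.
shifted = [p + 2.0 for p in predicted]

# Case B: five predictions are exactly right, one is badly off.
one_miss = [2.0, 3.0, 4.0, 5.0, 6.0, 2.0]

r_shifted = pearson_r(predicted, shifted)
r_one_miss = pearson_r(predicted, one_miss)

print("Case A: r =", round(r_shifted, 3),
      "| within 0.5 points:", fraction_within(predicted, shifted, 0.5))
print("Case B: r =", round(r_one_miss, 3),
      "| within 0.5 points:", round(fraction_within(predicted, one_miss, 0.5), 3))
```

In Case A the correlation is essentially 1 even though not a single prediction lands within half a point of the listener score. In Case B five of six predictions are exact, yet the correlation drops to roughly 0.33 because of the one big miss. So a figure like 0.86 tells you how well the scores line up overall, not how many speakers were "gotten right."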
I scanned the graph in the study and put the data here. It's not really that exciting, though: most of the big outliers are predicted-bad speakers that were given more forgiving (but still average or bad) scores by listeners. The only really interesting one is the highest outlier above a score of 5, which was rated 8.2 by listeners but given 6.6 by the preference formula, a gap of 1.6 points.