Members here are no strangers to the battle between trusting measurements and trusting listening tests as performed by reviewers at large. It is a difficult topic to address in text, so a while ago I decided to create a presentation and video for it. It was a harder job than I thought, but I finally managed to put together a cohesive presentation based on research. I go through the formal research on how listening tests are performed and how their results are correlated with measurements.
It is a long, 1+ hour presentation, but hopefully you will find it worthwhile to set aside that much time to watch it (or speed it up).
Research papers:
https://www.aes.org/e-lib/browse.cfm?elib=9822
A Survey Study of In-Situ Stereo and Multi-Channel Monitoring Conditions
https://www.aes.org/e-lib/browse.cfm?elib=12206
Differences in Performance and Preference of Trained versus Untrained Listeners in Loudspeaker Tests: A Case Study