
Sennheiser HD650 Review (Headphone)

Feelas

Senior Member
Joined
Nov 20, 2020
Messages
390
Likes
316
Not to start a fight about statistics, but the "magic dust" is called the law of large numbers: many (lab) measurements tend toward the expected value (nothing new), though it is very time consuming. If the variance is high, then by definition you need a large set of values to get a good estimate of the expected value. I know that acoustic measurements are anything but stable, and one doesn't want to waste time on repetitions (the variables depend even on temperature and humidity).
Averaging may not have predictive power, but it is a useful tool for better precision (at least at the math/lab level); on the other hand, it cannot predict our "average" hearing ability on headphones (too many variables).
BTW, we know about global warming from the average global temperature over many decades... and the predictions seem good (in a bad way).
BTW2: the volt reference is an average over multiple Josephson devices, not a "single" measurement.
Yet, if we measure both the average and the variance, we actually know what to expect.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,667
Likes
10,299
Location
North-East
Feelas said: Not to start a fight about statistics, but the "magic dust" is called the law of large numbers [...]

In signal processing, averaging improves SNR if the noise is uncorrelated and the signal remains unchanged between runs. This is what happens when you average multiple measurements of headphones on an artificial ear with the same test signal. For random noise sources, SNR increases as SQRT(N), where N is the number of averaged measurements.

Averaging doesn't act as a low-pass filter across multiple measurements of a stationary, unchanging signal. The example of averaging varying temperatures doesn't apply here, since in that case the signal itself changes between measurements.
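A minimal numerical sketch of this SQRT(N) behaviour (synthetic signal and noise in Python/NumPy; the specific waveform and noise level are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1000, endpoint=False)
signal = np.sin(2 * np.pi * 5 * t)  # fixed, repeatable test signal

for n_runs in (1, 4, 16, 64):
    # fresh, uncorrelated noise on every run; the signal never changes
    runs = signal + rng.normal(0.0, 0.5, (n_runs, t.size))
    avg = runs.mean(axis=0)
    residual = np.std(avg - signal)
    print(f"N={n_runs:3d}  residual noise RMS = {residual:.4f}")
# Each 4x increase in N roughly halves the residual noise: SNR grows as sqrt(N).
```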
 
OP
amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,614
Location
Seattle Area
Feelas said:
Not to start a fight about statistics, but the "magic dust" is called the law of large numbers: many (lab) measurements tend toward the expected value (nothing new), though it is very time consuming.
I am quite familiar with the law of large numbers/CLT, having used it to describe the acoustic response of rooms above the transition frequency (the number of modes rapidly increases, creating a stochastic response). I hope you know that it only applies to random events whose probabilities we know; that probability is the value the average converges to.

Most importantly, the law is called the law of large numbers because it must not be applied to a small number of observations. Flipping a dice 3 times and getting heads doesn't mean the next three will be tails to balance them and result in 0.5 probability. Gambler's fallacy will bite you in the behind hard here!
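A quick sketch of why small samples mislead, simulating a fair coin (which is presumably what the flipping example intends):

```python
import numpy as np

rng = np.random.default_rng(1)
flips = rng.integers(0, 2, 100_000)  # fair coin: 0 = tails, 1 = heads

for n in (3, 30, 300, 3000, 100_000):
    print(f"first {n:6d} flips: fraction of heads = {flips[:n].mean():.3f}")
# Three flips can easily read 0.000 or 1.000; only large N converges to 0.5,
# and past flips never change the odds of the next one (the gambler's fallacy).
```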

Mind you, when the underlying issue is noise, we do filter. My measurements are filtered to 1/12 octave because the noise there is electrical and environmental, and we are not interested in that. We pick 1/12 octave so that we don't filter out what we still want to see. With averaging you have no control over the bandwidth of the filter, nor its strength.
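For reference, a minimal sketch of fractional-octave smoothing (a generic implementation, not amirm's actual tooling; the helper name and toy data are assumptions): each point is averaged over a window spanning 1/12 octave around its frequency, so the filter bandwidth scales with frequency.

```python
import numpy as np

def fractional_octave_smooth(freqs, mag_db, fraction=12):
    """Average mag_db over a 1/fraction-octave window centred on each frequency."""
    half = 2 ** (1 / (2 * fraction))  # half-window ratio, e.g. 2**(1/24) for 1/12 oct
    out = np.empty_like(mag_db)
    for i, f in enumerate(freqs):
        sel = (freqs >= f / half) & (freqs <= f * half)
        out[i] = mag_db[sel].mean()
    return out

# toy example: a noisy response on a log-spaced frequency grid
rng = np.random.default_rng(2)
freqs = np.logspace(np.log10(20), np.log10(20_000), 500)
mag_db = 3 * np.sin(np.log10(freqs) * 4) + rng.normal(0.0, 1.0, freqs.size)
smoothed = fractional_octave_smooth(freqs, mag_db, fraction=12)
```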

If you want to do something here, pick the geometric mean. At least there, you get one of the graphs as the center one and are not subject to the smoothing that averaging would do. The problem with this scheme is that (a) it is more work and (b) when I used it, it didn't generate better results. See this measurement I did with the B&K 5128:

[image: HD650 measurement on the B&K 5128]


Here is the GeoMean:

[image: geometric mean of the same measurements]


Notice that it has very high resolution, unlike any averaging (even the little wiggles in the 200 Hz region are preserved). But that still didn't improve understanding of this headphone.
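For reference, one way to compute a per-frequency geometric mean of magnitude traces (a sketch; the exact computation behind the GeoMean chart isn't specified, and the helper below is a generic assumption):

```python
import numpy as np

def geo_mean_response(traces_linear):
    """Per-frequency geometric mean of magnitude traces (rows = traces, linear units)."""
    return np.exp(np.log(traces_linear).mean(axis=0))

rng = np.random.default_rng(3)
left = 1.0 + 0.2 * rng.random(100)   # two toy linear-magnitude traces
right = 1.0 + 0.2 * rng.random(100)  # sharing the same frequency bins
gm = geo_mean_response(np.vstack([left, right]))
# Note: in dB terms this equals averaging the dB curves point by point:
gm_db = 20 * np.log10(gm)
```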

My approach is to find a much more reliable representation of the system by measuring multiple times and comparing left and right channels at two frequencies. There is no pretending that some magic dust in the form of simple averaging will give us good results out of bad measurements.
 

Rock Rabbit

Active Member
Joined
Feb 24, 2019
Messages
230
Likes
174
amirm said: I am quite familiar with the law of large numbers/CLT [...]
Nice explanation...good procedure and fit, thanks Amir!
 
OP
amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,614
Location
Seattle Area
pkane said:
In signal processing, averaging improves SNR if the noise is uncorrelated and the signal remains unchanged between runs. This is what happens when you average multiple measurements of headphones on an artificial ear with the same test signal. For random noise sources, SNR increases as SQRT(N), where N is the number of averaged measurements.
No, no, no. Take a resonance peak. Shift the headphone just a bit and that peak can get replaced with a sharp dip. Now average the two and the resonance will be gone from the data. You are telling me you got more truth that way, by throwing out the resonance altogether?
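A minimal numeric sketch of this cancellation (made-up Gaussian peak and dip in Python/NumPy; the shapes and magnitudes are illustrative assumptions):

```python
import numpy as np

freqs = np.linspace(1000, 5000, 400)

def feature_db(centre, width, height):
    """A resonance (or a dip, for negative height) in dB."""
    return height * np.exp(-((freqs - centre) / width) ** 2)

seat_a = feature_db(3000, 150, +10.0)  # +10 dB resonance in one seating
seat_b = feature_db(3000, 150, -10.0)  # -10 dB dip after a small shift
avg = (seat_a + seat_b) / 2            # averages to a flat line
print(f"max |average| = {np.abs(avg).max():.2f} dB")  # 0.00: resonance erased
```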

This is not a case of the measurement having noise while the underlying signal does not. Both are subject to variation, and that variation is the total response of the system.

Note that our brain is very good at reasoning through variations. This is what I do. All my measurements are stereo, showing you two samples. It doesn't take much brain power to look at the two and make a judgement call on what the results mean. I can do this without losing resolution that I need in certain cases. Give me the data and let me analyze it. Don't throw it out and give me some cooked results.

Now, if we had 1000 observations, then yes, statistical data reduction is helpful because we can't intuit such information. This is not our situation.
 

infinitesymphony

Major Contributor
Joined
Nov 21, 2018
Messages
1,072
Likes
1,809
amirm said:
Take a resonance peak. Shift the headphone just a bit and that peak can get replaced with a sharp dip. Now average the two and the resonance will be gone from the data.
Now this is starting to make sense. If a pair of headphones presents its own "room" based on physical dimensions, materials, and distance from the drivers, then shifting the headphones is like moving your measurement mic to a different part of the room, where you may pick up a substantially different room resonance or null. And because the "room" is intrinsic to the headphones, there is no Klippel equivalent for measuring them to remove this factor, so they must be measured in their best placement for the most accurate response.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,667
Likes
10,299
Location
North-East
amirm said: No, no, no. Take a resonance peak. Shift the headphone just a bit and that peak can get replaced with a sharp dip. [...]

If you decide to pick just one of the hundreds of possible positions, you're not representing the true response either. Maybe you should try to measure and report at least two positions showing the extremes of the response. Otherwise, there's really no way to compare two headphones by looking at charts where one shows an extreme dip and the other none. That could be due to real differences, or it could be because the position was slightly off in one of the measurements.
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,001
Likes
36,217
Location
The Neitherlands
Maybe using 'tolerance bands' is an idea.
It looks less messy than a lot of traces overlaid.
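A minimal sketch of what such a band could be, assuming a min/max envelope over repeated reseatings (solderdude doesn't specify the statistic; the helper and toy data are assumptions):

```python
import numpy as np

def tolerance_band(traces_db):
    """Lower/upper envelope across repeated measurements (rows = reseatings)."""
    return traces_db.min(axis=0), traces_db.max(axis=0)

rng = np.random.default_rng(4)
freqs = np.logspace(np.log10(20), np.log10(20_000), 300)
traces = rng.normal(0.0, 1.5, (10, freqs.size))  # 10 simulated reseatings, in dB
lo, hi = tolerance_band(traces)
# Plot lo/hi as one shaded band instead of ten overlaid traces;
# np.percentile(traces, [5, 95], axis=0) would give a softer band.
```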
 
OP
amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,614
Location
Seattle Area
pkane said:
If you decide to pick just one of the hundreds of possible positions, you're not representing the true response either. Maybe you should try to measure and report at least two positions showing the extremes of the response.
You have that in the form of the two channels. On the first point, that is why I listen and develop EQ. I do this by eye and ear, as opposed to a mathematical approach of matching to the curve, which would assume precise measurements.
 

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
6,971
Likes
6,831
Location
UK
amirm said: You have that in the form of the two channels. [...]
Surely "in the form of 2 channels" doesn't account for all the possible variation, plus there are sometimes differences between channels that are inherent to each specific channel (I guess related to production tolerances and assembly tolerances).....and 2 measurements is surely not enough to work out the "tolerance bands" that solderdude mentioned, you're not gonna have any idea of the possible spread of data from 2 measurements.

EDIT: Yes, you have a different approach to the EQ. Myself, I'd prefer to have as representative a measurement as possible on which to base more accurate graphical/mathematical EQs; I think that's more useful. I suppose you could show your Geomean and then the tolerance-band extremes, perhaps even with an averaged curve. So you could have Geomean, Tolerance Band, Average Curve.
 
OP
amirm

Founder/Admin
Staff Member
CFO (Chief Fun Officer)
Joined
Feb 13, 2016
Messages
44,595
Likes
239,614
Location
Seattle Area
Surely "in the form of 2 channels" doesn't account for all the possible variation, plus there are sometimes differences between channels that are inherent to each specific channel (I guess related to production tolerances and assembly tolerances).....and 2 measurements is surely not enough to work out the "tolerance bands" that solderdude mentioned, you're not gonna have any idea of the possible spread of data from 2 measurements.
*I* have an idea because I make a lot more measurements than I show. As I keep explaining, I go through a process of calibration/optimization using two frequencies and both channels. I have done the work of throwing out the outliers.

I then put on the headphones and work on the EQ. Perceptually, we appear to be much less sensitive to variations in how we hear headphones than the measurements show. If this were not the case, we would be disgusted and stop listening to headphones.

Importantly, we don't know the precise tonality of any music, so the ultimate goal here is an approximation.

Really, the results are what matter. In every case I am able to EQ a headphone to produce far better fidelity than the way it comes. I had to develop a working process to make this happen. That was not the case when I blindly followed the schemes others used, with 30 Hz tones, a lot of averaging, etc. Maybe those work for others, in which case you should follow their work, not mine. I cannot get blood out of a stone and magically get you exact measurements when such is not physically possible.
 

RayDunzl

Grand Contributor
Central Scrutinizer
Joined
Mar 9, 2016
Messages
13,246
Likes
17,161
Location
Riverview FL
amirm said:
Flipping a dice 3 times and getting heads doesn't mean the next three will be tails to balance them and result in 0.5 probability. Gambler's fallacy will bite you in the behind hard here!

I've never seen anyone make heads (or tails) with dice.

 

Robbo99999

Master Contributor
Forum Donor
Joined
Jan 23, 2020
Messages
6,971
Likes
6,831
Location
UK
amirm said: *I* have an idea because I make a lot more measurements than I show. [...]
I edited my post after you replied:

EDIT: Yes, you have a different approach to the EQ. Myself, I'd prefer to have as representative a measurement as possible on which to base more accurate graphical/mathematical EQs; I think that's more useful. I suppose you could show your Geomean and then the tolerance-band extremes, perhaps even with an averaged curve. So you could have Geomean, Tolerance Band, Average Curve.


----------------------------------------------------------------------------
Replying to the post of yours that I quoted:
Well, it's good that you did more measurements even though you're just showing one. I still think there is value in showing the variation across multiple optimal positionings: essentially reseating the headphone optimally as best you can, rather than moving it backwards and forwards on purpose. You could then show the data as Geomean, Tolerance Band, and Average Curve. I think there could be some value derived from that, and I would find it interesting to take that data myself and use it for EQ: I'd do a Geomean EQ and an Average Curve EQ and see which gave the best listening tests (I'd do that if you measured a headphone I own). There might be something to be learned from that as to which EQ method gives the most reliable results.
 

L5730

Addicted to Fun and Learning
Joined
Oct 6, 2018
Messages
670
Likes
439
Location
East of England
I just searched for the HD650 to see if there were any deals and nearly fell off my chair when I saw £255 at Thomann.de.
Then I realised the UK isn't part of the EU anymore, so it would incur VAT at some point. Add on 20% and it's back in the ~£300-310 range again.

Northern Ireland might become a neat way to avoid VAT once the 'rona is out of the way.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,667
Likes
10,299
Location
North-East
You have that in the form of two channels. On the first point, that is why I listen and develop EQ. I do this by eye and ear as opposed to a mathematical approach of matching to the curve which would assume precise measurements.
I cannot get blood out of stone and magically get you exact measurements when such is not physically possible.

Ha! To clarify, I'm not asking for blood ;) Just for a more representative set of measurements. When you say you show the left and right traces, that's better than a single trace, but it's still two samples drawn from the many more that are possible. How do you decide which positions to include in your review? Your review is what others will see as representative of these headphones, and not everyone is going to be EQing.

Here's an example of an HD650 sitting on a flat plate, moved about 1 mm at a time horizontally (shown on a linear frequency scale). Notice the different results at 6 kHz, for example. What happens if I report any two results at random and they happen to fall near the top, or near the bottom, at 6 kHz? The impression will be very different:

[chart: HD650 on a flat plate, traces at ~1 mm horizontal steps, linear frequency scale]
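A hedged toy simulation of the reporting problem described above (synthetic traces with a notch whose centre moves with seating; not pkane's actual data):

```python
import numpy as np

rng = np.random.default_rng(5)
freqs = np.linspace(4000, 8000, 400)

def seating(notch_centre):
    """Response (dB) with a -15 dB notch whose centre depends on seating."""
    return -15 * np.exp(-((freqs - notch_centre) / 200) ** 2)

family = np.array([seating(6000 + rng.normal(0, 300)) for _ in range(30)])
at_6k = family[:, np.abs(freqs - 6000).argmin()]
print(f"values at 6 kHz span {at_6k.min():.1f} to {at_6k.max():.1f} dB")
pair = rng.choice(at_6k, size=2, replace=False)  # what a two-trace report shows
print(f"one random pair: {pair[0]:.1f} dB and {pair[1]:.1f} dB")
```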
 

Feelas

Senior Member
Joined
Nov 20, 2020
Messages
390
Likes
316
pkane said: Ha! To clarify, I'm not asking for blood ;) Just for a more representative set of measurements. [...]
I guess it's not impossible in theory to give out a family of measurements, but have fun doing 1 mm-stepped testing (or any step, really) of headphones on a GRAS rig without special precision apparatus to make the process repeatable.

It would help much more to make people aware of the variance inherent in the FRs produced by such measurements, though.
 

pkane

Master Contributor
Forum Donor
Joined
Aug 18, 2017
Messages
5,667
Likes
10,299
Location
North-East
Feelas said: I guess it's not impossible in theory to give out a family of measurements [...]

I've thought about a computer-controlled linear X-Y stage to move the headphone around. With a flat plate, it'll be easier to reposition the mic. But with a GRAS rig, probably not so easy to automate this.
 

Feelas

Senior Member
Joined
Nov 20, 2020
Messages
390
Likes
316
pkane said: I've thought about a computer-controlled linear X-Y stage to move the headphone around. [...]
Makes sense, although you'd then have to work out some universal method for selecting the properly sealed and representative measurements, which in itself isn't easy, I think. Either way, the problem lies in the lack of industry-standard gear for that, I guess.

I've actually wondered about something related a few times: what if the lack of seal is sometimes engineered into the headphones, and by sealing them we're doing a bad job?
 

solderdude

Grand Contributor
Joined
Jul 21, 2018
Messages
16,001
Likes
36,217
Location
The Neitherlands
Feelas said:
I guess it's not impossible in theory to give out a family of measurements, but have fun doing 1 mm-stepped testing (or any step, really) of headphones on a GRAS rig without special precision apparatus to make the process repeatable.

That, however, would ONLY show how the test rig reacts. Your head and your perception will definitely react differently, and the differences can be many, many dB.
It will differ per headphone as well, so you don't really chart anything, and the more you test, the closer you will get to the average 'curve' provided by the manufacturer, which doesn't really fit any exact measurement.
 