Level 1: a speaker response has to be flat.
A neutral response would be a flat line across the frequency spectrum. The curve we have to work with is not neutral at all. It's a common scooped listening curve (aka the smiley face).
(This is the curve I aim for when mixing live, more or less, so I'm not knocking it at all)
Here's a test to try. Listen to Bob Katz's mastering job of Future of Forestry's "Life Begins Today" on a pair of Beyerdynamic DT990 Pros, and see if it sounds as terrible as the review suggests it might. My hypothesis is that it will sound pretty darn good. Even if you don't have corrective EQ and a pair of these, try it out. Does it sound horrendous?
Level 2: a speaker's off-axis response also counts toward the perceived response, so a good speaker measures as a linear slope tilted downward.
Level 3: with headphones and earphones the sound is shaped by the shape of the ear, so a response needs ear gain added before it is perceived as flat.
Level 4: translating a sloped, linear speaker response to earphones gives a sloped response with ear gain added.
Now you are up to speed.
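
To make the four levels concrete, here is a rough Python sketch of how a Level 4 target could be built: a downward tilt (Level 2) plus an ear-gain bump (Level 3). All the numbers are my own placeholders, not measurements or any published target: the -1 dB/octave slope, the 10 dB bump centered at 3 kHz, its one-octave width, and the function names `tilt_db`, `ear_gain_db` and `headphone_target_db` are assumptions chosen only for illustration.

```python
import math


def tilt_db(freq_hz, slope_db_per_octave=-1.0, ref_hz=1000.0):
    """Level 2: a straight line on a log-frequency axis, 0 dB at ref_hz.

    The -1 dB/octave default is an illustrative placeholder.
    """
    return slope_db_per_octave * math.log2(freq_hz / ref_hz)


def ear_gain_db(freq_hz, peak_db=10.0, center_hz=3000.0, width_octaves=1.0):
    """Level 3: a hypothetical ear-gain bump, modeled as a Gaussian in octaves.

    Peak height, center and width are assumed values, not measured data.
    """
    octaves_from_center = math.log2(freq_hz / center_hz)
    return peak_db * math.exp(-0.5 * (octaves_from_center / width_octaves) ** 2)


def headphone_target_db(freq_hz):
    """Level 4: the tilted line with the ear-gain bump added on top."""
    return tilt_db(freq_hz) + ear_gain_db(freq_hz)


# Print the sketched target at a few spot frequencies.
for f in (20, 100, 1000, 3000, 10000, 20000):
    print(f"{f:>6} Hz  {headphone_target_db(f):+6.1f} dB")
```

Setting `slope_db_per_octave=0.0` gives the Level 3 case (ear gain only), and dropping the ear-gain term gives the Level 2 speaker case.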