
What is timbre and can we measure it?

It would be nice to make a cheap acoustic guitar recording to sound like an expensive highly valued acoustic guitar with certain manipulation.
Perhaps AI could assist once it learns the difference and to apply that to the poor guitar sound recording making the recording sound like a M$.

In fact there are already AI-generated songs that appear to be sung by the original artists but aren't... that's a nice example of faked timbre of the original voice.
 
People thought it would be nice to fix pitch problems with people's singing, didn't they? Now all we get are complaints about that sort of manipulation. Ho hum...
 
And every note may be different. Changing the point where the bow contacts the strings by a couple of millimetres changes the sound. So musicians have an understanding of the different timbres their instruments should produce. Instrument makers (as I referred to before) have to create instruments that can adequately produce the expected sounds from different playing techniques. A classical guitar that produces weak sound when the player plays harmonics is not much use to a player performing music that requires that technique, after all.

IMO, I'm not thinking of every note; to me it's more generic. If I can recognize a given instrument, then that's its timbre. Once the musician's technique changes the sound of the same instrument so much that it is unrecognizable, then, to me, that is a different timbre.

I think the key point here is the second part of the question, can we measure it? I presume that what is intended is the question of whether we can measure the ability of our systems to reproduce different timbres: and standard measurements are how we do that.

If we reserve it just for music production, and IMO we should, then I'm not sure we need to - there would be no point. So, exactly as you said - that is measuring accuracy, transparency or fidelity of a sound reproduction system.

As for voice, the singer is still restricted by the size of the diaphragm, lung capacity, the physical characteristics of their vocal cords and palate, and so on. If you know the voice of a singer well enough, you will recognise it no matter what vocal technique they are using.

Sure.
 
But that delineation basically ignores electronic music. Which obviously has timbre (from individual 'instruments' and/or in aggregate).
If air is being moved, it has the timbre of the source, including a speaker connected to a Hammond B3 or Fender Stratocaster.
 
Come on, only 14 pages of circular discussion. What about Auto-Tune? Does it alter timbre or what?
 
To add to my Musician's observations - Have you heard of situations where "famous musician A" will pick up "totally different genre Musician B's Instrument" and it magically transforms from sounding like Musician B's rig to Musician A's rig - just due to swapping the player?

Heard these stories hundreds of times with the likes of Eddie Van Halen and other famous musicians sounding just like themselves on radically different rigs - and the "other" musicians always being gob-smacked that their rigs sounded like the famous player and his rig - just by changing the player (an "input" issue at its most rudimentary level, lol). Where do y'all pose this falls into the "Timbre" discussion?

Or - How much of "timbre" is the artist in and of themselves? If the musician/artist isn't the single biggest influence on the SOURCE of timbre, then what is?

Switching musicians is likely to have way more impact on timbre than anything else I can fathom unless playing tightly quantized keys or similar.

What about vocal timbres?
Tone is in the fingers, as they say. Often instrument preference is just for the axe that makes it feel like their tone is coming out more easily.
 
A speaker should only reproduce the sound of an acoustic guitar being played within the limitations set by the laws of physics, and, if the speaker is excellent, it should not introduce any significant change to the original sound signal.

Yes, I already understood that is your opinion.

The speaker does not replicate the actual instruments being played but reproduces the sound captured by the microphones used to record them. I believe this common conflation often leads to significant issues in subjective and objective disagreements regarding audio quality assessments.

I don’t think it’s a conflation to notice that loudspeakers can alter the sound of recordings in specific ways, some of which a person might prefer.

And reference to actual instruments and voices is often how we recognize high sound quality.

So I don’t think you can stick only to “is the speaker accurately reproducing the recording” because in the end people want “good sound quality” and “good sound quality” is not determined by “accuracy to the recording” (because recordings can have anything from excellent to very poor sound quality). Therefore “accuracy” cannot be a stand-in for or determinative of sound quality.

And again, one (not all) of the references we have for recognizing sound quality is the apparent naturalness or realism with which a sound system reproduces things like voices and instruments.

And in this case it’s possible to recognize if certain aspects of a loudspeaker are contributing to the greater realism or naturalness in playing a recording.

Take a track in which a female vocal has been recorded in a way that is a bit too sibilant to sound natural. You could take an equalizer and produce a little dip somewhere between 5 and 8 kHz, and the voice will sound more natural.

Alternatively, a loudspeaker may have a dip in that region and produce the same effect - The vocal will sound more natural on that loudspeaker versus another more flat loudspeaker.
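For anyone who wants to see what such a dip looks like in numbers, here is a minimal sketch of a peaking-EQ biquad using the widely circulated Audio EQ Cookbook formulas. The 6.5 kHz centre, -3 dB depth, Q of 1.5 and 48 kHz sample rate are illustrative assumptions, not values from the post, and the function names are made up for this example.

```python
import cmath
import math


def peaking_eq(fs, f0, gain_db, q):
    """Biquad peaking-EQ coefficients (Audio EQ Cookbook form), normalised so a[0] == 1."""
    a_lin = 10 ** (gain_db / 40)          # amplitude factor
    w0 = 2 * math.pi * f0 / fs            # centre frequency in radians/sample
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin]
    a = [1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def gain_db_at(b, a, fs, f):
    """Magnitude response of the biquad at frequency f, in dB."""
    z1 = cmath.exp(-1j * 2 * math.pi * f / fs)   # z^-1 on the unit circle
    h = (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)
    return 20 * math.log10(abs(h))


# Assumed example values: a gentle presence dip centred at 6.5 kHz.
FS = 48000
b, a = peaking_eq(FS, f0=6500.0, gain_db=-3.0, q=1.5)

for f in (100, 1000, 5000, 6500, 8000, 15000):
    print(f"{f:>6} Hz: {gain_db_at(b, a, FS, f):+.2f} dB")
```

The same curve could come from an equalizer setting or from a loudspeaker's own response; the filter only models the frequency-response part of that comparison.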

And perhaps if somebody listens to enough music in which vocals tend to be recorded with some excess sibilance (something I find quite common), they may find that the loudspeaker with a bit of a presence dip tends to reproduce vocals in a somewhat more natural-sounding, less artificial manner.

So we can talk about accuracy but we can also talk about differences in sound quality between loudspeakers.

I’ve mentioned here before that one of my favourite loudspeakers has some level of colouration - there are major cabinet resonances, some port resonance and likely other things going on. But I found quite a number of recordings to sound richer and more realistic on those speakers than on many other neutral speakers. For instance, one of my pet peeves with piano recordings is the “floating piano keys” effect one often gets when listening to a piano recording from the sweet spot of a stereo system. It’s the impression of piano keys tinkling away, detached from any soundboard or body of the piano. What I’m used to when I’m playing a piano or listening to a piano played live near me is the impression of piano keys striking a large resonating solid object.

The piano recordings played through these particular speakers gave the sense of the piano actually having weight, of the piano keys attached to a big resonating body, which gave me more of the impression of a real piano than I got through some more neutral speakers. This also happened with other instruments.

So I certainly have no quarrel with your own desire that speakers simply reproduce the recording with little distortion.

But I think it’s also perfectly reasonable to think about “sound quality” in and of itself, and what may or may not sound more natural or convincing in some ways through some speakers versus others.
 
Transducers can change the frequency response which affects the reproduced signal and thus its timbre but it does not have a timbre.
If it had all instruments and voices would start to sound the same as it would be defined by the timbre of gear. It doesn't.

That’s actually how I tend to experience things. Once I hear a loudspeaker reproducing a variety of instruments I notice a sameness to the sound. I know essentially how drum cymbals or acoustic guitars etc. are going to sound on that speaker forever more. There is a certain colouration that - even if due only to frequency variations - registers in my mind as a “timbre” placed across everything. I’ve had loudspeakers in which everything had, to my ears, a slightly warm “woody” character, whereas other speakers I’ve heard - often ones lacking in the warmth region and maybe with some emphasis in the highs - give me the impression of a sort of silvery metallic tone on practically any music played through that speaker. And these impressions strike me as differences in timbre, just as the difference between a wood or metallic instrument would register as timbre.

I’m not saying you should, on my account, change the way you are thinking about timbre. You are making some excellent points. I’m just reiterating one reason why I tend to, almost helplessly, register differences in loudspeakers as differences in timbre, and not only frequency response.
 
Power outages can be quite peaceful. Unless caused by proximate disaster, I hope you aren't near any wildfires.



I've elaborated previously but you've ignored, so I expect you'll continue to do so. But the steady-state spectrum you describe—with or without micro-tonal variation etc—is a component of timbre. Sometimes that's sufficient for differentiation without considering the overall envelope (or detailed onset/attack and so on). Sometimes as @kemmler3D noted upthread, it isn't. You are simply positing an incomplete definition. I've also posted Schouten's comprehensive description/definition. You can see there the 'subtle stuff' is not ignored at all.
I promised I might get back to Schouten and why I didn't find his description "comprehensive". I will concede it's tidy, but it is not comprehensive, merely more complex and couched in "scientific" language. But it is, from a musician's point of view, a list of 4 things which need to be excluded from timbral consideration and one thing which was almost on point.

Let's take a look at its first statement:

1. "Range between tonal and noiselike character." Let me translate from the academic: "tonal" - that which is easily decomposed (the math was easy), "noiselike" - the math is tough. Statements like this shouldn't inspire confidence in what follows.

2. "Spectral envelope." If he'd written content, instead of envelope, I'd be happily on board. Of course, this could be the translator's mistake.

3. ADSR - that's his belief, (I almost called it his theory) I don't buy it for the same reason(s) I don't buy...

4. Changes both of spectral envelope (formant-glide) and fundamental frequency (micro-intonation), both of which are excluded from the standard plain-language definitions of Timbre, which exclude pitch and intensity.

5. Excluded for the same reason, with a dash of my objection to 1 above.

At best, this wiki-quote is an attempt at redefinition, but I can't see how it adds anything useful to our understanding of timbre.
 
From my own experience generating sound using synths, Timbre is first and foremost the envelope of the sound. Spectral content is second. Both have to be present.

Recognition of the category of instrument (guitar vs. piano vs. violin, or plucked vs. struck vs. bowed) is almost completely reliant on attack, decay, sustain, release. The difference between guitars or between pianos is spectral. And of course the player affects the ADSR of the instrument, but the instrument defines the range of variability in ADSR.

Anyway, if it ain’t got an envelope, it’s not timbre.
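To make the envelope-versus-spectrum point concrete, here is a rough sketch that gives one and the same harmonic recipe two different ADSR envelopes. Everything in it is an assumption for illustration: the piecewise-linear `adsr` helper, the three harmonic amplitudes, and the "plucked" and "bowed" labels, which are loose caricatures and not models of real instruments.

```python
import math

SAMPLE_RATE = 8000  # Hz; kept low so the pure-Python loops stay fast
N_SAMPLES = 4000    # half a second of sound


def adsr(n_samples, attack, decay, sustain_level, release):
    """Piecewise-linear ADSR gain curve; attack, decay and release are in samples."""
    sustain = max(n_samples - attack - decay - release, 0)
    env = [i / attack for i in range(attack)]                               # rise 0 -> 1
    env += [1.0 - (1.0 - sustain_level) * i / decay for i in range(decay)]  # fall 1 -> sustain
    env += [sustain_level] * sustain                                        # hold
    env += [sustain_level * (1.0 - i / release) for i in range(release)]    # fade towards 0
    return env[:n_samples]


def harmonic_tone(freq, harmonic_amps, n_samples):
    """Steady sum of sine harmonics: the 'spectral content' with no envelope at all."""
    return [
        sum(a * math.sin(2 * math.pi * freq * (k + 1) * n / SAMPLE_RATE)
            for k, a in enumerate(harmonic_amps))
        for n in range(n_samples)
    ]


def apply_envelope(tone, env):
    """Multiply the steady tone by the gain curve, sample by sample."""
    return [s * g for s, g in zip(tone, env)]


# Same spectrum for both notes (assumed harmonic amplitudes).
tone = harmonic_tone(220.0, [1.0, 0.5, 0.25], N_SAMPLES)

# Two assumed envelopes: fast attack with long decay, and slow attack with high sustain.
plucked_env = adsr(N_SAMPLES, attack=40, decay=2000, sustain_level=0.1, release=400)
bowed_env = adsr(N_SAMPLES, attack=1600, decay=400, sustain_level=0.8, release=800)

plucked = apply_envelope(tone, plucked_env)
bowed = apply_envelope(tone, bowed_env)

for name, env in (("plucked", plucked_env), ("bowed", bowed_env)):
    peak_ms = 1000 * env.index(max(env)) / SAMPLE_RATE
    print(f"{name}: gain peaks after {peak_ms:.0f} ms")
```

Writing `plucked` and `bowed` out to audio files and listening is the real test; the sketch only shows that the two notes share their harmonic content and differ in nothing but the time envelope.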
 

Now that is succinct. And I agree. I can't understand the "timbre without envelope" proposition really.

Also, I could have saved myself some typing and mental effort if I'd simply waited for this post. :)
 

You are as good as your word.

We aren't going to agree on the "need to be excluded" part obviously. But interesting comments. I don't have a problem with his tonal vs noiselike. I did find the expression 'spectral envelope' a bit odd, assumed it referred to spectral content (vs time envelope) and struck me as Germanic and yes possibly an artefact of translation. Micro-intonation is a transient variation in perceived pitch (occurring in the course of the sonic event) so not the same as the fundamental pitch that people recognise as a note. Accordingly we don't need to exclude it. Nor transient variations in intensity.

The time envelope and quite especially the onset are fundamental to the character of the sound. As @IPunchCholla says immediately above, without the envelope it's barely timbre at all.
 
While I slept, some of you brought up questions about performer effects on timbre or the role of performer in timbre. These arguments have merit, primarily because they focus our attention on the actual production of the sound(s) on to which timbral descriptions might be hung.

For the sake of argument, allow me to temporarily adopt a combined timbre/envelope view of timbre. I use violin only because it provides an easy case study. Most other instruments are also "timbre malleable" depending on the means of excitation chosen. N.b., timbre depends on the manner of excitation, not on envelope, which is a product of the same choice of means.

When someone talks about violin timbre to a violinist, the violinist could easily ask, “WHICH timbre were you talking about: the standard bowed timbre (arco), or pizzicato, or col legno, etc.?” There are dozens of conventional techniques for squeezing sound (and thus timbre) out of a violin, each with its own characteristic timbre and envelope. Yet the average listener to classical music would recognize the violin’s timbre with all but the most atypical performing techniques. In this context, conflating timbre and envelope is an unnecessary complication, which is why I argue for envelope as an independent, but closely coupled, aspect of musical sound; because it is.

In simple terms, the player controls sound production, consequently affecting timbre and envelope independently if in closely associated ways. Coincidence or correlation is not proof of causality (added: or entanglement).

The violin techniques mentioned above can all be notated by the composer, conductor, section leader, or individual player. To some degree they modify the characteristic timbre and define (or constrain) the envelope. The degree to which these choices apply to sampled or synthesized sound depends on the depth (and ready availability) of envelope parameters in the player interface. Some of the physical modeling schemes are quite capable of convincing envelope analogs, but only to the extent the player interface allows.

Responding here to a recent post regarding piano and guitar: the two instruments (the piano especially) almost give credence to the ADSR model, but they are anomalous instruments (again, the "unprepared" piano particularly). Unfortunately, they are favorite instruments of theorists and academics, who are prone to treat them as representative, rather than as well-populated ghettos in the diverse world of musical instruments. ;) Theories which rely too heavily on them are immediately suspect.
 
You are as good as your word.

We aren't going to agree on the "need to be excluded" part obviously. But interesting comments. I don't have a problem with his tonal vs noiselike. I did find the expression 'spectral envelope' a bit odd, assumed it referred to spectral content (vs time envelope) and struck me as Germanic and yes possibly an artefact of translation. Micro-intonation is a transient variation in perceived pitch (occurring in the course of the sonic event) so not the same as the fundamental pitch that people recognise as a note. Accordingly we don't need to exclude it. Nor transient variations in intensity.

The time envelope and quite especially the onset are fundamental to the character of the sound. We'll quite likely not recognise the instrument without it in certain cases. As @IPunchCholla says immediately above, without the envelope it's barely timbre at all.
Thanks for getting some of what I'm saying. It's refreshing to have someone credit another point of view, especially one that seems to threaten orthodoxy. Yes, I read the @IPunchCholla exchange and responded to it at the end of my immediately previous post.

Good day.
 

This is a good explanation as to why people select and prefer certain loudspeakers. I also notice a degree of sibilance in most vocal recordings (I think it's an almost unavoidable ear vs microphone thing). Also a kind of harmonic haze that follows the voice (production effects rather than distortion in the reproduction system). My main loudspeakers have qualities I enjoy, but they don't do anything to minimise those sonic aspects, perhaps the opposite. I sometimes wonder what all this sounds like on more mellow (dare I say "warmer") gear.

An instrument has a timbre (distinct sound).
That timbre can change depending on how it is played even if the tuning of the fundamental remains the same.
Timbre is the spectrum at each moment in time which changes during attack, sustain and decay and all those aspects determine how it sounds.
How it 'sounds' also depends on acoustics, distance, angle even if the timbre of the played note is the same.

Yes. An instrument maker cares about this a lot, of course. I tend to think of an instrument having potential only. But that is unmeasurable at rest. The sound when it is played has timbre, which is when we can measure it.
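As one hedged illustration of "spectrum at each moment in time", the sketch below chops a synthetic note into frames and tracks a single summary number, the spectral centroid, per frame. The toy note (three harmonics, with higher harmonics assumed to die away faster), the frame length, and the choice of centroid as the measurement are all assumptions for the example; a centroid is one crude descriptor and does not capture timbre as a whole.

```python
import math

SAMPLE_RATE = 8000  # Hz; kept low so the naive DFT stays fast
FRAME = 256         # samples per analysis frame (32 ms at this rate)


def decaying_tone(freq, n_samples):
    """Toy note: harmonics 1-3, each decaying faster than the one below (assumed model)."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        out.append(sum(math.exp(-3.0 * k * k * t) / k
                       * math.sin(2 * math.pi * freq * k * t)
                       for k in (1, 2, 3)))
    return out


def magnitude_spectrum(frame):
    """Hann-windowed naive DFT magnitudes for bins 0 .. len(frame)/2 - 1."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    mags = []
    for k in range(n // 2):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        mags.append(math.hypot(re, im))
    return mags


def spectral_centroid(mags):
    """Magnitude-weighted mean frequency in Hz for one frame."""
    bin_hz = SAMPLE_RATE / (2 * len(mags))
    total = sum(mags)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total if total else 0.0


signal = decaying_tone(250.0, 4000)
centroids = [
    spectral_centroid(magnitude_spectrum(signal[start:start + FRAME]))
    for start in range(0, len(signal) - FRAME + 1, FRAME)
]

for i, c in enumerate(centroids):
    print(f"frame {i:2d} (t = {i * FRAME / SAMPLE_RATE:.3f} s): centroid {c:.0f} Hz")
```

The centroid starts high while the upper harmonics are still sounding and drifts down towards the fundamental as they fade, which is the "changes during attack, sustain and decay" idea reduced to one number per frame.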

Tangentially, I had an ultrasound recently, and received the measurements. Very detailed and systematic, and obtained via running what amounts to a digital caliper over the fluctuating image. Electrical engineers will have glazed over by now, but for an acoustician that looks like how we may characterise/analyse a spectrogram.
 
The intention of autotune was to preserve timbre (voice formants, especially) while allowing manipulation of pitch. We can all judge how well it does that. Interesting that one of the things Autotune screws up worst is envelope (by squashing the legato). (A munged envelope is the most obvious clue that Autotune has been used in vocal tracks.)

I stood in line at an AES show, right behind Stevie Wonder, to try out one of the early hardware Autotunes.
 
Back in the days when I did R&D in medical devices, and before AI, I used this tool to learn things, as I am a Mechanical Engineer and not a Clinician. I plugged Timbre into it and here is what I got back.


My favorite, a bit tangential to this discussion: Against Music? Heuristics and Sense-Making in Listening to Contemporary Popular Music

More on point: Analysis and Modeling of Timbre Perception Features in Musical Sounds
 
Come on, only 14 pages of circular discussion. What about Auto-Tune? Does it alter timbre or what?
Yes, Auto-Tune applied to the voice track during a recording, a live performance, or in post-production before mixdown changes timbre. But if it were absolutely perfect (probably impossible) it wouldn't. Maybe generative AI will achieve that?

As an example, I can sing a tune deliberately flat by many notes at some points, perhaps an octave sharp followed by droning it nasally on one note. A recording of this will include my intended timbre. I then in post-production "fix" the tune with Auto-Tune, but inevitably my deliberate timbre will be different.

Applying Auto-Tune at home to my CD of Frank Sinatra's My Way to change the tune to Michael Jackson's Beat It will create a mess.
 
This is a good explanation as to why people select and prefer certain loudspeakers. I also notice a degree of sibilance in most vocal recordings (I think it's an almost unavoidable ear vs microphone thing). Also a kind of harmonic haze that follows the voice (production effects rather than distortion in the reproduction system).

Yes, it really depends on where an individual is coming from in terms of what he/she is looking for, and what their reference point is.

I’m always comparing to the sound of real sound sources because I’ve always found the difference between live and reproduced to be fascinating.

I used to (and still do occasionally) play some very well recorded single vocal tracks (some of which included members of my own family speaking) and have somebody stand in between the speakers where I perceived the vocalist to be coming from in the stereo image. And then I would have that person speak, and go back and forth and compare the gestalt, the general impression, of a real human emitting sound from that spot versus the center image through the speakers.

It’s just so immediately revealing: what always leaps out to me is the mechanical/electronic/artificial nature of the reproduced sound, versus the distinctly softer, subtler, denser and especially organic quality of the actual human voice - that specific sound you get emitting from the wet, damped material of the human throat, the chest resonance etc. That’s one reason why for me “more organic” is a personal touchstone for a quality that I seek in reproduced sound. And all I have to please is myself, not somebody else.

When I’m listening to an audio demo at a store, or an audio show, with the inevitable solo vocal track, you can get incredibly vivid images out of such demos, but when I close my eyes and compare the sound of the reproduced voice to the sound of someone inevitably talking in the room or nearby, it always reveals an obvious difference between the human voice and the mechanical reproduced voice.

And I remember at one audio show where I had done this in many rooms, one room above all stood out, and it wasn’t remotely one of the more expensive speaker systems.
They were just using a stand-mount pair of Harbeth 30 speakers. The vocals sounded more organic and human than I’d heard from any other system at the show, and when I closed my eyes to compare to other voices in the room, I was like “damn, that’s close!”

I ended up owning Harbeth speakers for a little while and found their reputation for natural voice reproduction to be well-founded in my experience. For whatever reason, vocals reliably sounded more human and natural than any other speaker I’ve owned.
Even small background vocals in the distance in processed pop recordings sounded a little more like “human beings, singing back there.”

(especially, IMO, using my tube amps, which for me nudged the sound even further in the organic direction).

These are the types of reasons and experiences why I find talk of “accurately reproducing the recording” to certainly be worthwhile, but it doesn’t capture everything interesting happening in reproduced sound (or at least everything that I am interested in).

My main loudspeakers have qualities I enjoy, but they don't do anything to minimise those sonic aspects, perhaps the opposite. I sometimes wonder what all this sounds like on more mellow (dare I say "warmer") gear.

Like I’ve mentioned before, I also auditioned the Audio Physic Avanti. (And I’ve had the AP Virgos, Libras and Scorpios in my listening room.) So I know what you mean.

I think it’s a fine line one might want to walk.

I personally wouldn’t want so much colouration that I give up on being informed about the nature of the recordings. Both because the distinct qualities of recordings themselves are part of the fun, and also of course the recordings contain that timbral information you don’t want to lose, so you want to make sure you don’t homogenize that away.

I’m always trying to find that fine line, which is why for my own system I tend to be OK with a little bit of colouration that nudges it in the direction I want, but not to the point of too much obvious colouration or homogenization.
 