Multitone test files as an alternative to music tracks for loudspeaker listening evaluation

TiFi Speakers

The work by Amirm, Erin, and others on loudspeaker measurements and controlled listening deserves high recognition. What remains unclear to me is the reliance on music tracks for subjective listening: how can one ensure that a chosen piece reliably exposes a specific loudspeaker flaw? Meaningful listening of this kind cannot be done quickly and often accumulates into hundreds of hours. In my own experience, identifying one loudspeaker issue took weeks, and another response-related problem only became apparent after several months. After reading long discussions about the causes of unusual distortion behavior in an ELAC DF63 floor-standing speaker, I became convinced that simple, purpose-built test signals are needed to expose loudspeaker weaknesses before moving on to music.

Assuming the loudspeaker has already been measured and is of reasonable quality, listening should primarily answer whether small deviations seen in measurements are actually audible and objectionable under real listening conditions.

Creating test files is easy; generating sweeps or multitone signals is not the hard part. The real challenge is how to identify problems reliably without musical training or “golden ears.” For that reason, clear listening instructions are at least as important as the signals themselves. In practice, I find some form of measurement necessary; at a minimum, a smartphone app such as Spectroid is useful for identifying problematic frequencies and relative level changes.

Some examples of potentially useful listening signals:
  • Slow logarithmic sweep – Reveals whether low-frequency room or loudspeaker resonances produce a “multitone” bass character. It also helps assess the combined effects of placement, room modes, reflections, driver interference, and directivity at the listening position, assuming constant sweep level.
  • Slow sweep with added 2nd harmonic (–40 dB) – A near-threshold harmonic makes the loudspeaker’s own harmonic distortion more audible. Frequencies where strong multitone artifacts appear should be noted.
  • Slow sweep with added 3rd harmonic (–40 dB) – As above, but also reveals whether multiple harmonics coincide and reinforce the same frequency, which can be problematic with music.
  • Bass timing using two-frequency transients – For example, 70 Hz combined with 2 kHz, with the high-frequency component delayed in steps (2–10 ms). The perceived point of time coincidence indicates whether bass is correctly aligned or delayed.
  • Pink noise for level reference – While problems can be evaluated at any level, it is useful to know the approximate SPL at which an issue appears. This can be approximated using band-limited pink noise (e.g., 500–2000 Hz) at 1 m and a smartphone SPL app, without changing volume between tests.
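The bass-timing signal from the list could be generated along the following lines. This is only a sketch under assumptions that the thread does not state: both bursts are Hann-windowed, the delay is measured between the burst onsets, and the burst lengths, the amplitude, and the names `windowed_burst` and `two_tone_transient` are illustrative choices rather than the contents of the attached files.

```python
import math


def windowed_burst(freq_hz, duration_s, sample_rate, amplitude):
    """Return one Hann-windowed sine burst as a list of floats."""
    n = int(round(duration_s * sample_rate))
    burst = []
    for i in range(n):
        # Hann window: rises from 0, peaks mid-burst, returns to 0.
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        tone = math.sin(2 * math.pi * freq_hz * i / sample_rate)
        burst.append(amplitude * window * tone)
    return burst


def two_tone_transient(delay_ms, low_hz=70.0, high_hz=2000.0,
                       sample_rate=44100, low_duration_s=0.1,
                       high_duration_s=0.005, amplitude=0.35):
    """Mix a low burst with a high burst whose onset is delayed.

    Assumption: delay_ms is the onset-to-onset delay of the high burst.
    Amplitude 0.35 per burst keeps the sum below full scale.
    """
    low = windowed_burst(low_hz, low_duration_s, sample_rate, amplitude)
    high = windowed_burst(high_hz, high_duration_s, sample_rate, amplitude)
    offset = int(round(delay_ms * 1e-3 * sample_rate))
    mix = [0.0] * max(len(low), offset + len(high))
    for i, s in enumerate(low):
        mix[i] += s
    for i, s in enumerate(high):
        mix[offset + i] += s
    return mix
```

Calling `two_tone_transient` with delays of 2, 4, 6, 8 and 10 ms would give the stepped series; each result can then be scaled to 16-bit integers and written with Python's `wave` module.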
Open questions remain: are square waves or other explicitly multitone signals useful, and which audible issues—beyond room effects, directivity, interference, and harmonic distortion—benefit from short, targeted listening tests?

The attached files (and translation) were generated with the help of ChatGPT. Please verify the files before use—you never know with AI.

My hope is that these files can be refined into more practical listening tools and expanded with additional signal types for evaluation prior to music listening. There are many variables, and even small changes (levels, markers, tone combinations) can significantly improve usefulness.
 


Sample request to ChatGPT for one of the generated files, with the file specifications:
Format: FLAC
Sample rate / bit depth: 44.1 kHz / 16-bit (CD quality)
Duration: 90 seconds (10 s/octave)
Sweep: logarithmic
Frequency range: 20 Hz → 10,240 Hz
Fundamental tone: –3 dBFS
Added harmonics: H2 (2nd harmonic) only, at –40 dB relative to the fundamental; no H3 or higher harmonics
Channels: mono
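
For readers who prefer to see the specification as code, here is a minimal sketch of a signal with the values listed above, using only the Python standard library. It is not the script that ChatGPT produced. The function names are made up for illustration, the output is WAV rather than FLAC (the standard library has no FLAC writer), and switching the harmonic off above the Nyquist limit is my own assumption.

```python
import math
import struct
import wave


def log_sweep_with_harmonic(f_start=20.0, f_end=10240.0,
                            seconds_per_octave=10.0, sample_rate=44100,
                            fundamental_dbfs=-3.0, harmonic_order=2,
                            harmonic_rel_db=-40.0):
    """Return float samples (-1..1) of a log sweep plus one added harmonic."""
    octaves = math.log2(f_end / f_start)        # 20 Hz -> 10,240 Hz is 9 octaves
    duration = octaves * seconds_per_octave     # 9 * 10 s = 90 s
    n = int(round(duration * sample_rate))
    a1 = 10 ** (fundamental_dbfs / 20)          # -3 dBFS is about 0.708
    ah = a1 * 10 ** (harmonic_rel_db / 20)      # -40 dB is 1 % of the fundamental
    # Instantaneous frequency f(t) = f_start * 2**(t / T_oct); its integral
    # gives the phase: 2*pi*f_start*T_oct/ln(2) * (2**(t / T_oct) - 1).
    k = 2 * math.pi * f_start * seconds_per_octave / math.log(2)
    samples = []
    for i in range(n):
        t = i / sample_rate
        growth = 2 ** (t / seconds_per_octave)
        phase = k * (growth - 1)
        value = a1 * math.sin(phase)
        # Assumption: drop the harmonic once it would exceed Nyquist, to
        # avoid aliasing (a hard cut; a short fade would be gentler).
        if harmonic_order * f_start * growth < sample_rate / 2:
            value += ah * math.sin(harmonic_order * phase)
        samples.append(value)
    return samples


def write_wav_16bit(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM; convert to FLAC afterwards with any encoder."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        clipped = (max(-1.0, min(1.0, s)) for s in samples)
        w.writeframes(b"".join(struct.pack("<h", int(round(s * 32767)))
                               for s in clipped))


# Example (takes a few seconds in pure Python):
# write_wav_16bit("sweep_h2.wav", log_sweep_with_harmonic())
```

With these values the second harmonic tops out at 20,480 Hz, just below the 22,050 Hz Nyquist limit of a 44.1 kHz file. A third harmonic would cross that limit once the fundamental passes 7,350 Hz, which is why the sketch includes the cut-off.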
 
Indeed a valid question. I am sure that the major speaker and driver manufacturers have tried, and still use, various measurement tools. Music alone cannot reveal specific problems of a speaker. This is why some exhibitors at audio shows play music with little bass content and instruments that already have a rich tonal spectrum, so that some distortion is masked for the ear. I have not measured many speakers, but I can say that in a regular living room it is not possible to measure with slow sweeps. Everything in the room resonates at some frequency and rattles, even the windows. You need an acoustically dead room, so home measurements will not show the full picture. By the way, there is literature and there are YouTube videos on speaker measurement. But I am interested in further thoughts here.
 
For whatever it's worth, as of today, I can throw 3, 4 (music) test tracks at a speaker, and pretty much immediately know 90% of what I want to know about it. And that doesn't mean I'm a dullard or not picky. I am picky - a lot of speakers immediately fail the first music title for me. And only for the last few points of minor contention between already excellent speakers might I need substantially more time and effort. And I doubt I would let synthetic signals replace music for that purpose.

Synthetic signals for measurements, real signals for listening - how about that approach? :)
 
From an engineering perspective, music is also a multitone signal. Therefore, I fully agree that a test signal can also be musically composed. What still troubles me is this: if a sine sweep played through a loudspeaker in a living room sounds horrible and is essentially impossible to evaluate, how can my perception and brain assess loudspeaker-induced problems in a piece of music, when we know from the sweep how terrible the system actually sounds? Bridging this gap would require approaching the problem from either end, or both ends, by making either the test signal or the music progressively more revealing of the loudspeaker.

Again, the goal is not to replace either measurement or music listening, but to fill the gap in between.
 
Regenerated all files using Python, as I did not understand how the AI selected some of the parameters. In the Python files, most parameters are defined at the beginning, so you can easily adjust them, regenerate the test files, and share the results if some versions work better for listening tests.

The test files are explained in the first post. If you are unsure, you can open the Python files in Notepad; even without any coding knowledge, it is clearly visible what was generated and how.
 


"What remains unclear to me is the reliance on music tracks for subjective listening: how can one ensure that a chosen piece reliably exposes a specific loudspeaker flaw?"
You're right and the listening tests here are generally secondary. They can "confirm" the measurements or demonstrate the practical implications of what was measured.

I haven't read as many of Erin's reviews, but it seems like Amir does measurements first, then listens and experiments with EQ, starting with EQ adjustments based on measurements. Neither of them claims to be doing properly controlled listening tests as part of their everyday reviews. They are listening carefully and critically, but this kind of listening is categorized as "casual".

I wouldn't know what a good multitone test sounds like and I suspect it would take lots of training to learn to make sense of it. And, I'm still not sure if it would be useful. Amir does multitone measurements.

I've heard white noise and pink noise plenty of times, but I'd trust measurements far more than my ears. I probably wouldn't notice a 3 dB frequency response difference with pink noise from one day to the next... (Noise-based measurements can have variability too since noise contains randomness.)

Music has a sort of built-in reference, since we have a feel for how different instruments and musical sounds should sound in relation to each other, especially if we are familiar with the recording.
 
It is hard to understand what you are proposing.

Our audio cortex is trained by voice and music since birth. It is not trained on test tones, and the response in the room is highly determined by the room. Any mix or mastering engineer will bring reference tracks into a new room. I have been very fortunate to have reference live instrument experience.
 
Just for your reference...

1. "SONY Super Audio Check CD 48DG3, 1983" ref. here #651 on my project thread.

2. I would also recommend establishing your own "Audio Reference/Sampler Music Playlist", kept consistent over many years and made up of excellently recorded tracks from various genres that fit your music preferences. For example, you can find my own such playlist here on my hosting thread on the subject.

If you are seriously interested in having all the intact/non-compressed tracks of 1. and 2. above, please simply PM me.

Edit:
In case you are also interested in the tone-burst test signals which I prepared and applied in the time-alignment (phase-continuity) and transient (step response) measurements and tunings of my multichannel system, shared in my posts under the spoiler cover, please simply PM me; I will be happy to share them with you.
- Precision measurement and adjustment of time alignment for speaker (SP) units: Part-1_ Precision time-shifted pulse wave matching method: #493
- Precision measurement and adjustment of time alignment for speaker (SP) units: Part-2_ Energy peak matching method: #494
- Precision measurement and adjustment of time alignment for speaker (SP) units: Part-3_ Precision single sine wave matching method in 0.1 msec accuracy: #504, #507

- Measurement of transient characteristics of Yamaha 30 cm woofer JA-3058 in sealed cabinet and Yamaha active sub-woofer YST-SW1000: #495, #497, #503, #507

- Identification of sound reflecting plane/wall by strong excitation of SP unit and room acoustics: #498
 