I've yet to make the balanced passive attenuator, but I do have the single 10-turn attenuator set up. I ran tests with it between the unbalanced output of the 1010LT and the unbalanced input of the V3 Mono. This does not allow use of the REW S-THD option to find the "sweet spot" for 1010LT output. The Monitor1 was connected between the V3 Mono output probe and the 1010LT balanced input. The goal was to find the best measurement settings for the V3 Mono @5W into 4 ohms.
The Monitor1 was set to the point previously determined to be the optimum for the 1010LT balanced input and remained fixed throughout the test. The effort to find the best output to the V3 required a "manual stepped-THD". That is, the 10-turn pot was set to maximum while the REW output was set so that the V3 output was 5W (4.47V) to start. This provided the lowest possible REW output for 5W.
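For anyone checking the 4.47V figure, it follows from P = V²/R for a resistive load. A minimal sketch in Python (the function name is only for illustration, and a purely resistive 4 ohm load is assumed):

```python
import math

def rms_voltage_for_power(power_w: float, load_ohms: float) -> float:
    """RMS voltage across a resistive load for a given power: V = sqrt(P * R)."""
    return math.sqrt(power_w * load_ohms)

# 5 W into 4 ohms, the target used for all the V3 Mono measurements here
v_target = rms_voltage_for_power(5.0, 4.0)
print(f"{v_target:.2f} V")  # 4.47 V
```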
The "stepped THD" was accomplished by manually increasing the REW output by 0.1dB, then adjusting the 10-turn pot to reduce the V3 output back to 4.47V, making note of significant setting results along the way. As it turned out, each 0.1dB change increased the V3 output by exactly 0.05V. This continued until the REW output was higher than about -10dBFS. Above this point the 1010LT output distortion continually increased.
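As a sanity check on the 0.05V per step, a 0.1dB increase is a voltage ratio of 10^(0.1/20). The sketch below assumes the amp gain is constant over such a small step. The helper name is hypothetical.

```python
def voltage_after_db_step(v_start: float, step_db: float) -> float:
    """Voltage after a level change of step_db, assuming constant gain."""
    return v_start * 10 ** (step_db / 20)

v_start = 4.47  # V3 output at 5 W into 4 ohms
v_new = voltage_after_db_step(v_start, 0.1)
print(f"{v_new - v_start:.3f} V")  # 0.052 V, which rounds to 0.05 V
```

So the observed step is what the dB arithmetic predicts at this output level. It would scale with the starting voltage at other power levels.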
I made a few observations during this. The upper range of the 1010LT output was similar to the 2i2 in that above a certain point the distortion made it unusable for measuring another device. Not surprising I suppose. Another is that the region of lowest self-noise is in that unusable area, again in the upper output range. Because of this, the SINAD/Noise numbers at the settings noted as best for distortion when measuring the V3 are certainly not impressive. In fact, they are disappointing. There's no way to obtain the lowest distortion measurement of the amp at settings where the audio interface self-noise isn't a factor. At least that's my experience with both the 1010LT and the 2i2. I'll know more about the 2i2 when I make the 10-turn pot attenuator for its output control, but I expect that to be no different. Part of this is, of course, due to the relatively high self-noise of these two even near or at full output.
I ran a long series of V3 measurements at various settings of REW and potentiometers. These are the most representative. These all used the 1010LT unbalanced output to balanced input. The 10-turn pot was on the output with the Monitor1 on the feedback probe (direct, no voltage divider) to the input.
First up had the 10-turn pot at 9.7 (near maximum), which required REW to be set at the lowest value that still provided 5W into 4 ohms. The left graph is the classical measurement from the right overlaid on the coherent measurement. I'll have more to say on this later.
Next is the result with the REW output at -22.72dBFS (bad edit in the graph comments) with input at -11.51dBFS vs -19.25dBFS above. The 10-turn pot was again set so that the V3 output was 5W (9.0, not 90 as typed). The most obvious differences are that the noise floor is little changed and most distortion components are similar, with the exception that H2 in the coherent response is significantly worse. This will also be addressed later.
Next is a comparison with the same REW input, but with REW 0.0dBFS output vs -15.0dBFS output. The Monitor1 was untouched for the V3 feedback, only REW and the 10-turn pot were changed. There are significant differences. The most obvious is that the noise floor and noise are much lower with 0.0dBFS output, which is not surprising. The HHD levels are similar. However, what is notable is that all even-order harmonics are worse, especially HD2. All graphs up to this point have been for a 32 average.
This brings me to the point I alluded to earlier. The first four graphs were what I call "cherry picked". What I mean is that I've been aware since early in my work with REW that with a running average, in these cases the default of 32, the harmonic component values can and usually do change rather dramatically if the averaging is allowed to continue. The change is most often biggest in HD2, though all even-order harmonics will change a lot. The latter has more to do with which "sweet spot" is chosen. I focused on HD2 when capturing a graph update. HD2 very often had swings of 10dB or more over time. HHD changed by several dB as well. Noise generally remained stable.
To follow up on that I started making longer averages. I used the REW "Forever" setting with "Stop at" at its initial default of 100. Later I extended that, at one point letting it run overnight to more than 5000. Eventually I found that 100 was fairly close to longer averages, though not always. But what was always the case was that the longer it ran, the lower the numbers. That is, the HD components generally were worse. At some point they stabilize, but I would say that the default of 100 is adequate. Examples are in the graphs below.
The right shows the classical measurement for 32 (red) and 1000 (black) average results. The left is another input level using classical and coherent measurements with classical 32 (red) and coherent 1000 (black). Both of these input levels were points I had made note of during testing that looked promising. Note how close they are for most HD components.
I've posted the above graphs because I have questions related to two aspects. One is about the 32 average that is typically used. In my testing it does not appear to be adequate. This holds for both the 1010LT and the 2i2. I noticed this in measuring the V3 Mono and the Kenwood KM-X1 as well as for audio interfaces in loopback. An average of 100 or more seems to be more accurate. I ran a number of long term measurements, both classical and coherent. As can be seen above, the coherent numbers, even for different input levels (good ones, that is), tend to converge to nearly the same numbers. Cherry picking from a continuing 32-point average provides widely varying results. An average of 32, and at times even fewer, seems to be commonly reported here, but 100 would be more accurate.
This made me curious about the APx results. The company site page, AP More about Averaging, provides some details. It has an option for the number of averages, as does REW. My interpretation is that Synchronous is analogous to Coherent in REW and Power is analogous to Classical (non-coherent, that is) in REW.
Synchronous averaging
Synchronous averaging operates on a time domain acquisition. Synchronous averaging is useful to examine coherent time domain waveforms by reducing the average level (but not the variance) of noise and other non-coherent signals. Since the frequency domain results in a measurement are derived from the time domain acquisition, synchronous averaging affects spectrum results as well.
Power averaging
Power averaging (spectrum averaging) operates only on a frequency domain result. The power spectra for all averaged acquisitions are summed and then divided by the number of acquisitions. Power averaging helps reveal coherent frequency components by reducing the variance (but not the average level) of noise and other non-coherent signals.
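To make the difference between the two definitions concrete, here is a toy sketch in plain Python. It is not AP's or REW's implementation. The frame length, noise level, bin numbers and frame count are arbitrary assumptions, and the test tone is assumed to be phase-locked from frame to frame, which synchronous (coherent) averaging depends on.

```python
import cmath
import math
import random

def bin_power(frame, k):
    """Power in DFT bin k of one frame (single-bin DFT, normalised by N)."""
    n = len(frame)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
    return abs(acc / n) ** 2

def make_frame(n, k_signal, amplitude, noise_rms, rng):
    """One frame: a sine locked to bin k_signal (same phase every frame) plus Gaussian noise."""
    return [amplitude * math.sin(2 * math.pi * k_signal * i / n) + rng.gauss(0.0, noise_rms)
            for i in range(n)]

def compare_averaging(n=64, frames=200, k_signal=4, k_noise=11, seed=1):
    rng = random.Random(seed)
    data = [make_frame(n, k_signal, 1.0, 0.5, rng) for _ in range(frames)]

    # Synchronous: average the time-domain frames first, then take the spectrum.
    mean_frame = [sum(col) / frames for col in zip(*data)]
    sync = {"signal": bin_power(mean_frame, k_signal),
            "noise": bin_power(mean_frame, k_noise)}

    # Power: take the spectrum of every frame, then average the bin powers.
    power = {"signal": sum(bin_power(f, k_signal) for f in data) / frames,
             "noise": sum(bin_power(f, k_noise) for f in data) / frames}
    return sync, power

def db(x):
    return 10 * math.log10(x)

sync, power = compare_averaging()
print(f"signal bin: sync {db(sync['signal']):.1f} dB, power {db(power['signal']):.1f} dB")
print(f"noise bin:  sync {db(sync['noise']):.1f} dB, power {db(power['noise']):.1f} dB")
```

In this model the tone bin comes out about the same either way, while the noise-only bin drops with the number of frames under synchronous averaging and stays near its true average level under power averaging. That matches the quoted descriptions: synchronous lowers the average noise level, power only smooths it.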
My other question is directed to Amir if he's reading this. What type of average is used in the APx and which averaging? I have not been able to find that info here.