OP
Serge Smirnoff
- #41
Test vectors for the examples from the PDF: https://www.dropbox.com/s/bw32imb3eghrjk5/diffrogram2.4.rar?dl=0
Very happy to see you here and present your work to some of the folks here. I was wondering if you would ever swing by.
Thank you! I'm trying hard. There will be something interesting ahead.
Great results! You probably use the 20lg definition of the Df level while I use 10lg, so your values are twice mine. Otherwise the results are very close, though some deviation from my Df levels is possible due to the linearity issue discussed above and the accuracy of the FFT computation in your case. Mathematically, the problem of linear time warping has only one possible solution, which is implemented in my algo. In the linear case, the precision of df computation can be compared between our computational methods. By additionally applying a non-linear correction of the time scale you can achieve better (lower) df values. I also tried a similar correction method (in the time domain) at the beginning of my research. Such a non-linear operation is known as canonical time warping. A similar approach is used in image registration, which aims at matching images in biomedicine and computer vision [https://en.wikipedia.org/wiki/Image_registration]. I returned to the linear transformation for two reasons:
- I have some doubts that canonical time warping is a deterministic operation with only one solution. I could not develop a reliable algo for the purpose.
- It turned out that in audio such non-linear correction is not necessary, due to the nature of distortions in audio and the practical convenience of computing df levels piece-wise with some window. Moreover, linear correction makes slow time deformations of the signal visible on diffrograms. In practice they are very rare and usually point to some drawback in circuitry design. Most modern audio solutions do not have this problem.
So I returned to linear time warping as simpler, more interpretable and more helpful in audio.
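To make the idea of linear time warping concrete, here is a minimal sketch, not the actual diffrogram algorithm. It assumes the output differs from the reference only by a constant clock-rate factor, and it finds that factor by a brute-force search for the lowest df. The names `linear_warp` and `best_linear_rate`, the linear interpolation and the candidate grid are all illustrative assumptions.

```python
import numpy as np

def df_level(a, b):
    """Df level in dB, using the 10*log10(1 - r) definition from this thread."""
    r = np.corrcoef(a, b)[0, 1]
    return 10.0 * np.log10(max(1.0 - r, np.finfo(float).tiny))

def linear_warp(signal, rate, offset=0.0):
    """Resample `signal` at positions n*rate + offset (linear interpolation)."""
    n = np.arange(len(signal))
    return np.interp(n * rate + offset, n, signal)

def best_linear_rate(reference, output, rates):
    """Brute-force search: return the rate (and df) that minimizes df."""
    levels = [df_level(reference, linear_warp(output, r)) for r in rates]
    i = int(np.argmin(levels))
    return rates[i], levels[i]

# Hypothetical test signal: two tones, frequencies in cycles per sample.
def f(t):
    return np.sin(2 * np.pi * 0.01 * t) + 0.5 * np.sin(2 * np.pi * 0.037 * t)

n = np.arange(20000)
true_rate = 1.0002                 # assumed clock mismatch of the "device"
reference = f(n)
output = f(n * true_rate)          # same signal on a slightly faster clock

# Candidate correction rates; the right one is 1 / true_rate.
candidates = 1.0 / np.linspace(0.9995, 1.0005, 11)
best_rate, best_df = best_linear_rate(reference, output, candidates)
print(best_rate, best_df)
```

A real implementation would refine the search well below this coarse grid and would probably use a better interpolator than `np.interp`, since interpolation error itself limits how low df can go.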
Concerning the precision of FFT-based computation, I think some accuracy loss is acceptable in favor of computational efficiency. My algo computes the white noise test vector in 7 min on my notebook, and the sine one takes around 25 min because the search for the minimum goes deeper in that case (lower df levels).
Actually, I realized I was using 20log(1-corrcoeff) when I got -500dB results on the sine test file. I found the difference looking at your matlab code, so I changed it to 10log to be consistent. The results I posted were already with 10log.
Still I'm pretty sure we lost x2 multiplication somewhere. We'll find it later for sure.
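For reference, the 10log versus 20log difference discussed here can be sketched as follows. This only illustrates the formula as written in this thread, log10(1 - corrcoeff) scaled by 10 or 20. The `scale` parameter and the noisy test signal are assumptions for the demonstration.

```python
import numpy as np

def df_level(reference, output, scale=10.0):
    """Df level in dB: scale * log10(1 - r), r = Pearson correlation coefficient."""
    r = np.corrcoef(reference, output)[0, 1]
    # Guard against log10(0) when the two signals are perfectly correlated.
    return scale * np.log10(max(1.0 - r, np.finfo(float).tiny))

rng = np.random.default_rng(0)
x = rng.standard_normal(48000)
y = x + 0.1 * rng.standard_normal(48000)   # hypothetical degraded copy of x

df10 = df_level(x, y, scale=10.0)   # 10log definition
df20 = df_level(x, y, scale=20.0)   # 20log definition
print(df10, df20)
```

The two definitions differ by exactly a factor of 2 in dB, so mixing them up in one place is enough to produce a x2 discrepancy between two sets of results.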
For the next steps I'll run the analysis on a few real test tracks with known amounts of distortion to see how df metric fares against this. Do you recommend using simple test signals (sine waves, white noise, etc.) or real music files? I can do both.
You can choose/design any samples you like (be creative). I think 30 s tracks of any sample rate are OK. When the test vectors are ready, please share them so I can process them too. We'll compare the results.
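As one possible starting point for such test vectors, here is a rough sketch that generates a 30 s sine and a 30 s white noise track and applies a known distortion. The sample rate, tone frequency, levels and the tanh soft clipper are all arbitrary assumptions, not choices made in this thread.

```python
import numpy as np

SAMPLE_RATE = 44100   # assumed; any sample rate should do
DURATION_S = 30       # 30 s tracks, as suggested above

t = np.arange(SAMPLE_RATE * DURATION_S) / SAMPLE_RATE
rng = np.random.default_rng(1)

sine = 0.5 * np.sin(2 * np.pi * 1000.0 * t)   # 1 kHz tone (assumed frequency)
noise = 0.25 * rng.standard_normal(t.size)    # white Gaussian noise

def soft_clip(x, drive):
    """Hypothetical 'known' distortion: tanh soft clipping, unity small-signal gain."""
    return np.tanh(drive * x) / drive

distorted_sine = soft_clip(sine, drive=2.0)
distorted_noise = soft_clip(noise, drive=2.0)
```

Varying `drive` gives a family of outputs with increasing distortion, which can then be compared against the df levels computed for each.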
I'll be the first to admit that much of the math is a bit "beyond my pay-grade"...but I'm slowly working through the paper and the site. It's very interesting and I look forward to the day a more "layman-digestible" format exists (though the site does go a decent way towards this). Thanks for this.
Thanks for your interest. Please, ask questions.
yes, good idea
NoWarp df measurements:
View attachment 42214
-14.3661, -14.0887, -13.8006
Warped df measurements:
View attachment 42215
-14.3709, -14.0912, -13.8007
I think if you reduce the number of FFT points you will get even better (lower) Df measurements, as the reduction hides some details of both signals and their correlation increases.
@Tks - I'm right there with you on the rabbit hole thing... happens more than I want to admit - but it usually ends up in some level of comprehension. Significantly more productive than watching a TV show at least in all cases.
Great! So it looks like my df calculation is off by about -4dB then. Let me double-check (I am discarding the first and the last 100ms segments, are you doing the same?)
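To compare like with like on the edge segments, a piece-wise df computation could look roughly like the sketch below. It assumes non-overlapping 100 ms windows and drops the first and last one. The function name `windowed_df` and the windowing details are assumptions, not a description of either implementation in this thread.

```python
import numpy as np

def df_level(a, b):
    """Df level in dB, using the 10*log10(1 - r) definition from this thread."""
    r = np.corrcoef(a, b)[0, 1]
    return 10.0 * np.log10(max(1.0 - r, np.finfo(float).tiny))

def windowed_df(reference, output, sample_rate, window_s=0.1, drop_edges=True):
    """Df level per non-overlapping window; optionally discard first and last window."""
    win = int(round(window_s * sample_rate))
    n_win = min(len(reference), len(output)) // win
    levels = np.array([
        df_level(reference[k * win:(k + 1) * win], output[k * win:(k + 1) * win])
        for k in range(n_win)
    ])
    return levels[1:-1] if drop_edges else levels

rng = np.random.default_rng(2)
fs = 8000                                   # assumed sample rate for the demo
x = rng.standard_normal(fs)                 # 1 s of noise: ten 100 ms windows
y = x + 0.05 * rng.standard_normal(fs)      # hypothetical degraded copy

levels = windowed_df(x, y, fs)
print(len(levels), np.median(levels))
```

How the per-window levels are then reduced to a single number (median, mean, or something else) is left open here, since the thread does not specify it.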