A few weeks ago I started an experiment with Claude Code to develop a full Trellis SDM (sigma-delta modulator) for DSD playback.
Since Claude was still working decently at the time (right now it's a mess), I kept adding more and experimenting with everything.
I even developed a full GPU Trellis SDM running over 252 parallel segments, which was a dead end but quite interesting as a learning exercise.
In the end I wrote my own algorithm to stitch parallel SDMs, and it works pretty well at DSD256/DSD512 with 4 segments.
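To give a rough idea of what running an SDM over parallel segments involves, here is a minimal sketch. It is not the stitching algorithm used in the plugin, which isn't described here. It assumes one generic approach: each segment starts a little early (a warm-up overlap) so the loop state has settled by the time its real output begins, and the warm-up bits are thrown away. The names `modulate`, `modulate_in_segments` and `warmup` are made up for illustration, and the modulator is a plain first-order one to keep it short.

```python
def modulate(x):
    """First-order 1-bit sigma-delta modulator (deliberately simple)."""
    acc, bits = 0.0, []
    for sample in x:
        acc += sample                 # integrate the input
        y = 1 if acc >= 0 else -1     # 1-bit quantiser
        acc -= y                      # feed the output back
        bits.append(y)
    return bits


def modulate_in_segments(x, segments=4, warmup=64):
    """Modulate x in independent segments and concatenate the results.

    Assumption: every segment after the first is started `warmup` samples
    early and those extra output bits are discarded. The segments do not
    depend on each other, so each one could run on its own core.
    """
    n = len(x)
    bounds = [n * k // segments for k in range(segments + 1)]
    out = []
    for start, end in zip(bounds, bounds[1:]):
        lead = min(warmup, start)     # the first segment has nothing to overlap
        bits = modulate(x[start - lead:end])
        out.extend(bits[lead:])       # drop the warm-up bits
    return out


if __name__ == "__main__":
    dc = [0.25] * 4000
    stitched = modulate_in_segments(dc, segments=4, warmup=64)
    print(len(stitched), sum(stitched) / len(stitched))
```

With a higher-order or trellis modulator the seams are much harder to hide than in this toy case, which is presumably why a dedicated stitching algorithm was needed.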
It's good enough to run the Trellis SDM at DSD512 on my 5950X, even if only with two candidates.
You really need a high-end, top-of-the-line, latest-gen CPU to run Trellis with even average settings.
But even with 2 candidates the quality is still superior to the greedy PreCorr SDM, which has the advantage that it can run with negligible computational requirements.
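For readers who haven't met the greedy vs. candidates distinction, here is a minimal textbook-style sketch. It is not the plugin's Trellis or PreCorr code. It assumes a very simple cost (the running sum of squares of the doubly integrated error, which penalises low-frequency error) and keeps the `m` cheapest candidate bit sequences at every sample. The function name `candidate_search_sdm` and the cost choice are assumptions for illustration only.

```python
def candidate_search_sdm(x, m):
    """1-bit SDM that keeps the m cheapest candidate bit sequences.

    Assumed cost: sum of squares of the doubly integrated error (x - y).
    With m=1 only the locally best bit survives each step, which is a
    greedy second-order modulator. Larger m lets a locally worse bit
    survive if it turns out cheaper a few samples later.
    """
    # Each candidate is (accumulated cost, integrator 1, integrator 2, bits).
    paths = [(0.0, 0.0, 0.0, [])]
    for sample in x:
        grown = []
        for cost, i1, i2, bits in paths:
            for y in (1, -1):                  # try both output bits
                n1 = i1 + (sample - y)         # first integration of the error
                n2 = i2 + n1                   # second integration
                grown.append((cost + n2 * n2, n1, n2, bits + [y]))
        grown.sort(key=lambda p: p[0])         # cheapest candidates first
        paths = grown[:m]                      # prune to m survivors
    best = paths[0]
    return best[3], best[0]                    # bits and their total cost


if __name__ == "__main__":
    import math
    sine = [0.5 * math.sin(2 * math.pi * n / 200) for n in range(1000)]
    for m in (1, 2, 4):
        _, cost = candidate_search_sdm(sine, m)
        print(m, round(cost, 2))
```

The work per sample grows with the number of candidates (twice as many trial updates plus a sort), which is the basic reason the candidate count is so CPU-hungry at DSD512 rates. A real implementation would also emit bits after a fixed latency instead of copying whole bit lists as this sketch does.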
The DSP is still very rough and quite buggy, but it already works surprisingly well, and since it's open source it can be used as a resource or codebase by others.
Today I even added a full convolution filter for up to 6 channels that works on both DSD and PCM: simple on CPU, and CUDA GPU-accelerated up to 4M taps.
I set a strict rule to never decimate DSD to PCM rate and was able to meet that goal: direct DSD to DSD with DSD-Wide, volume control with DSD-Wide, and convolution without PCM decimation.
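As an illustration of the "no decimation" idea, here is a minimal sketch of a volume control that stays at the DSD sample rate throughout. It is not the plugin's implementation. It assumes DSD-Wide here means a multi-bit signal at the DSD rate: the 1-bit stream is smoothed with a short boxcar filter at full rate, scaled, and then re-modulated to 1 bit. The function name `dsd_volume`, the 8-tap boxcar and the first-order re-modulator are all placeholder choices.

```python
from collections import deque


def dsd_volume(bits, gain, taps=8):
    """Scale a +1/-1 DSD bitstream by gain (0..1) without changing its rate.

    Assumed pipeline: boxcar average at the DSD rate (multi-bit "wide"
    sample) -> gain -> first-order 1-bit re-modulation.
    """
    window = deque([0] * taps, maxlen=taps)   # last `taps` input bits
    total = 0                                 # running sum of the window
    acc = 0.0                                 # re-modulator integrator
    out = []
    for b in bits:
        total += b - window[0]                # slide the boxcar by one sample
        window.append(b)                      # maxlen drops the oldest bit
        wide = gain * total / taps            # multi-bit sample, still DSD rate
        acc += wide
        y = 1 if acc >= 0 else -1             # back to 1 bit
        acc -= y
        out.append(y)
    return out


if __name__ == "__main__":
    source = [1, 1, 1, -1] * 1000             # bitstream with average level 0.5
    quieter = dsd_volume(source, gain=0.5)
    print(len(quieter), sum(quieter) / len(quieter))
```

One output bit is produced for every input bit, so nothing is ever brought down to PCM rate. A real chain would use a far better filter and modulator than this.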
Among other things, there's also:
- Anti-pop filter (doesn't really work that well)
- DSD pre-emphasis real-time ML filter (very experimental) using ONNX runtime DX12/CUDA
- Very complex worker system to distribute the load over multiple cores, based on my code, SMT/AMD/Intel aware, supporting CPUSet
- XMOS DAC detection
- GPU FIR/Boxcar/lowpass offload via CUDA/DX12
- PCM conversion via SoX
It requires the latest version of foo_input_sacd to work.
Use at your own peril and good luck.
GitHub - mann1x/foo_dsd_trellis: foobar2000 DSD Trellis SDM DSP plugin — C/MSVC/Windows