Sorana Flow v1.8.2 — SIMD-Optimized DSP, Lock-Free RT Path, 61 Correctness Fixes
v1.8.2 is available. This release focuses on measurable audio engine improvements and correctness.
DSP Performance
All compute-heavy DSP has been moved to Apple's vDSP/Accelerate SIMD:
- Convolution (partitioned overlap-add): vDSP_zvma complex multiply-accumulate → 3-5× throughput
- HRTF binaural: vDSP_conv block-based FIR replaces per-sample scalar loop → 5-10× throughput
- Minimum phase EQ: vDSP_biquad cascaded IIR → 2-3× throughput
- Gain ramps: vDSP_vrampmul replaces per-sample multiply
- Peak limiter: Padé [3,3] rational tanh approximation replaces std::tanh
DSP module compiles with -O3 -ffast-math -flto=thin.
RT Thread Safety
The real-time audio callback now has:
- Zero blocking mutex locks (all converted to try_lock or lock-free atomics)
- Zero heap allocations (all buffers pre-allocated in prepare())
- Lock-free staged swap for convolution IR, HRTF filters, and EQ kernels
- NaN/Inf sanitization in the peak limiter chain
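The staged-swap idea from the list above can be sketched roughly as follows, assuming one control thread and one audio thread. `Kernel`, `StagedKernelSwap`, and all member names are hypothetical; the actual engine may hand off and reclaim memory differently.

```cpp
#include <atomic>
#include <memory>
#include <vector>

// Hypothetical kernel type; stands in for an IR, HRTF set, or EQ kernel.
struct Kernel {
    std::vector<float> taps;
};

class StagedKernelSwap {
public:
    ~StagedKernelSwap() {
        delete m_staged.load();
        delete m_retired.load();
        delete m_active;
    }

    // Control thread: build the kernel off the audio thread, then publish it.
    // A kernel that was staged but never adopted is freed here.
    void stage(std::unique_ptr<Kernel> next) {
        delete m_staged.exchange(next.release(), std::memory_order_acq_rel);
    }

    // Audio thread: adopt the staged kernel if there is one.
    // No locks, no allocation, no deallocation.
    bool adopt() {
        // Skip the swap until the control thread has collected the previous
        // kernel, so nothing ever has to be freed on the audio thread.
        if (m_retired.load(std::memory_order_acquire) != nullptr) return false;
        Kernel* next = m_staged.exchange(nullptr, std::memory_order_acq_rel);
        if (next == nullptr) return false;
        m_retired.store(m_active, std::memory_order_release);
        m_active = next;
        return true;
    }

    // Control thread: free the kernel the audio thread stopped using.
    void collect() {
        delete m_retired.exchange(nullptr, std::memory_order_acq_rel);
    }

    // Audio thread only.
    const Kernel* active() const { return m_active; }

private:
    std::atomic<Kernel*> m_staged{nullptr};
    std::atomic<Kernel*> m_retired{nullptr};
    Kernel* m_active = nullptr;
};
```

In this sketch the audio callback would call adopt() once at the top of each block, while the control thread calls stage() after building a kernel and collect() periodically. If the retired kernel has not been collected yet, the swap is simply postponed to a later block.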
Linear Phase EQ Transitions
Previous: single OLA instance → 29 ms of silence during kernel swap.
Current: double-buffered OLA with:
- FDL warmup counter (m_numKernelPartitions + 1 partitions before crossfade)
- 128-sample equal-power crossfade (fade-in gain sqrt(t), fade-out gain sqrt(1-t))
- Batch kernel build (1 FFT per preset change, was 10)
The +1 warmup ensures that the overlap-add buffer used for the first valid output was computed from a fully populated frequency-domain delay line (FDL), which eliminates the residual pop from contaminated overlap data.
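To make the warmup count and the crossfade gains concrete, here is a rough sketch. `WarmupGate` and `equalPowerCrossfade` are hypothetical names, and the way t is stepped across the fade is an assumption; only the partition count (kernel partitions + 1) and the sqrt(t) / sqrt(1-t) gains come from the description above.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical warmup gate: the new OLA instance must process
// numKernelPartitions + 1 partitions before the crossfade may start.
struct WarmupGate {
    explicit WarmupGate(int numKernelPartitions)
        : remaining(numKernelPartitions + 1) {}

    // Call once per processed partition; returns true once warmed up.
    bool partitionProcessed() {
        if (remaining > 0) --remaining;
        return remaining == 0;
    }

    int remaining;
};

// Equal-power crossfade: out = sqrt(1 - t) * oldOut + sqrt(t) * newOut,
// with t running from 0 to 1 across the fade. The squared gains sum to 1.
inline void equalPowerCrossfade(const float* oldOut, const float* newOut,
                                float* out, std::size_t length) {
    for (std::size_t i = 0; i < length; ++i) {
        const float t = (length > 1)
            ? static_cast<float>(i) / static_cast<float>(length - 1)
            : 1.0f;
        out[i] = std::sqrt(1.0f - t) * oldOut[i] + std::sqrt(t) * newOut[i];
    }
}
```

With length = 128, the first output sample comes entirely from the old instance and the last entirely from the new one, so the handover is continuous at both ends.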
Correctness Audit
Full static analysis across 8 categories (thread safety, playback logic, DSP state, error handling, audio engine, UI, Apple Music integration, database). Results:
- 5 critical fixes (use-after-free, data race, unguarded destruction)
- 8 high fixes (gapless race conditions, shuffle/skip desync, device disconnect)
- 27 medium fixes (atomic upgrades, channel guards, DSD validation, FTS5 injection)
- 21 low fixes (defensive null checks, RAII, edge cases)
Key fix: shuffle/repeat skip used QTimer::singleShot(0) to defer track loading, creating a race window where the gapless pre-load would fire before the new track loaded. Solved with an atomic m_userSkipPending flag that suppresses stale gapless transitions during user-initiated skips.
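The flag logic can be illustrated with a small sketch. Only `m_userSkipPending` is named above; the class and the other method names are hypothetical, and the Qt event-loop plumbing is left out to keep the example self-contained.

```cpp
#include <atomic>

// Hypothetical controller; only m_userSkipPending is named in the post.
class PlaybackController {
public:
    // UI thread: the user pressed next/previous; the track load is deferred.
    void onUserSkip() {
        m_userSkipPending.store(true, std::memory_order_release);
    }

    // Runs when the deferred load of the user-selected track finishes.
    void onTrackLoaded() {
        m_userSkipPending.store(false, std::memory_order_release);
    }

    // Gapless path: returns true if the pre-loaded track may take over.
    // A transition prepared before the skip is stale and gets dropped.
    bool tryGaplessTransition() {
        if (m_userSkipPending.load(std::memory_order_acquire)) return false;
        ++m_gaplessTransitions;
        return true;
    }

    int gaplessTransitions() const { return m_gaplessTransitions; }

private:
    std::atomic<bool> m_userSkipPending{false};
    int m_gaplessTransitions = 0;
};
```

The point of the flag is that the window between the skip request and the deferred load is covered explicitly, instead of relying on event ordering.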
New Features
- AIFF/AIF playback
- XSPF playlist import/export
- ISO 226 equal loudness compensation (3 presets)
Download: https://soranaflow.com
Feedback welcome — especially from anyone running measurements or comparing against other players.