How to experience 7.2.4 on Windows?

MCH · Dec 2, 2022

Sorry in advance for a potentially very stupid question.
I understand Cavern is for converting content you might have on a disc or on your computer (?).
Is there a way to decode the 5.1 DD+ from, say, Netflix on the fly while you watch, and get it out of the computer via a multichannel card to your amps without an AVR?
Thanks

VoidX · Dec 2, 2022

@goodkeys, the upmixer is a standard matrix-based 5.1, everyone's doing the same thing. Height generation is something that's done better by others, and DTS can even separate speech to the center entirely, that's only found in AVRs. If upmixing is a must, you can't skip an AVR.

MarcosCh said:
on the fly

It's pre-render only currently. Real-time solutions are in the works.

MCH · Dec 2, 2022

VoidX said:
It's pre-render only currently. Real-time solutions are in the works.

This is great, will be waiting for it. Thanks!

tifune · Dec 2, 2022

VoidX said:
Height generation is something that's done better by others

Can you elaborate at all? Height channels are a confusing, almost black box, subject to me. I've always wondered how the signals are generated by upmixers

VoidX · Dec 2, 2022

tifune said:
Can you elaborate at all? Height channels are a confusing, almost black box, subject to me. I've always wondered how the signals are generated by upmixers

There are three possible versions:
- Matrixing (Dolby, Auro): upmixing purely mathematically, not necessarily by adding/subtracting channel signals, but doing the same with multiple subbands per channel. A subband is not frequency-dependent, the exact method used is called QMFB.
- Height generation (Cavern, maybe QuantumLogic): height generation is done per channel, not with a single formula, but by some metric of the mono signal. In Cavern's case, this is simply the pitch, smoothed out in time.
- Neural networks (DTS): an NN is a fancy way to brute-force a method for a problem science can't really solve yet. Upmixing is one of these problems, since we have no way of assigning sound cues to the camera work. An NN just has to learn enough about the problem, and it will figure out something. The result doesn't really mean anything to the devs, so they also don't know the exact method their NN found. Teaching an NN means you have lots and lots of sources (5.1/7.1 tracks) and targets (Atmos/DTS:X tracks); it tries to figure out a graph with weights by guessing lots of solutions, and it keeps the best. You can give this graph a 5.1/7.1 input, it does its magic, and outputs a full spatial mix.
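
To make the second option more concrete, here is a minimal sketch of "some metric of the mono signal, smoothed out in time". It is not Cavern's code: the zero-crossing rate stands in as a crude pitch proxy, and `zero_crossing_rate`, `height_track`, `block_size` and `smoothing` are all names and values made up for illustration.

```python
import math

def zero_crossing_rate(block):
    """Crude pitch proxy (an assumption): share of neighbouring samples that change sign."""
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return crossings / max(len(block) - 1, 1)

def height_track(samples, block_size=256, smoothing=0.8):
    """Return one smoothed height value in [0, 1] per block of a mono channel."""
    heights = []
    smoothed = 0.0
    for start in range(0, len(samples) - block_size + 1, block_size):
        raw = zero_crossing_rate(samples[start:start + block_size])
        # One-pole smoothing so the height does not jump from block to block.
        smoothed = smoothing * smoothed + (1.0 - smoothing) * raw
        heights.append(smoothed)
    return heights

def sine(freq, rate=48000, count=4096):
    """Test tone generator, only here to exercise the sketch."""
    return [math.sin(2 * math.pi * freq * n / rate) for n in range(count)]

low = height_track(sine(200))    # low-pitched content
high = height_track(sine(4000))  # high-pitched content ends up "higher"
```

Each value could then drive how much of that channel is sent to the height speakers; that mapping is left out because the thread doesn't describe it.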

goodkeys · Dec 2, 2022

Thank you @VoidX! That's good to know.

orangezero · Jan 4, 2023

Just curious if anyone on here has done any testing with Cavern on Windows? I was considering adding some upper speakers, but I'm not sure it is possible. I was hoping to avoid an AVR and just use some digital amplifier boards. (I'm not sure I have enough Atmos material to worry about it right now.)

Doodski · Jan 4, 2023

orangezero said:
I was hoping to avoid an AVR and just use some digital amplifier boards.

What do you mean by digital amp boards?

tifune · Jan 4, 2023

VoidX said:
There are three possible versions:
- Matrixing (Dolby, Auro): upmixing purely mathematically, not necessarily by adding/subtracting channel signals, but doing the same with multiple subbands per channel. A subband is not frequency-dependent, the exact method used is called QMFB.
- Height generation (Cavern, maybe QuantumLogic): height generation is done per channel, not with a single formula, but by some metric of the mono signal. In Cavern's case, this is simply the pitch, smoothed out in time.

Why did you choose to go with height gen. instead of matrixing? Resource constraints (time, licensing, $$)? Or is there some other benefit?

Re: QMFB, I've tried to read about it but can't find anything entry-level. Can you recommend something, or maybe give some examples of subband criteria? TY for shedding some light on this topic!!

VoidX · Jan 4, 2023

tifune said:
Why did you choose to go with height gen. instead of matrixing?

Upmixing by itself is a problem that is mathematically impossible. Let's say your front left has a value of 4, your front height is 2, and added together (downmixed), it's 6. Your task would sound like this in math: the result is 6, what numbers did I add together? There are infinite solutions, and only one is right. Matrix encoding is a hack around this, but it requires content to be specifically made for your solution. It means the equations are already known: for example, your left front source after mixing would sound from the left front at 50% volume, 25% from the center, and 25% from the left side. This is a bad example, but it illustrates the problem nicely. This can be done in reverse, but other channels are also mixed into your outputs, so a huge crosstalk will always be present. My guess is that Auro chose the height speakers instead of top ones not because they're better, but because this hides the fact that speech can't be contained in the center with their method: a considerable chunk of it will end up in the center height. Since I have no say in the authoring tools, the Auro equations are unknown (they can be reverse engineered very easily, but I don't have any kind of decoder), and Dolby is varying these equations on the fly (metadata), matrixing is a no-go.
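
Both halves of that argument fit in a few lines. This is a sketch only: the -3 dB coefficient `G` and the three-into-two layout are a textbook-style passive matrix picked for illustration, not Dolby's or Auro's actual equations, and `encode`/`decode`/`downmix` are hypothetical names.

```python
import math

G = math.sqrt(0.5)  # -3 dB, an illustrative coefficient (assumption)

def downmix(front, height):
    """A plain downmix: the two inputs are simply added."""
    return front + height

def encode(left, center, right):
    """Fold three channels into two with known equations."""
    lt = left + G * center
    rt = right + G * center
    return lt, rt

def decode(lt, rt):
    """Passive decode: run the known equations in reverse as far as possible."""
    return lt, G * (lt + rt), rt

# Many different (front, height) pairs collapse to the same downmixed value.
same_sum = [(4, 2), (5, 1), (0, 6)]
sums = {downmix(f, h) for f, h in same_sum}  # a single value: 6

# A source that was only in the left channel...
decoded = decode(*encode(1.0, 0.0, 0.0))
# ...comes back with part of it in the decoded center: that is the crosstalk.
```

Here `decoded` is `(1.0, 0.7071..., 0.0)`: the left channel survives, but roughly 71% of it also shows up in the center, even though the original center was silent.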

Matrixing is a great solution for transmitting 5.1 in 2.0, but that's the most it can carry with good enough crosstalk. The LFE can obviously be included straight in the other channels after gain-matching (IMAX is using this in practice, they have no LFE channel, just a crossover), the center is usually done with this method, and the surrounds are phase-matrixed with a Hilbert transform. There are 4 ways to contain a signal in 2 channels with + or - signs, and all of them are in use. Since there are no other variations, all possibilities are exhausted, and other channels can only be added with way larger crosstalk.

tifune said:
QMFB, I've tried to read about it but can't find anything entry-level. Can you recommend something, or maybe give some examples of subband criteria?

It's simply a band selection algorithm. It has a formula that separates out a part of a signal defined by some characteristics (not just by frequency). The metadata contains how much of each band of each input channel is used for one output channel, so it's basically a dynamic matrix applied after the source signal has been disassembled in a known way.
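
The "dynamic matrix" part can be sketched like this, assuming the band split has already happened (the split itself is not shown, and the numbers are made up). `render_output`, `bands` and `weights_height` are hypothetical names; in a real decoder the weights would come from the metadata and change over time.

```python
def render_output(bands, weights):
    """Mix one output channel from subbands of every input channel.

    bands[channel][band] is a list of samples for that subband.
    weights[channel][band] is how much of that subband goes to this output.
    """
    length = len(bands[0][0])
    out = [0.0] * length
    for channel_bands, channel_weights in zip(bands, weights):
        for band, weight in zip(channel_bands, channel_weights):
            for n in range(length):
                out[n] += weight * band[n]
    return out

# Two input channels, each already split into two subbands (toy data).
bands = [
    [[1.0, 1.0], [0.5, -0.5]],  # input 0: band 0, band 1
    [[0.0, 2.0], [1.0, 1.0]],   # input 1: band 0, band 1
]

# Weights for one output (say, a height channel): only band 1 is used.
weights_height = [[0.0, 1.0], [0.0, 0.5]]

height = render_output(bands, weights_height)  # [1.0, 0.0]
```

One such weight table per output channel, updated as the metadata changes, is all the "matrix" amounts to in this simplified picture.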

orangezero · Jan 6, 2023

Doodski said:
What do you mean by digital amp boards?

Class D amplifier boards like the ones sold at Parts Express are what I meant. From my recent reading, class D amp boards using TPA3255 chips may have some sync issues when used together. Something like a JAB5 has 4x100 W output and can link (and sync) to another board with is2, so that would get me 8 channels. I see Parts Express has a 6x100 W board.

knreddy · Jun 4, 2023

1) I have a PC with a 5.1 channel sound card
2) An Onkyo 5.1 AV receiver with direct analog aux inputs (6 channels) is available
(It has no HDMI input or output, and no Dolby Atmos decoder.)
Can I use the above setup as 3.1.2 (similar to current 3.1.2 Dolby Atmos soundbars or the Pioneer vx326 5.1 Dolby Atmos capable AV receiver) with Cavern?
