JSmith
Master Contributor
That is unnerving and hilarious all at once... rather disturbing in many ways.
There is a slight chance that they live a little too close to Three Mile Island . . .
Interesting. Where has the training data come from? If it's based on general internet audio discussions, it's likely to be more biased towards subjective opinion than objective measurement. If the training data came exclusively from academia, ASR and competent engineering companies, it would be less likely to focus on sound and more likely to pick up on reproducibility of analysis.
We didn't train the models ourselves. I don't know what data Meta uses to train Llama, but you may be right that training the model yourself could improve results.
Not sure if we have enough time to gather data that can be used for training, then train the model and implement a first version for summaries.
That's the experience of my colleagues (in a rather huge example). The internet training data turns out to have lots of nasty systemic bias in it, as you would expect.
I would think that a system trained on internet data is likely to have even more cognitive bias than a troll turning up on ASR insisting they can hear a night and day difference between DACs, cables and op-amps and their wife in the kitchen agrees.
Have you considered fine-tuning these models? I've heard this gives big improvements, especially if you are using the smaller 7B models.
The problem with training a model ourselves is time and computing power/money. Training a model is something we won't be able to do.
My feedback:
Mistral 7B
- Falsely states that PA5 II is "Class A"
- Makes up stuff about "warm and detailed sound" which wasn't stated by the reviewer
- Incorrectly talks about issues with durability, which do not relate to the amplifier discussed in the review, but the predecessor
- Well structured but very superficial and broadly generalized summary with very little substance
Llama 3
- Uses annoying platitudes like "a popular choice among audiophiles"
- Makes up stuff about "smooth midrange, and tight bass"
- "Summary" and "Key Points" are very repetitive
- States "I did not lose much information when summarizing the thread." ???
- Summary of the "Tower vs Bookshelf Speakers" thread looks broadly correct, but also contains many repetitions
Overall, I don't think the current state gives usable results. If "forced to choose", I would prefer Mistral 7B, but with shorter outputs and some corrections to the obvious errors. Overall, the summaries appear very superficial and disregard most of the relevant "hard facts" from the main (first) review post of each thread. I think apart from integrating data from the images, there needs to be some directive to summarize the review post first and then add a summary of the discussion in a second step.
As already mentioned by others in the thread, one important limitation seems to be the context size. Could something like YaRN be helpful here?
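The "review post first, discussion second" directive from the feedback above could be sketched roughly like this. It is only an illustration: `summarize_thread`, the `llm` callable and the prompt wording are all hypothetical, and nothing here is taken from the project's actual code.

```python
from typing import Callable, List


def summarize_thread(posts: List[str], llm: Callable[[str], str]) -> str:
    """Summarize the first (review) post, then the replies, in two steps.

    `llm` is a placeholder for whatever function sends a prompt to the
    model (Mistral 7B, Llama 3, ...) and returns its text output.
    """
    if not posts:
        return ""
    review, replies = posts[0], posts[1:]

    # Step 1: the review post on its own, so its hard facts are not
    # drowned out by hundreds of replies.
    review_summary = llm(
        "Summarize this review. Keep the hard facts (product name, price, "
        "measurements) and do not add claims that are not in the text.\n\n"
        + review
    )
    if not replies:
        return review_summary

    # Step 2: the discussion, with the review summary passed in so the
    # model can avoid repeating it.
    discussion_summary = llm(
        "The review was already summarized as:\n"
        + review_summary
        + "\n\nSummarize where the balance of opinions landed in these "
        "replies, without repeating the review summary.\n\n"
        + "\n---\n".join(replies)
    )
    return review_summary + "\n\n" + discussion_summary
```

Keeping the two calls separate also means the review summary can be checked against the first post on its own before any discussion text is involved.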
We didn't plan on fine-tuning one ourselves yet since we want to fix some other things first. Next step will be to run tests with improved context windows.
So if I got this right, summaries should be shorter, image data should be included and obviously mistakes need to be reduced. Thanks a lot, that's something we can work with.
We are already looking at models with an improved context window, so YaRN could come in handy. Will definitely give it a closer look.
I hope to be able to run a first test with a tuned model by Friday, but since we are both working on other classes as well, time is the limiting factor.
IMO summaries don't need to be super short, but they should avoid repetition, and should have low/no mistakes - the most important one to avoid is including ideas that weren't actually present in the threads. The threads that need summarization are often 100s of posts long, so even a full-page summary would be OK as long as it caught the key points at a useful level of detail. If someone could read an LLM thread summary, and the original post, and then participate meaningfully in the ongoing discussion based on that, it would be a real success for this use case IMO.
Avoiding repetition and hallucination/mistakes is definitely on the agenda. We tried something with usernames and had massive problems with hallucination there so this definitely needs to be fixed.
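One cheap guard against the hallucinated usernames mentioned above would be to compare the names a summary attributes statements to against the people who actually posted. The sketch below is an assumption-heavy illustration: `unverified_usernames` is a made-up helper, and it only catches the single-word "User <name>" phrasing, so multi-word names would need a different approach.

```python
import re
from typing import Iterable, Set

# Matches phrases such as "User Julf" and captures the name that follows.
USER_PATTERN = re.compile(r"\bUser\s+([A-Za-z0-9_.-]+)")


def unverified_usernames(summary: str, thread_authors: Iterable[str]) -> Set[str]:
    """Return names mentioned as "User <name>" that never posted in the thread."""
    known = {author.lower() for author in thread_authors}
    # Strip a trailing period picked up when the name ends a sentence.
    mentioned = {name.rstrip(".") for name in USER_PATTERN.findall(summary)}
    return {name for name in mentioned if name.lower() not in known}
```

A non-empty result does not prove a hallucination, but it flags a summary that deserves a second look or a regeneration.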
If I understand you correctly, by a full page summary you mean summarizing each page of a thread on its own and then showing the summary for each page, right?
No, sorry, I meant that a summary could be a full page in length (say 500+ words) and still be useful, since the underlying threads can be so long.
However, I guess if it helped with the token / context window difficulties, doing a very brief summary of each page of a thread could also be useful.
A combined summary of X sub-summaries could still be pretty handy.
Ahh okay, now I get what you mean, thanks for clarifying.
Yeah, the one page at a time approach is one option if there is no better way to fix the context window issue. Maybe not a single page every time, but a group of pages every time, and these get summarized. We will see if this is necessary or a possible solution in the next few days, I think.
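The "group of pages, then a combined summary" idea could look something like the sketch below. The names `group_pages` and `summarize_in_groups`, the `llm` callable, the prompts and the default group size are all assumptions made for illustration.

```python
from typing import Callable, List


def group_pages(pages: List[str], pages_per_group: int) -> List[List[str]]:
    """Split the thread's pages into consecutive groups of a fixed size."""
    if pages_per_group < 1:
        raise ValueError("pages_per_group must be at least 1")
    return [
        pages[i:i + pages_per_group]
        for i in range(0, len(pages), pages_per_group)
    ]


def summarize_in_groups(
    pages: List[str],
    llm: Callable[[str], str],
    pages_per_group: int = 3,
) -> str:
    """Summarize each group of pages, then merge the partial summaries."""
    partial = [
        llm("Summarize these forum pages briefly:\n\n" + "\n\n".join(group))
        for group in group_pages(pages, pages_per_group)
    ]
    if not partial:
        return ""
    if len(partial) == 1:
        return partial[0]
    # Second pass: one more call that only sees the short partial summaries,
    # so it fits into a small context window.
    return llm(
        "Combine these partial summaries into one summary without "
        "repeating points:\n\n" + "\n\n".join(partial)
    )
```

The group size is the knob to tune against the model's context window: larger groups mean fewer calls but longer prompts.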
Amazon summary of consumer reviews does that and it works well. It is usually 3 to 4 sentences. The first few are praises and the last are issues some have found (if any).
Lastly, they both just summarize what type of discussion happened in the thread, not what the prevailing opinions or conclusions might be. While this is valid, it's not very useful. What you'd want to know in a thread summary is where the balance of opinions landed, not only what topics the thread happened to touch on.
This is the main issue with the review, and demonstrates that the AI/LLMs did not understand the assignment. The hallucinations just magnified this problem.
Here is the extended summary:
The forum posts discuss the TP RA3 Rackmount Amplifier, a new product from Topping's sub-brand, which costs $229 and is available on Aliexpress. The amplifier has a remote control and SOTA resistor-based attenuation.
User amirm shares their review of the amplifier, including measurements, and notes that it looks great with a large and highly visible volume level. Another user, Music1969, asks about channel imbalance, but amirm responds that they only have this version and don't measure channel imbalance for power amps as it's not typically an issue.
Paco De Lucia shares their positive experience with the amplifier, using it in a bi-amp setup without any problems. BeerBear suggests that an electronic/digital volume control would be nice to achieve perfect channel balance and avoid scratching.
Antcollinet asks about pairing the amplifier with a DAC that has remote control. Music1969 brings up the issue of channel imbalance again, referencing a previous review of the Topping PA3S, which showed channel balance issues at low volumes. They ask amirm to measure the channel imbalance of the RA3.
Howard416 identifies the amplifier's chip as the TPA3251. Talisman is cautious about voting for the product due to concerns about reliability, citing past issues with Topping products. Yoku-San suggests that if the amplifier proves reliable, it could be part of an interesting entry-level combination with other affordable components.
Additionally, there was a discussion about the slew rate of the amplifier, with Antcollinet mentioning that it's important for a power amp to have a high slew rate to handle dynamic music. He calculates the slew rate based on the chip's specifications and concludes that it should be able to handle most music without issues.
Here is the extended summary:
The forum discussion revolves around two main topics: the ability of audio equipment to reproduce signals accurately and a specific amplifier product review.
In the first part of the discussion, users debate about the speed at which an audio device can reproduce signals within the human audible range (20Hz to 20kHz). User Julf suggests calculating the slew rate by multiplying the full power output voltage with the upper frequency limit. However, User Maxon insists on knowing the specific parameter of slew rate. The discussion continues with corrections and clarifications on how to calculate the slew rate accurately.
In the second part of the discussion, users review and discuss the TP RA3 Rackmount Amplifier, a new product from Topping's sub-brand. The amplifier has a remote control and SOTA resistor-based attenuation. Users share their positive experiences with the amplifier, including measurements and usage in bi-amp setups. Some users raise concerns about channel imbalance and reliability issues, referencing previous reviews of similar products. Others suggest potential pairings with DACs that have remote control and discuss the importance of perfect channel balance.
Overall, the discussion covers technical aspects of audio equipment performance and user experiences with a specific amplifier product.
Here is the extended summary:
The forum discussion revolves around audio equipment and speaker systems. The conversation starts with User cathodeb Wall sharing an unusual recording experience using a dummy head mic and seeking opinions on the matter.
User Emlin humorously remarks that it might depend on how much time one is willing to devote to listening to car Doppler effects, implying that the topic may be niche or require significant dedication.
The conversation then shifts to a discussion about speakers and subwoofers. User sigbergaudio mentions that there are few speakers that can rival a good subwoofer in terms of performance, citing the Revel Salon 2 as an example. They argue that even high-end speakers like the Salon 2 may not be able to produce meaningful output at low frequencies (20Hz), and that adding a subwoofer can enhance the overall sound quality.
User simple6 asks about the importance of listening distance in choosing between tower speakers and bookcase speakers, wondering if their bookcase speakers would be sufficient for a longer listening distance. They express concern about potential distortion at high SPL (sound pressure levels) and whether their two subs could provide enough power to mitigate this issue.
User tuga chimes in with several comments. Firstly, they suggest that the baffle design of floorstanders might affect vertical dispersion, potentially reducing perceived spaciousness, but note that this effect would likely be minimal. They also mention that visual cues, such as the physical presence of floorstanders, could influence one's perception of the sound.
In another comment, User tuga clarifies that the Revel Salon 2 is a rare 4-way speaker with dedicated low- and sub-bass sections, which allows it to produce full-range sound with quality.
Finally, User tuga shares their personal experience with reference speakers, including the B&W F801 and TAD Reference 1, stating that even these high-end speakers could benefit from the addition of a pair of subs or a "swarm" (a large number of smaller subwoofers).
Additionally, Grandeur suggests using a "swarm" of small subwoofers to enhance sound quality. User tuga also recommends considering room acoustics and speaker placement when setting up a home audio system.
Furthermore, the discussion touches on the importance of proper subwoofer setup and calibration, as well as the potential benefits of using multiple subs in a home audio system. Grandeur shares their experience with using two subs to create a more immersive listening experience.
The conversation also explores the topic of room acoustics and how it affects sound quality. User tuga emphasizes the importance of considering room dimensions, speaker placement, and acoustic treatment when setting up a home audio system.
Lastly, the discussion delves into the world of subwoofer setup and calibration, with Grandeur sharing their experience with using the "swarm" technique to enhance sound quality.
Here is the extended summary:
The conversation revolves around the importance of subwoofers in enhancing sound quality and the role they play in speaker systems. The discussion starts with an unusual recording experience using a dummy head mic, but quickly shifts to a debate about speakers and subwoofers. Users argue that even high-end speakers may not be able to produce meaningful output at low frequencies, and that adding a subwoofer can significantly enhance the overall sound quality.
The importance of listening distance is also discussed, with users wondering if bookcase speakers would be sufficient for longer listening distances and whether two subwoofers could mitigate potential distortion at high SPL. The baffle design of floorstander speakers and its effect on vertical dispersion are also mentioned, as well as the influence of visual cues on one's perception of sound.
Several users share their personal experiences with high-end speakers, including the Revel Salon 2, B&W F801, and TAD Reference 1, and believe that even these exceptional speakers could benefit from the addition of a pair of subwoofers or a "swarm" system. Additionally, users discuss the benefits of adding a subwoofer to their systems, citing examples such as the Revel Salon 2, which can produce full-range sound with quality.
The conversation also touches on the topic of audio equipment and speaker systems more broadly, with users sharing their experiences and opinions on various aspects of sound reproduction.
Here is the extended summary:
The forum posts discuss the TP RA3 Rackmount Amplifier, a new product from Topping's sub-brand, which costs $229 and is available on Aliexpress. The amplifier has a remote control and SOTA resistor-based attenuation.
Clearly, the thread is not about the RA3. The price is also incorrect and seems to be taken from the PA3 review. The stuff about the remote and the resistors is equally unrelated/wrong.
User amirm shares their review of the amplifier, including measurements, and notes that it looks great with a large and highly visible volume level. Another user, Music1969, asks about channel imbalance, but amirm responds that they only have this version and don't measure channel imbalance for power amps as it's not typically an issue.
There is nothing about the volume level (display) in the review, just about a large volume knob. The "only have this version" is not related to the channel imbalance post, but the rest of the sentence is correct.
Paco De Lucia shares their positive experience with the amplifier, using it in a bi-amp setup without any problems. BeerBear suggests that an electronic/digital volume control would be nice to achieve perfect channel balance and avoid scratching.
Antcollinet asks about pairing the amplifier with a DAC that has remote control. Music1969 brings up the issue of channel imbalance again, referencing a previous review of the Topping PA3S, which showed channel balance issues at low volumes. They ask amirm to measure the channel imbalance of the RA3.
I don't think anybody asked amir to measure the channel imbalance of the RA3 in that thread. The rest looks broadly correct.
Howard416 identifies the amplifier's chip as the TPA3251. Talisman is cautious about voting for the product due to concerns about reliability, citing past issues with Topping products. Yoku-San suggests that if the amplifier proves reliable, it could be part of an interesting entry-level combination with other affordable components.
Looks broadly correct.
Additionally, there was a discussion about the slew rate of the amplifier, with Antcollinet mentioning that it's important for a power amp to have a high slew rate to handle dynamic music. He calculates the slew rate based on the chip's specifications and concludes that it should be able to handle most music without issues.
Somewhat mixed up, but still close.
Here is the extended summary:
The forum discussion revolves around two main topics: the ability of audio equipment to reproduce signals accurately and a specific amplifier product review.
I don't think that this is a fitting summary of those 45+ pages.
In the first part of the discussion, users debate about the speed at which an audio device can reproduce signals within the human audible range (20Hz to 20kHz). User Julf suggests calculating the slew rate by multiplying the full power output voltage with the upper frequency limit. However, User Maxon insists on knowing the specific parameter of slew rate. The discussion continues with corrections and clarifications on how to calculate the slew rate accurately.
The slew rate stuff is really only present on the last two pages, as far as I could see. In a three paragraph summary, it probably shouldn't be mentioned at all.
In the second part of the discussion, users review and discuss the TP RA3 Rackmount Amplifier, a new product from Topping's sub-brand. The amplifier has a remote control and SOTA resistor-based attenuation. Users share their positive experiences with the amplifier, including measurements and usage in bi-amp setups. Some users raise concerns about channel imbalance and reliability issues, referencing previous reviews of similar products. Others suggest potential pairings with DACs that have remote control and discuss the importance of perfect channel balance.
Overall, the discussion covers technical aspects of audio equipment performance and user experiences with a specific amplifier product.
There is a lot of discussion around the RA3 and what parts are similar or different, but the thread overall is still a review of the PA5 II and the summary fails to address this. It also doesn't mention amir's main review post.
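For reference, the slew-rate rule of thumb that the summary paraphrases as "multiplying the full power output voltage with the upper frequency limit" is, for a sine wave, usually written with an extra factor of 2π. The numbers in the worked line are assumed purely for illustration (20 V peak at 20 kHz) and are not taken from the thread or from any amplifier's datasheet.

```latex
% Steepest slope of v(t) = V_peak * sin(2*pi*f*t) occurs at the zero crossing:
\[
  \mathrm{SR}_{\min} = \max_t \left| \frac{dv}{dt} \right| = 2\pi f \, V_{\text{peak}}
\]
% Illustrative values only: f = 20 kHz, V_peak = 20 V
\[
  \mathrm{SR}_{\min} = 2\pi \cdot 20\,000\ \mathrm{Hz} \cdot 20\ \mathrm{V}
  \approx 2.5 \times 10^{6}\ \mathrm{V/s} \approx 2.5\ \mathrm{V/\mu s}
\]
```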