duped | Hacker News

Comment by duped | original | FFmpeg 9.1's new AAC encoder

[−]duped · 2026-07-01 Wed 17:44 UTC · link

> Higher sample rates than 48kHz only needed when you want to pitch down ultrasonic recordings (of whales, bats and other such animals for example).

There are numerous use cases for higher sample rates that go beyond this but it's hard to talk about it without starting flame wars filled with junk science.

[−]zamadatix · 2026-07-01 Wed 17:53 UTC · link

Say it or don't but "I have evidence otherwise but don't think I should say" is just as bad a flame war gateway as tempting the junk science audiophiles directly.

[−]skydhash · 2026-07-01 Wed 18:03 UTC · link

I know that with oscilloscopes, it’s recommended to use 5x instead of Nyquist 2x of the highest frequency you want to use, but the most reasonable argument I’ve heard for higher than 48kHz sampling is digital audio effects.

But for the end result 48kHz is more than necessary. I can’t even hear any frequency above 17kHz.

[−]dcrazy · 2026-07-01 Wed 18:15 UTC · link

Yes, sample-rate headroom is very useful in audio production to avoid aliasing. Pro DAWs support 96kHz.
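
A rough sketch of the aliasing concern with digital effects, assuming an idealized effect that adds harmonics to a 10 kHz tone with no oversampling or band-limiting. The function name `alias_frequency` and the example tone are illustrative and not taken from any DAW. Any harmonic above half the sample rate folds back into the band below it:

```python
def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a tone lands after sampling (folding around Nyquist)."""
    remainder = freq_hz % sample_rate_hz
    return min(remainder, sample_rate_hz - remainder)

# Hypothetical case: a nonlinear effect adds harmonics to a 10 kHz tone.
harmonics_hz = [20_000, 30_000, 40_000]

at_48k = [alias_frequency(f, 48_000) for f in harmonics_hz]
at_96k = [alias_frequency(f, 96_000) for f in harmonics_hz]

print(at_48k)  # where the harmonics end up when processing at 48 kHz
print(at_96k)  # where they end up when processing at 96 kHz
```

Under these assumptions the 30 and 40 kHz harmonics land at 18 and 8 kHz when processing at 48 kHz, inside the audible band, while at 96 kHz they stay ultrasonic and can be filtered out before downsampling.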

[−]adgjlsfhk1 · 2026-07-01 Wed 21:14 UTC · link

Yeah, for real-time signals a higher sampling frequency makes sense (very briefly, before you FFT and kill the high frequencies), but for stored signals Nyquist is king.

[−]Aurornis · 2026-07-01 Wed 21:47 UTC · link

> I know that with oscilloscopes, it’s recommended to use 5x instead of Nyquist 2x of the highest frequency you want to use.

For capturing analog signals, 2.5X is enough headroom.

The 5X recommendation is probably for digital signals where the frequency refers to the baud rate, not the highest frequency coming through. A fast switching digital signal will have components with higher bandwidth than the fundamental. Using a higher multiple of samples (assuming the bandwidth is there) will let you see the shape of the waveform and rise and fall times better.
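
To illustrate the point about components above the fundamental, here is a minimal sketch assuming an ideal square wave (odd harmonics only) and ignoring the scope's analog front-end bandwidth. The function name and the 1 MHz fundamental are made up for the example. It lists which harmonics sit below Nyquist for a given sample rate:

```python
def captured_odd_harmonics(fundamental_hz, sample_rate_hz):
    """Odd harmonic numbers of an ideal square wave that lie below Nyquist."""
    harmonics = []
    k = 1
    # k * f0 < fs / 2, written with integers to avoid rounding issues
    while 2 * k * fundamental_hz < sample_rate_hz:
        harmonics.append(k)
        k += 2
    return harmonics

f0 = 1_000_000  # hypothetical 1 MHz square wave

at_2_5x = captured_odd_harmonics(f0, 2_500_000)
at_10x = captured_odd_harmonics(f0, 10_000_000)
at_20x = captured_odd_harmonics(f0, 20_000_000)

print(at_2_5x, at_10x, at_20x)
```

With only the fundamental captured, the square wave looks like a sine; the edges only start to take shape once several odd harmonics fit below Nyquist.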

[−]atoav · 2026-07-02 Thu 05:12 UTC · link

> But for the end result 48kHz is more than necessary. I can’t even hear any frequency above 17kHz

And even if you could, would the frequencies that all humans lose with age really be all that essential for the enjoyment of music? We are talking about frequencies most instruments won't even produce unless severely abused.

For some reason, in audiophile-land the magic is always in some elusive outer realms and never right there where the important stuff happens. They spend a fortune on speaker cables, while often not giving a second thought to room acoustics beyond the cosmetic. The magic sparkle is all the way in the ultrasonic, while their listening spaces have deep nulls in the mid-range due to comb filtering from reflective surfaces caused by a lack of acoustic treatment.

I love music (enough to have mixed it for a living) and to me it is very clear how the priorities are ordered when it comes to audio fidelity:

1. Room Acoustics

2. Speakers

3. Electronics & Digital

Going from the back: assuming you don't buy the cheapest of the cheap and don't abuse the gear by making it do things it wasn't built for, electronics and digital audio nowadays are transparent. That means they essentially sound the same if operated within spec. Even a 0,50 € IC will have distortion figures so staggeringly low that they are below human perception, and equipment is getting better still. A decent opamp can have distortion figures like 0.005 % THD with a linear frequency response all the way up to radio frequencies. There can be challenges with driving very weird speakers or headphones, but if you have the right combination of gear it doesn't have to be expensive to be indistinguishably good in its audio performance.

This means speakers are way more important than the electronics before them. Their distortion numbers are orders of magnitude higher (in the ballpark of 3% THD), their frequency response is inherently problematic (often many dB up and down even in expensive speakers), they will have different beaming characteristics at different frequencies, small speakers lack bass, placement is essential, etc. So getting good speakers is important.

But all of this is dwarfed by the impact of acoustics. The position of the speakers alone makes a huge difference. The impact of an acoustically untreated space is severe: you can get a completely smeared time response with deep nulls of 20dB and more while other frequencies are highly resonant. Even a budget speaker won't have problems of that magnitude.
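
For a sense of scale, here is a simplified sketch of comb filtering from a single reflection. It assumes one frequency-independent reflection, a speed of sound of 343 m/s, and made-up path differences and reflection gain; a real room has many reflections and is messier than this:

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 °C

def comb_null_frequencies(path_difference_m, count=3):
    """First few frequencies where the direct and reflected sound cancel."""
    delay_s = path_difference_m / SPEED_OF_SOUND_M_S
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]

def null_depth_db(reflection_gain):
    """Depth of the null when the reflection arrives at this relative gain."""
    return 20 * math.log10(1 - reflection_gain)

nulls_hz = comb_null_frequencies(0.50)  # reflection travels 50 cm further
moved_hz = comb_null_frequencies(0.55)  # same setup, path 5 cm longer
depth_db = null_depth_db(0.9)           # reflection at 90% of direct level

print([round(f) for f in nulls_hz])
print([round(f) for f in moved_hz])
print(round(depth_db, 1))
```

In this toy model a 50 cm path difference puts nulls near 343, 1029 and 1715 Hz, a reflection at 90% of the direct level makes them about 20 dB deep, and lengthening the path by 5 cm moves the first null down by roughly 30 Hz.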

So get some ok electronics, even more ok speakers, but invest the bulk of the money/time into the setup of the room itself.

Many audiophiles have that priority list reversed. Room acoustics suck. You need to measure a lot, add ugly absorbers in inconvenient places, and place speakers where they work well acoustically rather than where they look nice and conserve space. There is no ideal solution and everything is a compromise. So buying a gold-plated HDMI cable and imagining the improvement appears to be the better option. Except that you might be doing it in a room where a positional difference of a few centimeters changes the frequency response at the listening position massively.

[−]duped · 2026-07-01 Wed 18:05 UTC · link

Higher sample rates are lower latency for the same block size and resampling is not "free" (pick 2: performance, aliasing, latency) so there can be advantages to working with audio archived at higher sample rates.

But all the advantages come down to professional or editing use cases. There's next to zero advantage to using it as a storage format for listening. Just like 24 bit audio (do you have an amp with 96dB SNR?).
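
The 96dB figure is roughly the range of 16-bit audio. As a back-of-the-envelope sketch, taking dynamic range as the ratio of full scale to one quantization step and ignoring dither, noise shaping, and the extra 1.76 dB usually quoted for a full-scale sine:

```python
import math

def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

range_16_bit = dynamic_range_db(16)
range_24_bit = dynamic_range_db(24)

print(round(range_16_bit, 1), round(range_24_bit, 1))
```

That works out to about 96 dB for 16 bits and about 144 dB for 24 bits, so the extra 8 bits only help playback if the rest of the chain has a noise floor below the 16-bit one.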

Just personally, I have seen little evidence (personally, professionally, or academically) that there is any advantage for lossless audio for consumer applications. For professional applications there are plenty, and it's endlessly tiring to convince people that "no, actually I need 96kHz for my use case."

Where the audiophiles have _some_ argument here is the design of reconstruction filters which I've heard alleged can perform better in the audible frequency range if the stop band is outside of it. But I have never personally tested this, nor cared enough to. But the theory is sound.

Whether or not it's perceptible depends on what you're measuring, though. In theory, there should be perceptual differences in sound localization if your DAC's reconstruction filter is at 24kHz vs 48kHz since it will change the group delay in a critical frequency region, where you'll get sound at >~2kHz arriving later at the lower sample rate. I think it would be extremely hard to test this though, because humans are really shitty at sound localization to begin with, and practically speaking most recorded material is processed to shit in that frequency range to intentionally decorrelate the channels for the perception of "width."

[−]amluto · 2026-07-01 Wed 20:50 UTC · link

> Higher sample rates are lower latency for the same block size

This is a truly bizarre statement. On the one hand, of course higher sampling rates are lower latency for the same block size measured in samples. But all sampling rates have (almost [0]) identical latency for the same block size measured in time, and lower sampling rates allow less computation for those shorter blocks.

[0] If you are concerned about needing to know future samples in order to calculate the actual signal amplitude at a time between samples, then (a) this matters less at higher sampling rates and (b) this is at most a small number of samples and we're talking about block sizes that presumably exceed, say, 5, so this isn't really a big deal.
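
The two readings of "same block size" come down to simple arithmetic. A minimal sketch, with function names invented for illustration and ignoring any extra buffering a real driver or device adds:

```python
def block_duration_ms(frames, sample_rate_hz):
    """Time covered by one block of the given length in frames."""
    return 1000.0 * frames / sample_rate_hz

def frames_for_duration(duration_ms, sample_rate_hz):
    """Block length in frames that covers the given time span."""
    return round(duration_ms * sample_rate_hz / 1000.0)

# Same block size in frames: doubling the rate halves the block duration.
ms_256_at_48k = block_duration_ms(256, 48_000)
ms_256_at_96k = block_duration_ms(256, 96_000)

# Same block size in time: the higher rate just needs twice the frames.
frames_at_96k = frames_for_duration(ms_256_at_48k, 96_000)

print(round(ms_256_at_48k, 2), round(ms_256_at_96k, 2), frames_at_96k)
```

So 256 frames is about 5.33 ms at 48 kHz and 2.67 ms at 96 kHz, while a 5.33 ms block at 96 kHz is 512 frames, twice the samples to process for the same latency.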

[−]duped · 2026-07-01 Wed 23:32 UTC · link

The unit of a block size is samples (frames, technically), not seconds. When configuring audio devices for playback you tune both sample rate and block size for latency. It used to be far more common to tune sample rate than block size alone for tracking. This is getting into the weeds of actual devices though.

Also to your point, this is why compliant peak meters use a mandatory 4x upsampling at 48k.

[−]Sesse__ · 2026-07-02 Thu 07:20 UTC · link

> Also to your point, this is why compliant peak meters use a mandatory 4x upsampling at 48k.

This isn't due to latency, it's because the true peak (in the analog waveform) could be between samples.
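
A small sketch of why the peak can fall between samples. It uses a Hann-windowed sinc interpolator as a stand-in for a real meter's oversampling filter; this is not the filter from any metering standard, and the names and test signal are illustrative. A sine at a quarter of the sample rate with a 45 degree phase offset never has a sample land on its crest:

```python
import math

def hann_sinc(t, half_width):
    """Windowed-sinc interpolation kernel, zero for |t| >= half_width."""
    if abs(t) >= half_width:
        return 0.0
    sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return sinc * window

def oversampled_peak(samples, factor=4, half_width=32):
    """Largest magnitude found after interpolating `factor` points per sample."""
    peak = 0.0
    # Skip the edges so every interpolated point has a full kernel of input.
    for n in range(half_width, len(samples) - half_width):
        for j in range(factor):
            t = n + j / factor
            value = sum(
                samples[k] * hann_sinc(t - k, half_width)
                for k in range(n - half_width + 1, n + half_width + 1)
            )
            peak = max(peak, abs(value))
    return peak

# Full-scale sine at fs/4, offset by 45 degrees: every sample is about ±0.707.
samples = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(256)]

sample_peak = max(abs(s) for s in samples)
true_peak_estimate = oversampled_peak(samples)

print(round(sample_peak, 3), round(true_peak_estimate, 2))
```

The plain sample peak reads about 0.707 (roughly 3 dB under full scale), while the 4x interpolated estimate comes out close to 1.0, which is what the reconstructed analog waveform actually reaches.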

[−]Dylan16807 · 2026-07-01 Wed 20:56 UTC · link

> Higher sample rates are lower latency for the same block size

And if your goal is latency, it makes far more sense to change the block size rather than the sample rate.

> But all the advantages come down to professional or editing use cases.

That sounds about right.

[−]toast0 · 2026-07-01 Wed 20:57 UTC · link

> Just personally, I have seen little evidence (personally, professionally, or academically) that there is any advantage for lossless audio for consumer applications

I think the advantage of lossless audio is for archival: rip once and archive as lossless. Then you can re-encode your library with the latest and greatest lossy encoders over time, or just use the lossless files if your player can manage it; CPU and storage are less of a limiting factor for players than 20 years ago.

I don't know how many people are actually managing their libraries these days though, so I dunno if it makes a huge difference.

[−]duped · 2026-07-02 Thu 02:00 UTC · link

I wouldn't call archiving a consumer application but I understand the point. Really it gets back to the word: fidelity. Some say it means "truth" but really it's Latin for faithful, or in the context of audio, perceptually identical (a faithful representation). Even among highly trained and skilled listeners, lossy codecs are faithful and imperceptible.

[−]jpc0 · 2026-07-02 Thu 06:16 UTC · link

Group delay is a poor argument.

Unless you also have a pretty decent monitoring system the group delay of the speakers isn't going to be consistent so the filters before them wouldn't matter all that much...

Even in that case I would have a hard time believing that any human in a blind test would be able to perceive a group delay of even 360 degrees above 2 kHz...

You are talking about sub-millisecond differences in the time frequency content arrives at the ears; just tilting your head slightly will have a greater impact...
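
The sub-millisecond figure can be checked with a quick conversion. This sketch reads the 360 degrees as a phase shift at a single frequency, which is an assumption on my part; the function name is made up for the example:

```python
def phase_shift_to_delay_ms(phase_deg, freq_hz):
    """Time shift corresponding to a phase shift at one frequency."""
    return 1000.0 * (phase_deg / 360.0) / freq_hz

delay_at_2k = phase_shift_to_delay_ms(360, 2_000)
delay_at_10k = phase_shift_to_delay_ms(360, 10_000)

print(delay_at_2k, delay_at_10k)
```

A full cycle is 0.5 ms at 2 kHz and 0.1 ms at 10 kHz, so anything above 2 kHz stays below half a millisecond under this reading.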