FFmpeg 9.1's new AAC encoder

[−]thisislife2 · 2026-07-01 Wed 14:15 UTC · link

Flagged for the wrong link.

[−]defrost · 2026-07-01 Wed 14:18 UTC · link

Hopefully they see this - there's still time to edit the submission link.

[−]ledoge · 2026-07-01 Wed 14:21 UTC · link

It doesn't let me edit the link, but I'm confused by what even happened here... I posted this from my phone and that wrong link doesn't show up in my clipboard history.

Link should be: https://hydrogenaudio.org/index.php/topic,129691.0.html

[−]defrost · 2026-07-01 Wed 14:24 UTC · link

Your options are:

* quick email to HN@ycombinator.com with a "Help Me please!!" and the link (mods can edit the link in and sideline (hide) these comments)

* Just live with the rotting fish head of public boo boo (we've all made mistakes, as the Dalek said whilst climbing down off the dustbin)

* I can kill the whole thing dead.

[−]dang · 2026-07-01 Wed 16:59 UTC · link

It's fixed now.

Our software follows redirs and somehow we got a 302 to our own IP. Perhaps it is someone's idea of a bot detector?

[−]Aachen · 2026-07-01 Wed 23:41 UTC · link

Unrelated: Hey, I sent hn@ycombinator.com two emails. One was May 6th, the other June 18th (UTC+2). The former's subject is "Broken prev/next links sometimes". In the latter, I've asked to let me know if it arrived. It didn't bounce so your email server has acknowledged receipt, but based on fast responses to previous emails and someone else mentioning randomly that you responded quickly to theirs in iirc early June, I'm starting to assume you're not seeing mine. I don't know how else to reach out than via an off-topic comment or a dummy submission or so. Is there a fallback mechanism to use when your email doesn't?

[−]defrost · 2026-07-02 Thu 02:54 UTC · link

Well, they don't read all the emails -or- respond to @dang, @mod, etc.

Your approaches are, ride a comment (as you did, 9 hours, no response, likely didn't read) or lean on a frequent flyer (the only privilege I have is increased ability to [dead] obvious spam not caught by filters - but it gets my emails seen)

I sent them an email in the past minute - Good Luck! (YMMV)

[−]HugoTea · 2026-07-01 Wed 17:12 UTC · link

>FFmpeg's AAC DEcoder is busted with regards to stereo PNS, and the bug may be in other AAC decoders too, so we work around it in the encoder. Since no other encoder used PNS, the bug was not found until now.

I don't know what PNS is, but I bet this has been bothering someone's niche use-case for 20 years

[−]mcoliver · 2026-07-01 Wed 17:25 UTC · link

https://www.audiolabs-erlangen.de/content/resources/aesCodin...

[−]dcrazy · 2026-07-01 Wed 18:20 UTC · link

Hah, this sounds like the audio equivalent of Netflix’s grain reconstruction.

[−]BoingBoomTschak · 2026-07-01 Wed 19:30 UTC · link

Netflix's or AV1's FGS?

[−]dcrazy · 2026-07-01 Wed 20:39 UTC · link

Netflix developed it as a member of AOM.

[−]lesscraft · 2026-07-01 Wed 18:49 UTC · link

The issue was twofold, on one hand, using TNS on top of PNS meant the noise that got inserted was shaped by TNS, which is nonsense since the decoder generated the noise, not the encoder. This made PNS explode. The second, biggest issue was that using PNS in combination with any stereo tools resulted in noise leaking in both channels equally, ruining stereo imaging. So the best and only thing to do was to enable PNS only if the band in both channels is noise (or is sufficiently non-tonal and masked).
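For illustration, the "both channels must be noise" rule could be sketched roughly as below. This is not the FFmpeg code: the `Band` fields, both thresholds, and the function names are hypothetical and exist only to show the shape of the decision.

```python
from dataclasses import dataclass


@dataclass
class Band:
    """Hypothetical per-band analysis result (not an FFmpeg structure)."""
    tonality: float  # assumed scale: 0.0 = pure noise, 1.0 = pure tone
    masked: bool     # True if the band sits below the masking threshold


# Assumed, illustrative thresholds.
PURE_NOISE_MAX = 0.1
NON_TONAL_MAX = 0.3


def is_noise_like(band: Band) -> bool:
    """A band qualifies if it is noise, or is non-tonal enough and masked."""
    if band.tonality <= PURE_NOISE_MAX:
        return True
    return band.tonality <= NON_TONAL_MAX and band.masked


def use_pns(left: Band, right: Band) -> bool:
    """Enable PNS for a stereo band only when both channels qualify."""
    return is_noise_like(left) and is_noise_like(right)


print(use_pns(Band(0.05, False), Band(0.05, False)))  # both channels are noise
print(use_pns(Band(0.05, False), Band(0.90, False)))  # right channel is tonal
```

The point of the `and` is that a band that is noise in only one channel is coded normally, so no decoder-generated noise can leak into the other channel.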

[−]superzazu · 2026-07-01 Wed 17:14 UTC · link

> The encoder was mainly optimized for 48Khz audio. Get over it. It's 2026, resampling is free, 48Khz is the standard. 44.1Khz will work, and so will 96Khz but use 48Khz if you want the best quality.

Is 48kHz really the standard nowadays?

[−]TheChaplain · 2026-07-01 Wed 17:19 UTC · link

48kHz has been the recommended setting with Premiere Pro as long as I can remember.

44.1kHz, isn't that what lameMP3 uses as default?

[−]williadc · 2026-07-01 Wed 17:44 UTC · link

It's what CDs use, so it would make sense for mp3 encoders to follow suit.

[−]asveikau · 2026-07-01 Wed 17:20 UTC · link

I know the opus codec assumes everything is 48kHz and will resample inputs to that.

[−]atoav · 2026-07-01 Wed 17:23 UTC · link

More or less. Streaming is often done with 48, video content has been 48 for a while now, so unless you still produce content for CDs it is the standard.

44100 Hz had reasons no longer really needed (storing audio in 3 samples per line in VHS: 490 lines × 3 samples × 30 GPS = 44100 sample/s).

Qualitywise both are more than enough and 99.99% of people would not be able to tell it apart in a blind test. Higher sample rates than 48kHz only needed when you want to pitch down ultrasonic recordings (of whales, bats and other such animals for example).

Aside from this higher than 48 kHz sample rates may have only downsides, like increased size and potential distortion in the ultrasonic frequency range that has sidebands in the audible range. Yet there is a persistent, but unscientific "more-is-better"-crowd in the HiFi-sector.

[−]duped · 2026-07-01 Wed 17:44 UTC · link

> Higher sample rates than 48kHz only needed when you want to pitch down ultrasonic recordings (of whales, bats and other such animals for example).

There are numerous use cases for higher sample rates that go beyond this but it's hard to talk about it without starting flame wars filled with junk science.

[−]zamadatix · 2026-07-01 Wed 17:53 UTC · link

Say it or don't but "I have evidence otherwise but don't think I should say" is just as bad a flame war gateway as tempting the junk science audiophiles directly.

[−]skydhash · 2026-07-01 Wed 18:03 UTC · link

I know that with oscilloscopes, it’s recommended to use 5x instead of Nyquist 2x of the highest frequency you want to use, but the most reasonable argument I’ve heard for higher than 48kHz sampling is digital audio effects.

But for the end result 48kHz is more than necessary. I can’t even hear any frequency above 17kHz.

[−]dcrazy · 2026-07-01 Wed 18:15 UTC · link

Yes, sample rate headroom is very useful for audio production to avoid aliasing. Pro DAWs support 96kHz.

[−]adgjlsfhk1 · 2026-07-01 Wed 21:14 UTC · link

yeah for real time signals higher frequency makes sense (very briefly before you fft and kill the high frequencies), but for stored signals nyquist is king.

[−]Aurornis · 2026-07-01 Wed 21:47 UTC · link

> I know that with oscilloscopes, it’s recommended to use 5x instead of Nyquist 2x of the highest frequency you want to use.

For capturing analog signals, 2.5X is enough headroom.

The 5X recommendation is probably for digital signals where the frequency refers to the baud rate, not the highest frequency coming through. A fast switching digital signal will have components with higher bandwidth than the fundamental. Using a higher multiple of samples (assuming the bandwidth is there) will let you see the shape of the waveform and rise and fall times better.

[−]atoav · 2026-07-02 Thu 05:12 UTC · link

> But for the end result 48kHz is more than necessary. I can’t even hear any frequency above 17kHz

And even if you could, would the frequencies that all humans lose with age really be all that essential for the enjoyment of music? We are talking about frequencies most instruments won't even produce unless severely abused.

For some reasons in audiophile-land the magic is always in some elusive outer realms and never right there where the important stuff happens. They spend a fortune on speaker cables, while often not giving a second thought on room acoustics beyond the cosmetic. The magic sparkle is all the way in the ultrasonic, while their listening spaces have deep nulls in the mid-range due to comb filtering from reflective surfaces caused by a lack of acoustic treatment.

I love music (enough to have mixed it for a living) and to me it is very clear how the priorities are ordered when it comes to audio fidelity:

1. Room Acoustics

2. Speakers

3. Electronics & Digital

Going from the back: Assuming you don't get the cheapest of the cheapest and don't abuse the gear by making it do things it wasn't built for, electronics and digital audio nowadays are transparent. That means it essentially sounds the same if operated within spec. Even a 0,50 € IC will have distortion figures so staggeringly low they are below human perception, and equipment is getting better still. A decent opamp can have distortion figures like 0.005 % THD with a linear frequency response all the way up to radio frequencies. There can be challenges with driving very weird speakers or headphones, but if you have the right combination of gear it doesn't have to be expensive to be indistinguishably good in its audio performance.

This means speakers are way more important than the electronics before them. Their distortion numbers are multiple orders of magnitude higher (in the ballpark of 3% THD), their frequency response is inherently problematic (often many dBs up and down even in expensive speakers), they will have different beaming characteristics at different frequencies, small speakers lack bass, placement is essential, etc. So getting good speakers is important.

But all of this is dwarfed by the impact of acoustics. The position of the speakers alone makes a huge difference. The impact of an acoustically untreated space is severe: you can get a completely smeared time response with deep nulls of 20dB and more while other frequencies are highly resonant. Even a budget speaker won't have problems of that magnitude.

So get some ok electronics, even more ok speakers, but invest the bulk of the money/time into the setup of the room itself.

Many audiophiles have that priority list reversed. Room acoustics suck. You need to measure a lot, add ugly absorbers in inconvenient places, can't place speakers where they look nice and conserve space, but need to place them where they work well acoustically, there is no ideal solution and everything is a compromise. So buying a gold plated HDMI cable and imagining the improvement appears to be better. Only that you might be doing it in a room where a positional difference of a few centimeters changes the frequency response of the listening position massively.
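To put rough numbers on the comb filtering point, here is a minimal sketch under a strong simplification: one reflection, as loud as the direct sound, arriving after a longer path. Real rooms have many weaker reflections, so the nulls are shallower and messier. The function name and the example path lengths are made up for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees C


def comb_null_frequencies(path_difference_m: float, max_hz: float = 20_000.0) -> list:
    """Frequencies where direct sound and a single equal-level reflection cancel.

    The reflection arrives delay = path_difference / c later; cancellation
    happens where that delay is an odd number of half periods.
    """
    if path_difference_m <= 0:
        raise ValueError("path difference must be positive")
    delay_s = path_difference_m / SPEED_OF_SOUND
    nulls = []
    k = 0
    while True:
        freq = (2 * k + 1) / (2 * delay_s)
        if freq > max_hz:
            break
        nulls.append(freq)
        k += 1
    return nulls


# Moving the listening position so the extra path grows from 1.00 m to 1.05 m
# shifts every null, which is why a few centimeters can matter.
for extra_path in (1.00, 1.05):
    first_three = [round(f, 1) for f in comb_null_frequencies(extra_path)[:3]]
    print(extra_path, first_three)
```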

[−]duped · 2026-07-01 Wed 18:05 UTC · link

Higher sample rates are lower latency for the same block size and resampling is not "free" (pick 2: performance, aliasing, latency) so there can be advantages to working with audio archived at higher sample rates.

But all the advantages come down to professional or editing use cases. There's next to zero advantage to using it as a storage format for listening. Just like 24 bit audio (do you have an amp with 96dB SNR?).
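For reference, the textbook dynamic range of an ideal N-bit quantizer with a full-scale sine is about 6.02·N + 1.76 dB. The sketch below only evaluates that formula; it says nothing about any real converter or amplifier, and the function name is arbitrary.

```python
import math


def ideal_quantizer_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer for a full-scale sine wave.

    Equivalent to the usual 6.02 * N + 1.76 dB rule of thumb.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(bits, round(ideal_quantizer_snr_db(bits), 2))
```

So roughly 98 dB is already 16-bit territory, and the analog chain would need around 146 dB to make full use of 24 bits.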

Just personally, I have seen little evidence (personally, professionally, or academically) that there is any advantage for lossless audio for consumer applications. For professional applications there are plenty, and it's endlessly tiring to convince people that "no, actually I need 96kHz for my use case."

Where the audiophiles have _some_ argument here is the design of reconstruction filters which I've heard alleged can perform better in the audible frequency range if the stop band is outside of it. But I have never personally tested this, nor cared enough to. But the theory is sound.

Whether or not it's perceptible depends on what you're measuring, though. In theory, there should be perceptual differences in sound localization if your DAC's reconstruction filter is at 24kHz vs 48kHz since it will change the group delay in a critical frequency region, where you'll get sound at >~2kHz arriving later at the lower sample rate. I think it would be extremely hard to test this though, because humans are really shitty at sound localization to begin with, and practically speaking most recorded material is processed to shit in that frequency range to intentionally decorrelate the channels for the perception of "width."

[−]amluto · 2026-07-01 Wed 20:50 UTC · link

> Higher sample rates are lower latency for the same block size

This is a truly bizarre statement. On the one hand, of course higher sampling rates are lower latency for the same block size measured in samples. But all sampling rates have (almost [0]) identical latency for the same block size measured in time and lower sampling rates allow less computation for those shorter blocks.

[0] If you are concerned about needing to know future samples in order to calculate the actual signal amplitude at a time between samples, then (a) this matters less at higher sampling rates and (b) this is at most a small number of samples and we're talking about block sizes that presumably exceed, say, 5, so this isn't really a big deal.
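The two ways of measuring block size can be compared with a small sketch. It only covers the buffering latency of a single block (no driver, converter, or resampler delay), and the function names are arbitrary.

```python
def block_latency_ms(block_size_frames: int, sample_rate_hz: int) -> float:
    """Time it takes to fill one block of the given size."""
    return 1000.0 * block_size_frames / sample_rate_hz


def frames_for_latency(target_ms: float, sample_rate_hz: int) -> int:
    """Block size in frames that corresponds to a target latency."""
    return round(target_ms * sample_rate_hz / 1000.0)


# Same block size in frames: the higher rate has lower latency.
for rate in (44_100, 48_000, 96_000):
    print(rate, round(block_latency_ms(256, rate), 3))

# Same block size in time: the higher rate just needs more frames per block.
for rate in (48_000, 96_000):
    print(rate, frames_for_latency(5.0, rate))
```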

[−]duped · 2026-07-01 Wed 23:32 UTC · link

The unit of a block size is samples (frames, technically), not seconds. When configuring audio devices for playback you tune both sample rate and block size for latency. It used to be far more common to tune sample rate than block size alone for tracking. This is getting into the weeds of actual devices though.

Also to your point, this is why compliant peak meters use a mandatory 4x upsampling at 48k.

[−]Dylan16807 · 2026-07-01 Wed 20:56 UTC · link

> Higher sample rates are lower latency for the same block size

And if your goal is latency, it makes far more sense to change the block size rather than the sample rate.

> But all the advantages come down to professional or editing use cases.

That sounds about right.

[−]toast0 · 2026-07-01 Wed 20:57 UTC · link

> Just personally, I have seen little evidence (personally, professionally, or academically) that there is any advantage for lossless audio for consumer applications

I think the advantage of lossless audio is for archival: rip once, archive as lossless; then you can reencode your library with the latest and greatest lossy encoders over time, or just use the lossless if your player can manage it, cpu and storage is less of a limiting factor for players than 20 years ago.

I don't know how many people are actually managing their libraries these days though, so I dunno if it makes a huge difference.

[−]duped · 2026-07-02 Thu 02:00 UTC · link

I wouldn't call archiving a consumer application but I understand the point. Really it gets back to the word: fidelity. Some say it means "truth" but really it's Latin for faithful or in the context of audio, perceptually identical (a faithful representation). Even among highly trained and skilled listeners, lossy codecs are faithful and imperceptible.

[−]jpc0 · 2026-07-02 Thu 06:16 UTC · link

Group delay is a poor argument.

Unless you also have a pretty decent monitoring system the group delay of the speakers isn't going to be consistent so the filters before them wouldn't matter all that much...

Even in that case I would have a hard time believing that any human in a blind test would be able to perceive a group delay of even 360deg above 2k...

You are talking about sub-millisecond differences in the time frequency content arrives at the ears, just tilting your head slightly will have a greater impact...
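To get a feel for the magnitudes, a phase shift can be converted to time at a given frequency. Strictly speaking this is phase delay (phase divided by frequency), not group delay (the slope of phase over frequency), so treat it as an order-of-magnitude sketch only. The function name is arbitrary.

```python
def phase_shift_to_ms(phase_degrees: float, frequency_hz: float) -> float:
    """Time shift equivalent to a phase shift at one frequency (phase delay)."""
    return 1000.0 * (phase_degrees / 360.0) / frequency_hz


# A full 360 degree shift at a few frequencies from 2 kHz upward.
for freq in (2_000, 5_000, 10_000, 20_000):
    print(freq, phase_shift_to_ms(360.0, freq))
```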

[−]someonebaggy · 2026-07-01 Wed 23:51 UTC · link

VHS doesn't store audio in samples nor does it have 490 lines or 30 G(?)PS. NTSC uses 525 lines per frame and PAL uses 625, both with interlacing at 60 fields per second. The VHS system is analog for audio and video, though analog video has discrete lines, and VHS records discrete stripes on the tape which should be one field each.

44100 was chosen for CD, as 20kHz upper limit of human hearing, doubled for Nyquist theorem, plus a 10% guard band so that anti-aliasing filters don't have to be made of magical fairy dust, plus a bit (maybe to make it relatively prime with something else in the system).

[−]pezezin · 2026-07-02 Thu 01:28 UTC · link

The parent comment is talking about this: https://en.wikipedia.org/wiki/PCM_adaptor

The first digital audio systems encoded the audio as a black-and-white video signal on video tapes. 44100 Hz was selected as it was the highest sampling rate achievable on both NTSC and PAL video tapes.
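The arithmetic usually quoted for PCM adaptors can be checked in a few lines. The line counts (245 usable lines per field for 60-field monochrome NTSC, 294 for 50-field PAL) are the commonly cited figures, and the function name is arbitrary.

```python
SAMPLES_PER_LINE = 3  # audio samples stored on each usable video line


def pcm_adaptor_rate(lines_per_field: int, fields_per_second: int) -> int:
    """Audio sample rate that fits the given video line structure."""
    return lines_per_field * fields_per_second * SAMPLES_PER_LINE


ntsc_rate = pcm_adaptor_rate(245, 60)  # 60-field monochrome NTSC
pal_rate = pcm_adaptor_rate(294, 50)   # 50-field PAL
print(ntsc_rate, pal_rate)

# The "490 lines x 3 samples x 30" form upthread is the same NTSC numbers
# counted per frame (two fields) instead of per field.
print(490 * 3 * 30)

# Nyquist rate for a 20 kHz upper limit plus a 10% guard band, for comparison.
print(2 * 20_000 * 110 // 100)
```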

[−]izacus · 2026-07-01 Wed 17:27 UTC · link

Yes, pretty much all new hardware uses it as default output setting as well (by that I mean laptops, phones, smart speakers, etc.)

[−]xuhu · 2026-07-01 Wed 17:52 UTC · link

For one, audio transcription services that use Whisper will sample the input down to 16Khz mono first.

[−]legdoge · 2026-07-01 Wed 17:57 UTC · link

AAC has a strange quirk that the window size is dependent on the sampling rate, thus requiring a complete psychoacoustics reoptimization of all encoder parameters for each sampling rate, since a 20msec window sounds very different than a 60msec window, to human ears.

This was of course fixed in Opus.
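A quick way to see the effect: AAC-LC codes a fixed 1024 samples per frame, so the frame covers a different stretch of time at every sampling rate. The sketch below only does that division; the function name is arbitrary.

```python
AAC_LC_FRAME_SAMPLES = 1024  # samples per channel in one AAC-LC frame


def aac_frame_duration_ms(sample_rate_hz: int) -> float:
    """Duration of one AAC-LC frame at the given sampling rate."""
    return 1000.0 * AAC_LC_FRAME_SAMPLES / sample_rate_hz


for rate in (16_000, 32_000, 44_100, 48_000, 96_000):
    print(rate, round(aac_frame_duration_ms(rate), 2))
```

Opus, by contrast, defines its frame sizes as durations (20 ms by default), so the time span of a frame does not depend on the input rate.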

[−]pipo234 · 2026-07-01 Wed 18:00 UTC · link

48kHz makes alignment between video and audio so much easier. (I.e.: Lip synchronization after edits)
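One way to look at the alignment argument is to count audio samples per video frame with exact fractions. The frame rates listed are just common examples, and the function name is arbitrary.

```python
from fractions import Fraction


def samples_per_video_frame(sample_rate_hz, fps) -> Fraction:
    """Exact number of audio samples that span one video frame."""
    return Fraction(sample_rate_hz) / Fraction(fps)


FRAME_RATES = {
    "24": Fraction(24),
    "25": Fraction(25),
    "30": Fraction(30),
    "29.97": Fraction(30000, 1001),
}

for rate in (44_100, 48_000):
    for name, fps in FRAME_RATES.items():
        spf = samples_per_video_frame(rate, fps)
        whole = spf.denominator == 1  # True when cuts land on sample boundaries
        print(rate, name, float(spf), "whole" if whole else "fractional")
```

In this list 48 kHz gives whole numbers at 24, 25 and 30 fps, 44.1 kHz misses at 24 fps, and neither rate is whole at 29.97 fps.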

[−]Joeboy · 2026-07-01 Wed 18:15 UTC · link

I think the closest thing to an actual "standard" is AES5-2018, "Recommended practice for professional digital audio".

Abstract:

> A sampling frequency of 48 kHz is recommended for the origination, processing, and interchange of audio programs employing pulse-code modulation. Recognition is also given to the use of a 44.1-kHz sampling frequency related to certain consumer digital applications, the use of a 32-kHz sampling frequency for transmission-related applications, and the use of a 96-kHz sampling frequency for applications requiring a higher bandwidth or more relaxed anti-alias filtering. This revision further quantifies the preferred choices for higher sampling frequencies.

Edit: From my personal perspective, 44.1kHz is a legacy minor annoyance

[−]daneel_w · 2026-07-01 Wed 18:27 UTC · link

Yes and no. It is the standard for audio in film, which explains the author's focus. But is the audio CD bigger and more "standarder" than DVD and Blu-Ray? I think they're equals, and I personally think this encoder only makes sense for video content. Given all the caveats the author mentions (in particular about the sample rate) I would steer clear of using it when ripping CDs.

[−]lesscraft · 2026-07-01 Wed 18:42 UTC · link

Pretty much all DACs run at 48Khz by default due to operating systems picking it as a sane default.

[−]bpye · 2026-07-01 Wed 19:56 UTC · link

Pipewire will quite happily pipe through audio without resampling if it is the only source on a system. You can see this by running pw-top and using speaker-test with various sample rates.

[−]sneezychl · 2026-07-01 Wed 17:25 UTC · link

A very welcome addition, hopefully I can replace fdk-aac

[−]cogman10 · 2026-07-01 Wed 17:27 UTC · link

Man what a showcase for Opus this is.

Don't get me wrong, this sort of thing is a valuable exercise and we are better off with better encoders for these older codecs. But look at the numbers for Opus on this benchmark. It simply blows all the AAC encoders out of the water even at 64 kbps.

[−]ndiddy · 2026-07-01 Wed 17:46 UTC · link

The biggest advantage for having a good AAC encoder isn't efficiency, it's that for nearly the past 2 decades the de facto standard for live streamed video has been RTMP with H.264 video and AAC audio. There is basically no support for any other codecs. If you want to send a video stream to Youtube or Twitch, you will be sending H.264 and AAC. If you want an idea of how ubiquitous this is, I just checked in OBS and it will not even let you select different video and audio codecs in streaming mode, it just (correctly) assumes that anybody who's streaming will be streaming H.264 and AAC.

[−]repelsteeltje · 2026-07-01 Wed 17:55 UTC · link

Sample-accurate editing with AAC is a pain though. Especially if you also have video, because frame rates are usually incompatible.

If you want flexibility without fully transcoding both audio and video, Opus is your friend
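A rough sketch of the frame-rate mismatch: how often an audio frame boundary coincides with a video frame boundary, i.e. where a cut needs no re-encoding on either side. It assumes 1024-sample AAC-LC frames at 48 kHz, 20 ms Opus frames, and 25 fps video; the function name is arbitrary.

```python
from fractions import Fraction
from math import gcd


def realign_period(audio_frame_s: Fraction, video_frame_s: Fraction) -> Fraction:
    """Shortest duration that is a whole number of both frame lengths.

    For fractions in lowest terms, lcm(a/b, c/d) = lcm(a, c) / gcd(b, d).
    """
    a, c = audio_frame_s.numerator, video_frame_s.numerator
    b, d = audio_frame_s.denominator, video_frame_s.denominator
    lcm_numerators = a * c // gcd(a, c)
    return Fraction(lcm_numerators, gcd(b, d))


video_frame = Fraction(1, 25)        # 25 fps video
aac_frame = Fraction(1024, 48_000)   # AAC-LC frame at 48 kHz
opus_frame = Fraction(20, 1000)      # Opus frame at its default 20 ms

for name, frame in (("AAC", aac_frame), ("Opus", opus_frame)):
    period = realign_period(frame, video_frame)
    print(name, float(period), "s =", period / frame, "audio frames =",
          period / video_frame, "video frames")
```

With these assumptions AAC and video boundaries line up only every 0.32 s (15 audio frames, 8 video frames), while Opus lines up on every video frame.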

[−]ksncksmckwkf · 2026-07-01 Wed 18:15 UTC · link

Opus is your friend as long as the software you’re using supports it. Besides, Apple’s AAC-LC can beat out Opus in low-bitrate scenarios.

Whether you like it or not, AAC is still the standard.

[−]CharlesW · 2026-07-01 Wed 17:56 UTC · link

Plus, at 96+ kbps (assuming an Apple-quality AAC-LC encoder) Opus loses its quality advantage. So at higher bitrates, the benefit of choosing Opus is that encoders/decoders are royalty-free.

[−]pkulak · 2026-07-01 Wed 21:16 UTC · link

Am I reading that chart wrong? I see Opus ahead across every bitrate.

[−]CharlesW · 2026-07-01 Wed 23:35 UTC · link

The evaluation tools used are helpful for encoder development, but at best they're imperfect proxies for human perception, and their predictions are often inconsistent with the human experience. I assume that statements like "apparently the best AAC encoder" aren't meant to be taken too seriously, since everybody who does this stuff knows that ABX/MUSHRA tests with real humans are what tell the tale.

On Opus vs. AAC specifically, there's a long history of studies like https://www.researchgate.net/publication/301428302_Perceived... to help answer that question. (There are interesting charts at the top of page 1175.)

[−]booi · 2026-07-01 Wed 18:35 UTC · link

Also the fact that hardware-accelerated AAC and even full AAC offload is ubiquitous in modern-ish hardware. I think my rice cooker can play AAC audio

[−]lesscraft · 2026-07-01 Wed 18:41 UTC · link

No one really offloads AAC, apart from Apple. Opus can be decoded on very cheap microcontrollers entirely in software using the reference library.

[−]philistine · 2026-07-01 Wed 22:47 UTC · link

On a microcontroller doing nothing else sure. But on a phone, a tablet, a laptop, you absolutely want hardware decode to preserve your battery life.

[−]nulld3v · 2026-07-02 Thu 01:45 UTC · link

That's their point though. Basically no modern phone/laptop/tablet other than Apple offloads audio decoding (of any codec) to hardware. You can check this on Android phones by installing the Codec Info app.

[−]cogman10 · 2026-07-02 Thu 03:01 UTC · link

Audio decode is extremely cheap. It's true that a hardware implementation will be more efficient, but really not a whole lot more.

[−]AnggaSP · 2026-07-02 Thu 05:29 UTC · link

They absolutely do; Android did it with low-power audio. They even go a step further by offloading Bluetooth processing to the DSP.

I’m not in this space anymore, but as of the Android 5-6 era, AAC and BT are offloaded to the Hexagon DSP on Qualcomm devices.

[−]stefan_ · 2026-07-01 Wed 21:16 UTC · link

I think often of how all it would have taken was a bomb for the 10 or so people that years ago at some browser vendor consortium out of pure self centeredness went „nah lets fragment“. We could have saved many many collective years, electricity and eyeballs simply watching the most basic content.

[−]derf_ · 2026-07-01 Wed 22:21 UTC · link

At one point in I think 2012 three of us who normally all live in different countries were riding in the same car in Australia. We advised the driver to be extra careful (she was dating one of us, so incentives were aligned).

But it is nice to hear that you have been thinking of us, too.

[−]jshier · 2026-07-01 Wed 21:39 UTC · link

YouTube actually supports H.265 and VP9 ingest, depending on the streaming protocol. I can actually stream 4K@60 H.265 from my Mac Studio with < 5% CPU usage due to the hardware encoder support in OBS.

https://developers.google.com/youtube/v3/live/guides/ingesti...

[−]ndiddy · 2026-07-02 Thu 01:48 UTC · link

Nice, glad some sites are finally trying to move forward. It looks like they only support H.265 video with AAC audio, so this should still be helpful for people who are streaming H.265. https://developers.google.com/youtube/v3/live/guides/hls-ing...

[−]someonebaggy · 2026-07-01 Wed 23:41 UTC · link

The RTMP protocol comes from Adobe Flash, which only supported a limited set of codecs, the only still useful ones being H264 and AAC. Nobody published the needed protocol extension "enhanced RTMP" until 2022 and it still isn't supported widely. RTMP is not a generic container for any codec, like Matroska - RTMP is tightly coupled to the codec.

[−]palmotea · 2026-07-01 Wed 17:49 UTC · link

> Man what a showcase for Opus this is.

I take it you mean this Opus (https://en.wikipedia.org/wiki/Opus_(audio_format)) not that Opus (https://en.wikipedia.org/wiki/Claude_(AI)).

I read almost all the way through your comment thinking there was a decent probability you were saying this new AAC encoder was written with Claude Opus.

[−]theandrewbailey · 2026-07-01 Wed 18:31 UTC · link

I've never been an AI guy, and have more fascination with audio. I've long stopped being excited when I read "Opus" on HN. It's refreshing when it turns out to be the audio codec.

[−]Aachen · 2026-07-01 Wed 23:30 UTC · link

To be fair, Opus was never a great name. I always feel the need to specify further when using it outside of a clear context of music codecs (also way before Claude was announced). Love it in every other way though

[−]pezezin · 2026-07-02 Thu 01:21 UTC · link

No, he was talking about the Opus Dei because this code quality can only be reached by God himself /jk

[−]skydhash · 2026-07-01 Wed 17:55 UTC · link

I would like Opus, but I’m using a subsonic client on iOS and my choice has been Flac (Alac?), MP3, or AAC. Opus wouldn’t play (There are some that supported it, but I didn’t like their UX).

[−]CharlesW · 2026-07-01 Wed 17:59 UTC · link

You might like Poppy (in beta), which supports all media servers (including OpenSubsonic/Navidrome) and Opus as a first-class music format. https://www.reddit.com/r/PoppyApp/comments/1tiyki0/about_pop...

[−]jck86 · 2026-07-01 Wed 17:56 UTC · link

Choosing a lossy audio codec has become such a no brainer. Either use opus and be done with it or if for some reason opus cannot be used then use aac for compatibility with insane high bitrate for good quality without having to do research on what encoder and mode to pick.

Still having a good quality and default aac encoder is great. Though I don't get why it is mainly CBR.

[−]BoingBoomTschak · 2026-07-01 Wed 19:15 UTC · link

Eh, I prefer Vorbis mostly because it's still competitive at transparent bitrates (esp. with Aotuv patches) and benefits from a much saner volume normalization spec (simply transfer RG 2.0 tags from the FLAC source): Xiph decided to exclude peak information from Opus' spec while adding that weird thing where album gain is stored in the format header and additional track gain in the metadata.

It also uses less battery on my Rockbox'd Clip+.

[−]jck86 · 2026-07-01 Wed 19:32 UTC · link

For replaygain purposes simply ignore the spec and use RG 2.0 tags? That works with Opus too and hardly any players support Opus R128 gain anyway. For very low spec devices Vorbis would do a bit better though. For legacy devices legacy codecs can be a better fit indeed.

But would you really store new material encoded in Vorbis just to be able to play it on an old device? Vorbis can sound fine, even at lower bitrates like 128k or 96k, but Opus would sound much better. So perhaps then use Vorbis at higher bitrates like +192k? I prefer Vorbis to AAC but at that bitrate minor intricacies of the container format become more important than the codec because audio quality wise they are near indistinguishable.

[−]a1o · 2026-07-01 Wed 18:14 UTC · link

I think the biggest issue with Opus is the problem with its specification being lacking, see:

https://nothings.org/stb/stb_opus.html

This essentially causes opus to never be used in games or in things in stores that may have issues with specific licenses.

[−]kderbe · 2026-07-01 Wed 18:33 UTC · link

This essay says it's not possible to make a public-domain implementation of Opus. But it could be released under BSD (as libopus is), which is fine for games, as evidenced by the Licenses section of the credits in many games.

[−]scratcheee · 2026-07-01 Wed 18:52 UTC · link

That’s going a bit far. I’m in the games industry and have used opus regularly, it’s a great codec for games, often the hardware decoding is so restricted that we’re using software regardless so we might as well use something like opus.

The licensing restriction is unfortunate, but only restrictive for those with very specific goals, under normal conditions BSD is a wonderful license for game devs since you’re free to use the code and only have to add an acknowledgement somewhere.

I suppose a public domain game might hit the same limitation, though as a non-lawyer I would guess the chance of anyone with standing trying to sue anyone implementing from this spec is realistically zero (though I don’t fault stb for being unwilling to roll those dice!)

[−]duskwuff · 2026-07-01 Wed 21:19 UTC · link

> under normal conditions BSD is a wonderful license for game devs since you’re free to use the code and only have to add an acknowledgement somewhere.

And it's not as though libopus is an outlier in using a BSD license. A lot of other commonly used libraries have similar licenses; a few examples that come to mind which are likely to show up in games are zlib, curl, Lua, and SDL.

[−]chaosharmonic · 2026-07-01 Wed 21:38 UTC · link

libopus isn't even an outlier in using it for a media format specifically. See: everything coming out of the Alliance for Open Media

[−]ack_complete · 2026-07-01 Wed 19:11 UTC · link

Most games use the sound support that comes with their game engine or choice of sound system, so I don't think the lack of an STB version is an issue. Performance is more of a problem. Audiokinetic, the makers of the popular Wwise audio system, estimate that Opus takes ~3-5x the CPU of Vorbis:

https://www.audiokinetic.com/en/community/blog/a-guide-for-c...

[−]sbseitz · 2026-07-01 Wed 23:12 UTC · link

Most of my collection is Opus 256K, the only downside is support. A lot of tools like Bliss/Roon don't support it :(

[−]arikrahman · 2026-07-02 Thu 03:54 UTC · link

Took me a second to realize you were talking about the encoder not the model before going into this article

[−]ndiddy · 2026-07-01 Wed 17:29 UTC · link

Nice, I'm looking forward to seeing how this performs in practice. FFmpeg's previous AAC encoder produced poor quality output and often had irritating chirping artifacts, so I've always had to install Apple's Core Audio encoder on any computer I do video recording on to get decent sound. I've done A/B/X comparisons and found that a 320kbps MP3 sounds better than a 320kbps AAC encoded by FFmpeg, but about the same as a 256kbps AAC encoded by Core Audio. If installing Core Audio is no longer necessary, that'll be a huge improvement and people who use something like OBS to do screen recordings or streaming will get a massive sound quality boost the next time they update.

[−]repelsteeltje · 2026-07-01 Wed 17:49 UTC · link

Why not use a lossless codec if you care about quality? Or use Opus, decent for speech and works pretty much anywhere these days.

[−]CharlesW · 2026-07-01 Wed 18:11 UTC · link

> Why not use a lossless codec if you care about quality?

(1) Lossy codecs are transparent at half the file size (or less) of FLAC/ALAC.

(2) AAC (strictly, AAC-LC) is universal, where FLAC and Opus are not yet there.

[−]ksncksmckwkf · 2026-07-01 Wed 18:11 UTC · link

You can care about quality to the extent that a lossy codec allows. Lossless is not always necessary or wanted. This is like saying “why care about transcoding quality when you can keep the video as is?”. There’s a myriad of use cases and preferences at play here.

[−]cosmic_cheese · 2026-07-01 Wed 18:17 UTC · link

There are a ton of older, but still perfectly usable devices that support AAC well but not Opus.

[−]kderbe · 2026-07-01 Wed 18:21 UTC · link

In the Hydrogenaudio discussion thread's metrics table, the new encoder scores better than Core Audio. But this is at constant bitrate (CBR) [edit: maybe not? see lesscraft's reply below]. Core Audio also has variable bitrate modes (TVBR) which the new encoder lacks.

So maybe Core Audio will continue to be the best when TVBR is available, but I'm hopeful the new FFmpeg encoder will be "good enough", especially if more folks find and contribute problem samples to help tune it.

[−]lesscraft · 2026-07-01 Wed 18:39 UTC · link

The benchmarks were made using afconvert on OSX with the default VBR settings.

[−]madars · 2026-07-01 Wed 18:22 UTC · link

A useful project related to Apple's Core Audio is qaac - it wraps the iTunes Windows DLLs in a standalone command-line encoding tool. I believe it even works under Wine on Linux: https://web.archive.org/web/20250814194428/https://www.andre... So you don't need a Mac or even a full iTunes installation to get high quality AAC encoding.

[−]winstonwinston · 2026-07-01 Wed 19:35 UTC · link

I was using FDK AAC encoder, I didn’t know Apple encoder was available for systems other than Apple. Though I have once compared AAC FDK to Apple AAC at 192kbps, and couldn’t tell the difference, while the old FFmpeg AAC encoder fell apart at this bitrate.

[−]ndiddy · 2026-07-02 Thu 01:45 UTC · link

It gets installed when you install iTunes. If you don't want to install iTunes, you can pull out the codec installer by opening an old version of the iTunes installer in 7-zip and extracting the MSI. Here's a copy I keep around for whenever I have to do a screen recording on a new computer, it's signed by Apple so you don't have to trust me. https://www.infochunk.com/obs/AppleApplicationSupport64.msi

[−]moniosi · 2026-07-01 Wed 21:18 UTC · link

i will never understand apples cuckoldry for proprietary codecs, if it wasn't for their adoption of h265 we would live in the av1 utopia

[−]refulgentis · 2026-07-01 Wed 18:09 UTC · link

Older I get, more it seems it’s possible to ping pong between rewrites for good reasons (ex. here, metric maxes but I find it hard to believe VBR and not-48 kHz are silly things and not worth investing in)

[−]binaryturtle · 2026-07-01 Wed 21:49 UTC · link

I always encode my AAC with VBR. Why wouldn't you, right? I guess I'll stick to apple or fdkaac for now.

[−]timcobb · 2026-07-02 Thu 02:18 UTC · link

Why do you record AAC?

[−]Marsymars · 2026-07-02 Thu 05:19 UTC · link

It’s better than mp3, and my car supports it from a usb stick.

[−]JSR_FDED · 2026-07-01 Wed 19:19 UTC · link

It’s fascinating so much of this comes down to the developer’s own ears - disturbing and quite cool at the same time how subjective this is

[−]ant6n · 2026-07-01 Wed 23:48 UTC · link

The table and comparison use “Google's new Zimtohrli, ViSQOL, and my own hearing”

[−]esafak · 2026-07-01 Wed 20:13 UTC · link

HA, a blast from the past, when audio encoders were making strides and collecting mp3s was a thing. Same for video encoders.

[−]amluto · 2026-07-01 Wed 20:52 UTC · link

It was kind of fun being able to easily distinguish 128kbps MP3 from the source audio. (Some early encoders were really bad.)

[−]ant6n · 2026-07-01 Wed 22:21 UTC · link

This is truly a representative of the old internet: somebody codes up the best AAC encoder ever, and the first response comes from some admin, and it's some bickering about 48Khz vs 44Khz.

[−]SideQuark · 2026-07-02 Thu 01:21 UTC · link

It’s not that cynical. The author didn’t test on the most common rate in use, so it would be ludicrous for any serious project to wholesale replace a decades old working pipeline with it. It makes perfect sense to wait till due diligence is done.

[−]ant6n · 2026-07-02 Thu 06:47 UTC · link

It's not cynical. It's dismissive. Especially given that these codecs work in the frequency domain anyway.

>>...use 48Khz if you want the best quality.

>Yet most of the worlds audio is 44KHz...

[−]ximdotro · 2026-07-01 Wed 22:23 UTC · link

Nice, I can’t wait to see how this turns out in practice.

[−]functionmouse · 2026-07-01 Wed 22:25 UTC · link

Last time I used ffmpeg to encode songs for my iPod nano they were broken; playback was interrupted by pops and clicks every few seconds. I wonder if this is fixed now?

[−]pseudosavant · 2026-07-02 Thu 01:00 UTC · link

I applaud a new/better FFMPEG AAC encoder, but there are two pretty massive caveats that are mentioned in the specifics that need to be called out:

- CBR only

- Only optimized for 48khz sampling

Not being able to do quality-based variable bitrate encoding is a major gap, and since all of the CD audio in the world is at 44.1k sampling, that seems like a huge miss too.

[−]lesscraft · 2026-07-02 Thu 02:18 UTC · link

You can use -q:a for "true" VBR, but its metrics are a few percent lower (imperceptible, we still win). "The benchmarks I posted were done mainly on 44.1Khz. I tuned by ear on 48Khz data though, so some of the windowing/transient logic is tied to 48Khz. It translated to 44.1Khz well enough that I left it as-is, since the timing difference isn't that large."
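
To make the two modes concrete, here is a minimal sketch of how the command lines differ. It only builds the argument list. The helper name, the file names, and the example bitrate and quality values are placeholders, and how a given -q:a value maps to quality in the new encoder is not something stated in this thread. The flags themselves (-c:a aac, -b:a, -q:a, -ar) are standard ffmpeg options.

```python
def aac_args(src, dst, bitrate=None, quality=None, sample_rate=None):
    """Build an ffmpeg argument list for the built-in AAC encoder.

    Pass exactly one of `bitrate` (e.g. "256k", bitrate-targeted mode)
    or `quality` (value for -q:a, quality-based VBR).
    """
    if (bitrate is None) == (quality is None):
        raise ValueError("pass exactly one of bitrate or quality")
    args = ["ffmpeg", "-i", src, "-c:a", "aac"]
    if sample_rate is not None:
        # Optional resample, e.g. 48000 to match the rate it was tuned on.
        args += ["-ar", str(sample_rate)]
    if bitrate is not None:
        args += ["-b:a", bitrate]
    else:
        args += ["-q:a", str(quality)]
    args.append(dst)
    return args


# Placeholder file names and settings, purely for illustration.
print(" ".join(aac_args("in.wav", "cbr.m4a", bitrate="256k")))
print(" ".join(aac_args("in.wav", "vbr.m4a", quality=2)))
```

Passing the returned list to subprocess.run(args, check=True) would run the encode, assuming ffmpeg is on the PATH.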