zaptrem · 2026-06-08T18:25:24 1780943124

Needs more WebGL spinning rubik's cube

cubano · 2026-06-09T02:54:32 1780973672

Well...what about a <BLINK> as well? For gramps.

zaptrem · 2026-06-08T02:47:47 1780886867

Can you include GPT 5.5 non-pro (extra high thinking I guess) in your comparison? GPT Pro is the "I am willing to torch cash for a sooometimes slightly better result" option, not the one people are actually expected to use daily. That's probably part of the reason it's not in Codex.

SwellJoe · 2026-06-08T03:07:08 1780888028

It's already there. It performed well. And, it'll be in the replication run later, as well.

zaptrem · 2026-06-02T07:48:22 1780386502

OOM on CUDA GPUs is relatively graceful (the process crashes). However, on macOS if torch MPS tries to allocate too much memory, the whole kernel will simply lock up and the only option is to reboot the computer. I have no idea why Apple doesn’t reserve memory for stuff like the OOM/kernel watchdog, but it seems they either don’t or there is a bug.

zaptrem · 2026-05-26T04:20:12 1779769212

Love me some JSD. Here is a problem most people don't consider with generative modeling (e.g., AI text, image, music, video models): basically all standard pre-training algorithms for generative models (i.e., cross entropy, basically all diffusion/flow formulations) are closer to a Forward KL divergence. In other words, given limited capacity the model will try to stretch itself to cover every mode. This gives you a jack of all trades (lots of knowledge and diversity), but a master of none (you get blurry images and text filled with nonsense).

The real magic in generative modeling comes from the post training process that comes after, which usually (e.g., RLHF) approximates Reverse KL (given limited capacity, try to perfectly cover what you can, but it's fine to drop the rest entirely). This gives amazing results, but is also the cause of AI oddities like the "AI Image Pixar Look", many of the verbal tics of LLMs, and all AI music using the same small set of voices. Jensen-Shannon Divergence sits right in the middle of Forward and Reverse KL and is what many GANs are claimed to approximate. Ideally, it is a better trade-off between diversity and fidelity.
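
A rough sketch of the mode-covering vs. mode-dropping behavior described above, on a toy 1-D problem. Everything here is an assumption made for illustration: the bimodal target `p`, the two candidate fits `q_wide` and `q_narrow`, and the grid. It is not how any real model is trained. It just evaluates the three divergences for a "cover both modes badly" fit and a "nail one mode, drop the other" fit.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def normalize(weights):
    """Turn non-negative weights into a discrete probability distribution."""
    total = sum(weights)
    return [w / total for w in weights]

def kl(a, b):
    """KL(a || b) for discrete distributions on the same grid."""
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

def jsd(a, b):
    """Jensen-Shannon divergence: average KL of a and b to their midpoint."""
    m = [0.5 * (ai + bi) for ai, bi in zip(a, b)]
    return 0.5 * kl(a, m) + 0.5 * kl(b, m)

# Toy setup (all values are illustrative assumptions).
grid = [-10 + 0.05 * i for i in range(401)]

# "Data" distribution with two sharp modes at -3 and +3.
p = normalize([0.5 * normal_pdf(x, -3, 0.5) + 0.5 * normal_pdf(x, 3, 0.5) for x in grid])

# Limited-capacity model, option 1: one wide blob stretched over both modes.
q_wide = normalize([normal_pdf(x, 0, 3) for x in grid])

# Limited-capacity model, option 2: one sharp blob on a single mode.
q_narrow = normalize([normal_pdf(x, 3, 0.5) for x in grid])

for name, q in [("wide", q_wide), ("narrow", q_narrow)]:
    print(f"{name:>6}: forward KL(p||q)={kl(p, q):.3f}  "
          f"reverse KL(q||p)={kl(q, p):.3f}  JSD={jsd(p, q):.3f}")
```

In this toy case, forward KL scores the wide fit as better (it heavily punishes putting almost no mass on a real mode), while reverse KL scores the narrow fit as better (it punishes putting mass where the data has none). JSD is symmetric and bounded by ln 2, so neither failure is punished without limit.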

zaptrem · 2026-05-19T21:23:49 1779225829

V4-Pro is about 2.4× total params and 1.3× active params of V3.2.

zaptrem · 2026-05-06T04:33:58 1778042038

Seems pretty clear: Claude and Codex were getting a lot of free publicity by instructing their models to do the same, and MS wanted similar results. However, a bug caused this to be applied to all commits instead of only Copilot-influenced commits.

zaptrem · 2026-04-24T03:50:15 1777002615

I bumped from $20 -> $100 today but the Codex CLI lacking code rewind and "you can change files but ask me every time" mode from Claude Code is quite annoying. Sometimes I want to code, not vibe code lol.

zaptrem · 2026-04-20T18:26:19 1776709579

I train music generation models. They are very trivial to detect. In fact, detecting them then training them to evade detection by the detection model is a big part of training them! But the detectors win instantly without some hardcore regularization. Simply turn that off and you've instantly got a perfect classifier.

This isn't like text classification: the signal is many orders of magnitude higher bitrate, so many more corners need to be cut. It's likely going to be nearly impossible, or at least not remotely worth it, to generate an audio signal that is truly undetectable in the foreseeable future.

fooker · 2026-04-20T18:50:26 1776711026

We are talking about entirely different things.

You are right, the output of a model that generates music directly is, for now, easy to categorize as AI.

This big flux of AI-generated music online isn't really that. It's a tiny bit of autogenerated stuff and a whole lot of automatically remixed stuff. The reason it cannot be easily classified as AI is that quite a bit of human-produced music is also that, and you'd just shut out real users.

MetaWhirledPeas · 2026-04-20T19:28:03 1776713283

> They are very trivial to detect.

Today. Trying to detect AI is like extracting water from puddles in a lake that is quickly drying up. What is the point in the short term if it's impractical in the long term? It will catch some low-hanging fruit in the best case, and will find false positives in the worst.

zaptrem · 2026-04-21T02:41:35 1776739295

My point is you should consider creating truly undetectable audio end to end with AI to be effectively impossible for the foreseeable future (i.e., I would bet money it is still trivially detectable five years from now). It won't be detectable to humans, though, only models.

8note · 2026-04-21T03:55:30 1776743730

in the broad strokes of ai generated, i wouldn't be so sure.

if the ai picked a bunch of samples, combined them together, and mastered the result using an mcp to a DAW, how is that particularly distinguishable from a person doing the same thing badly?

i can see how an llm generating pictures of spectrograms is easy to spot, but much less so with tool following.

even worse if you use a vla to have it actually play the guitar and use the recording as a sample.

there's some time and setup to make it happen, sure, but somebody could put that all in a studio and expose an mcp

zaptrem · 2026-04-21T03:57:23 1776743843

Agreed, that’s why I specified end to end (i.e., text to waveform).

nektro · 2026-04-20T23:28:20 1776727700

why would you admit so openly to being part of the problem?

deaux · 2026-04-21T05:10:57 1776748257

Why not? Even now it's still common to see people here openly admit to working at Meta. Making AI music less detectable is comparatively benign.

zaptrem · 2026-04-18T18:33:34 1776537214

What's your reasoning effort set to? Max now uses way more tokens and isn't suggested for most use cases. Even the new default (xhigh) uses more than the old default (medium).

nixpulvis · 2026-04-18T22:25:00 1776551100

That's what I'm wondering. Is it that people are defaulting to xhigh now, and that's why it feels like it's consuming a lot more tokens? If people manually set it to medium, would it be comparable?

nixpulvis · 2026-04-19T00:44:03 1776559443

Switching back to medium seems to have fixed the issue for me.

zaptrem · 2026-04-05T05:38:04 1775367484

YouTube et al's automated copyright systems put way too much trust in the hands of those making the claims.