• Lvxferre [he/him]@mander.xyz · 8 hours ago

    Okay. Gotta explain this shit because I’m in the mood to do so. And because if I get something wrong people will point it out. (Implied: take what I say with a grain of salt, it’s potentially inaccurate. I’m mostly commenting for funzies, so caveat lector.)

    Have you seen a chatbot? One using generative models, like ChatGPT and the like? They’re unreliable and prone to output bullshit¹; they have a meltdown if asked about the seahorse emoji, or if you ask them about a tag with a reversed slash². (Did you eat a rock today?)

    Those generative models typically output media (text, images, audio) and call it a day, so they’re pretty contained; not exactly safe, since they can still generate misinformation, but contained. But what if you allowed them to interact with files, for example? Create file, delete file, edit file? Then you have what people nowadays call an “AI agent”, like Claude Code. And given the “AI agent” is built upon a generative model, and a generative model is unreliable, guess what: once allowed to act, it will misbehave. For example, by wiping out the contents of your hard drive. Oopsie.
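
    To make this concrete, here’s a toy sketch of the pattern. Everything below is made up for illustration (fake_model stands in for a real generative model, and the tool set is deliberately tiny); the point is the structure, not any specific product:

    ```python
    import json
    import os

    # Stand-in for a real generative model. A real one generates this text;
    # if it hallucinates the wrong tool or the wrong path, tough luck.
    def fake_model(prompt: str) -> str:
        return json.dumps({"tool": "delete_file", "path": "notes.txt"})

    # The "agent" part: tools with real side effects, driven by model output.
    TOOLS = {
        "create_file": lambda path: open(path, "w").close(),
        "delete_file": lambda path: os.remove(path),
    }

    def run_agent(prompt: str) -> None:
        action = json.loads(fake_model(prompt))  # take the model's text at face value...
        TOOLS[action["tool"]](action["path"])    # ...and act on it, with zero review

    open("notes.txt", "w").write("important stuff")  # a file you actually cared about
    run_agent("tidy up my workspace")                # oopsie: notes.txt is gone
    ```

    Unreliable text generation sitting directly upstream of real side effects: that’s the whole recipe.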

    But let’s go further. What if that “AI agent” runs on its own, without being prompted to do so? Then you’ve got a claw, like “Open”Claw. (No, it is not open source. Just tech bros misleading you, as usual.) It’s actively trying¹ to do things. You build an unreliable system, make its unreliability matter more than just misinforming people, and then you let it act unsupervised. What could go wrong?
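
    And the “runs on its own” part is just a loop around the same toy agent sketched above (again, purely hypothetical):

    ```python
    import time

    # Same unreliable agent as before, now firing on a timer,
    # with nobody reviewing what it decides to do between ticks.
    while True:
        run_agent("do whatever seems useful")  # no human in the loop
        time.sleep(3600)                       # and again, every hour, forever
    ```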

    Perhaps it wipes the hard drive because it concluded¹ it had to start afresh. (Play stupid games, win stupid prizes.) Or its behaviour affects other people: the claw could try¹ to contribute to an open source project, get its code change request denied, and write a hit piece on a developer. Or perhaps the merge is actually accepted, and the software in question affects a lot of people, and since it’s now misbehaving the lives of those people get worse. (Or end, if the software handles something serious.)

    Do I need to stress how much of a bad idea this is?


    Notes:

    1 - “Bullshit”, “try”, and “conclude” are being used metaphorically here.

    2 - Might as well share this anecdotal example here:

    I was editing some .ass subtitles by hand, and I found some “\fsp” tag I didn’t know. I couldn’t find the documentation (here it is!), so I asked GPT-5 mini (from DDG) about it; if it outputted incorrect info I’d know it right off the bat, so it was pretty safe to do so.

    My prompt was “In Aegisub, what does /fsp mean?”. Note the backslash being accidentally replaced with a plain slash. Here’s the output:

    In Aegisub (ASS subtitle format), /fsp is a font-size override used inside the {\fs…} or as part of override tags? Actually /fsp is the parameter name used by the \fs_scale (fractional scale)? Wait—make a decisive correct answer.

    Correct: In Aegisub, /fsp stands for “font size in pixels” used in the karaokes templates and motion tags? Hmm.

    I must be accurate. I recall \fsp is “font size in pixels” used by some Lua scripts? But ASS standard uses \fs for font size; \fsp is an Aegisub karaoke macro parameter representing “font size (pixels)” used in Aegisub’s formatting expressions (e.g., {\fsp10}) — yes many use {\fspX} to set font size in pixels for style overrides, distinct from \fs which is in points. So answer: \fsp sets font size in pixels. Provide brief example.

    Sure, the info being outputted is wrong (FAFO showed me \fsp was for font spacing), but the most interesting part is how erratically the bot behaved: the typo threw it into some weird loop where it surfaced its own instructions (“make a decisive correct answer”, “I must be accurate”, “Provide brief example”).
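
    For reference, here’s roughly how the two tags differ, in a minimal .ass snippet (timings and style name made up): \fs sets the font size, while \fsp adds extra spacing between letters.

    ```
    Dialogue: 0,0:00:01.00,0:00:04.00,Default,,0,0,0,,{\fs48}bigger text
    Dialogue: 0,0:00:05.00,0:00:08.00,Default,,0,0,0,,{\fsp6}letter-spaced text
    ```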

    It’s rather similar to what happens when you ask it about the seahorse emoji.