Hardware for local inference?

droopy4096@lemmy.ca · 2 days ago

Hardware for local inference?

sobchak@programming.dev · 2 days ago

The trend I see is Mac Minis with a lot of unified memory. The people buying them are typically very well off, though. Prices for even old GPUs like 3090s are ridiculous now. I don’t think connecting 2 machines over Ethernet would work well, but putting 2 GPUs in a single machine does.

ffhein@lemmy.world · 1 hour ago

I bought a used 3090 two years ago, and back then they were usually listed for €800-1000 in my country. I thought I was lucky to find one for €700 after searching for a few months, and I don’t think they’ve ever been cheaper than this here. There are definitely fewer of them available now, but you can still buy one for €950 (and possibly even lower if you’re patient). So prices have gone up, but IMO not by ridiculous amounts like RAM.

Mika@piefed.ca · 23 hours ago

I’ve just checked the Mac Studio on the site and lmao, they first ran out of the 512 GB unified memory config and then the 256 GB one, and now they’re only selling the 96 GB version.

B0rax@feddit.org · 3 hours ago

Which is quite impressive to be honest. These machines were fucking expensive, and yet they sold out completely.

Mika@piefed.ca · 1 hour ago

512 GB would future-proof you to run any local LLM for quite a while. The speed they ran at wasn’t exactly bad either, afaik, due to the unified memory being so fast. Dunno what other setup would compete here for the price.
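
For a rough sense of what fits where, here is a back-of-the-envelope sketch in Python. It only uses the basic arithmetic that weight memory is roughly parameter count times bits per weight divided by 8. The function names, the 20% overhead allowance for KV cache and runtime buffers, and the example model sizes are all assumptions for illustration, not measurements from any real setup.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    # billions of parameters * bits per weight / 8 bits per byte = gigabytes
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float, memory_gb: float,
         overhead_fraction: float = 0.2) -> bool:
    """Crude check: weights plus an ASSUMED overhead allowance vs. available memory."""
    needed_gb = weight_memory_gb(params_billions, bits_per_weight) * (1 + overhead_fraction)
    return needed_gb <= memory_gb


if __name__ == "__main__":
    # Hypothetical model sizes (billions of parameters, bits per weight).
    models = [(70, 4), (70, 16), (405, 8)]
    # A 3090 has 24 GB of VRAM; the other figures are the configs mentioned in the thread.
    setups = {"1x 3090": 24, "2x 3090": 48, "96 GB": 96, "512 GB": 512}
    for params, bits in models:
        for name, mem in setups.items():
            verdict = "fits" if fits(params, bits, mem) else "does not fit"
            print(f"{params}B @ {bits}-bit on {name}: {verdict}")
```

Real usage depends heavily on context length, quantization format, and the runtime, so treat this as a sanity check rather than a buying guide.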