Curious to hear from those who are sticking with bare metal. I'd like to better understand what keeps admins from migrating to containers (Docker, Podman), virtual machines, etc. What keeps you on bare metal in 2025?

  • brucethemoose@lemmy.world
    10 hours ago

    In my case it’s performance and sheer RAM need.

    GLM 4.5 needs about 112GB of RAM plus absolutely every megabyte of VRAM on the GPU, at least unless I quantize it so aggressively that it stops being usable. I’m already swapping a tiny bit and simply cannot afford the overhead.

    I think containers may slow down CPU<->GPU transfers slightly, but don’t quote me on that.

    • kiol@lemmy.worldOP
      11 hours ago

      Can anyone confirm whether containers would actually impact CPU-to-GPU transfers?
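
One way to check this empirically would be to time the same host-to-device copy on bare metal and again inside the container with the GPU passed through, then compare. The following is only a minimal sketch, assuming PyTorch with CUDA is installed; the helper name `measure_bandwidth_gbps`, the 1 GiB buffer size, and the repeat count are illustrative choices, not anything from this thread.

```python
import time


def measure_bandwidth_gbps(copy_fn, nbytes, repeats=5):
    """Best observed throughput in GB/s for copy_fn, which moves nbytes per call."""
    copy_fn()  # warm-up call so one-time setup cost is not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        copy_fn()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return nbytes / max(best, 1e-12) / 1e9


if __name__ == "__main__":
    try:
        import torch  # assumption: PyTorch is available in both environments
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        nbytes = 1 << 30  # 1 GiB test buffer (arbitrary size for illustration)
        # Pinned (page-locked) host memory allows asynchronous DMA transfers.
        host = torch.empty(nbytes, dtype=torch.uint8).pin_memory()

        def host_to_device():
            host.to("cuda", non_blocking=True)
            torch.cuda.synchronize()  # wait until the copy has really finished

        rate = measure_bandwidth_gbps(host_to_device, nbytes)
        print(f"host -> device: {rate:.1f} GB/s")
    else:
        print("PyTorch with CUDA not available; nothing measured.")
```

If the two numbers land within run-to-run noise, the container is probably not the bottleneck for this workload. A single buffer copy will not capture every transfer pattern, so treat the result as a hint rather than proof.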

      • brucethemoose@lemmy.world
        11 hours ago

        To be clear, VMs absolutely have overhead; the open question is Docker/Podman, where it might be negligible.

        And this is a particularly weird scenario (since prompt processing literally has to shuffle ~112GB over the PCIe bus for each batch). Most GPGPU apps aren’t so sensitive to transfer speed/latency.
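
For a rough sense of scale, here is a back-of-envelope sketch of how long moving ~112GB per batch would take at theoretical PCIe x16 rates. The ~112GB figure is taken from the comment above; which PCIe generation is in play is an assumption, and real-world throughput sits below these theoretical peaks, so the times are lower bounds.

```python
# Theoretical one-direction bandwidth of a PCIe x16 link in GB/s
# (per-lane transfer rate with 128b/130b encoding, before protocol overhead).
PCIE_X16_GBPS = {"3.0": 15.75, "4.0": 31.5, "5.0": 63.0}

# Data moved per batch, in GB (figure quoted in the comment above).
BATCH_GB = 112


def transfer_seconds(size_gb, bandwidth_gbps):
    """Minimum time to move size_gb over a link with the given bandwidth."""
    return size_gb / bandwidth_gbps


for gen, bandwidth in PCIE_X16_GBPS.items():
    seconds = transfer_seconds(BATCH_GB, bandwidth)
    print(f"PCIe {gen} x16: at least {seconds:.1f} s per batch")
# PCIe 3.0 x16: at least 7.1 s per batch
# PCIe 4.0 x16: at least 3.6 s per batch
# PCIe 5.0 x16: at least 1.8 s per batch
```

With each batch costing seconds of pure transfer time, even a small percentage slowdown on the bus would be noticeable, which is why this workload is more sensitive than most.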