Do you host your own ML / AI / LLM? What do you use, and what do you use it for?

  • Domi@lemmy.secnd.me
    23 hours ago

    > Curious about the quant tho.

    Q8 from unsloth.

    > Something like Qwen3.5-122b

    My go-to model for knowledge. Definitely much faster at Q5, but it lacks the tool calling quality of the Qwen3.6 models. Really hoping we see a Qwen3.6-122b soon…

    • robber@lemmy.ml
      10 hours ago

      In case you missed the Ornith 1.0 release (Qwen and Gemma RL finetunes for agentic / coding workloads), they look interesting for bridging the gap until we see larger 3.6 models or a 3.7 release. I haven’t tested them yet, but according to the benchmarks, the 35b MoE seems to be more or less on par with Qwen3.6 27b dense, while ofc being a lot faster.