• april@lemmy.world
    English
    arrow-up
    50
    arrow-down
    2
    ·
    5 months ago

    For LLMs, only the GPU matters, and primarily its VRAM. So this wouldn’t help at all.

    • mozz@mbin.grits.dev
      arrow-up
      13
      arrow-down
      3
      ·
      edit-2
      5 months ago

      You’re the only one talking sense and you are sitting here with your 2 upvotes

      The AI company business model is 100% unsustainable. It’s hard to say when they will get sick of hemorrhaging money by giving away this stuff more or less for free, but it might be soon. That’s totally separate from any legal issues that might come up. If you care about this stuff, learning about doing it locally and having a self hosted solution in place might not be a bad idea.

      But upgrading anything aside from your GPU+VRAM is a pure and unfettered waste of money in that endeavor.

    • Time@sh.itjust.worksOP
      English
      arrow-up
      4
      arrow-down
      2
      ·
      edit-2
      5 months ago

      Don’t you need tons of RAM to run LLMs? I thought the newer models needed up to 64GB RAM? Also, what about Stable Diffusion?

      • Pumpkin Escobar@lemmy.world
        English
        arrow-up
        6
        ·
        5 months ago

        Taking Ollama for instance, either the whole model runs in VRAM and compute is done on the GPU, or it runs in system RAM and compute is done on the CPU. Running models on CPU is horribly slow. You won’t want to do it for large models.

        LM Studio and others allow you to run part of the model on the GPU and part on the CPU, splitting the memory requirements, but it’s still pretty slow.

        Even the smaller 7B parameter models run pretty slowly on CPU, and the huge models are orders of magnitude slower.

        So technically more system RAM will let you run some larger models, but you will quickly figure out you just don’t want to do it.
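
        For the GPU/CPU split mentioned above, here is a rough sketch of how many layers might fit on the GPU. It assumes all layers are the same size and sets aside a made-up 1 GB reserve for everything else in VRAM; the function name and all numbers are illustrative, not taken from LM Studio or any other tool.

```python
def layers_on_gpu(model_gb: float, n_layers: int, free_vram_gb: float,
                  reserve_gb: float = 1.0) -> int:
    """Estimate how many of n_layers fit in VRAM.

    Assumptions (hypothetical): every layer is the same size, and
    reserve_gb of VRAM is kept free for cache and other buffers.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(free_vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # A ~35 GB model with 80 layers on a 24 GB card: only part of it fits.
    print(layers_on_gpu(35.0, 80, 24.0))
    # A ~4 GB model with 32 layers on an 8 GB card: everything fits.
    print(layers_on_gpu(4.0, 32, 8.0))
```

        Whatever doesn’t fit stays in system RAM and runs on the CPU, which is where the slowdown comes from. The result is the kind of number you would feed to a layer-offload setting such as llama.cpp’s `--n-gpu-layers`.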

      • Findmysec@infosec.pub
        English
        arrow-up
        2
        ·
        5 months ago

        They do, but it has to be VRAM. Unfortunately, the cards that have that much memory are used by OEMs/corporations and are insanely pricey.

      • april@lemmy.world
        English
        arrow-up
        9
        arrow-down
        1
        ·
        edit-2
        5 months ago

        RAM is important, but it has to be VRAM, not system RAM.

        Only Macs can use the system RAM for this, because they have an integrated GPU that shares memory with the CPU rather than a dedicated one.

        Stable Diffusion is in the same situation.
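
        To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch: weight size is about parameter count times bytes per weight. The 1.5 GB overhead for cache and activations is an assumption picked for illustration, and real usage varies with context length and runtime.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """True if weights plus an assumed fixed overhead fit in vram_gb.

    overhead_gb is a hypothetical allowance, not a measured value.
    """
    needed_gb = estimate_weights_gb(params_billions, bits_per_weight) + overhead_gb
    return needed_gb <= vram_gb


if __name__ == "__main__":
    for name, params in [("7B", 7), ("70B", 70)]:
        for bits in (16, 4):
            size = estimate_weights_gb(params, bits)
            ok = fits_in_vram(params, bits, vram_gb=24)
            print(f"{name} at {bits}-bit: ~{size:.1f} GB, fits in 24 GB: {ok}")
```

        By this estimate a 7B model fits on a 24 GB card even at 16-bit, while a 70B model does not fit even at 4-bit.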

    • cybersandwich@lemmy.world
      English
      arrow-up
      4
      arrow-down
      1
      ·
      edit-2
      5 months ago

      A GPU with a ton of VRAM is what you need, BUT

      An alternate solution is something like a Mac mini with an M-series chip and 16GB of unified memory. The neural cores on Apple silicon are actually pretty impressive, and since they use unified memory the models would have access to whatever the system has.

      I only mention it because a Mac mini might be cheaper than a GPU with tons of VRAM by a couple hundred bucks.

      And it will sip power comparatively.

      A 4090 with 24GB of VRAM is $1900. An M2 Mac mini with 24GB is $1000.

      • L_Acacia@lemmy.one
        English
        arrow-up
        3
        ·
        5 months ago

        Buying a second-hand 3090 or 7900 XTX will be cheaper and give better performance if you are not building the rest of the machine.

    • enkers@sh.itjust.works
      English
      arrow-up
      1
      ·
      edit-2
      5 months ago

      One minor caveat where CPU could matter is AVX support. I couldn’t get ollama to run well on my system, despite having a decent GPU, because I’m using an ancient processor.
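
      If you want to check this on your own box, here is a minimal sketch that looks for AVX flags. It assumes x86 Linux, where `/proc/cpuinfo` has a `flags` line; the helper names are mine, and other platforms need a different approach.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx_support(cpuinfo_text: str) -> dict[str, bool]:
    """Report which AVX variants appear in the CPU flags."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("avx", "avx2", "avx512f")}


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # assumption: Linux on x86
    if path.exists():
        print(avx_support(path.read_text()))
    else:
        print("No /proc/cpuinfo here; this sketch only covers Linux.")
```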