How to self-host an LLM?

juli@programming.dev · 1 year ago

How to self-host an LLM?

DontNoodles@discuss.tchncs.de · 1 year ago

I’ve heard good things about H2O AI if you want to self-host and tweak the model by uploading documents of your own (so that you get answers based on your dataset). I’m not sure how difficult it is. Maybe someone more knowledgeable will chime in.

Jvrava9@lemmy.dbzer0.com · 1 year ago

Maybe Serge would fit your use case.

das@lemellem.dasonic.xyz · 1 year ago

Serge is probably the easiest way to get a basic setup. If you just want to download a model and chat, I recommend it.

c10l@lemmy.world · 1 year ago

It’s pretty easy with Ollama. Install it, then run ollama run mistral (or another model; there are a few available out of the box). https://ollama.ai/
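Once a model is pulled, Ollama also serves a local HTTP API, so you can script it instead of using the interactive chat. Here is a minimal sketch, assuming the server is running on its default address (localhost:11434) and that the model is named mistral; the helper names build_request and ask are made up for illustration.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumption: not reconfigured).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral"):
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="mistral"):
    """Send the prompt to the running server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like print(ask("Explain self-hosting in one sentence.")) should print the reply. Setting "stream" to False asks for a single JSON object back rather than a stream of partial results, which keeps the parsing simple.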

Another option is Llamafile. https://github.com/Mozilla-Ocho/llamafile

hazeebabee@slrpnk.net · 1 year ago

Sounds like a really cool project. Sadly, I don’t have much knowledge to contribute. Still, what kind of issues have you run into? Any specific errors or problems?

das@lemellem.dasonic.xyz · 1 year ago

If you want to be able to get into the nitty gritty or play with options besides just a chat, I recommend Text Generation WebUI.

Installing is pretty easy, then you just download your desired model from Hugging Face.
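If you would rather fetch the model file from a script than through the browser, files on Hugging Face can be downloaded directly via the repo's "resolve" URL. This is a rough sketch using only the standard library; hf_file_url and download are hypothetical helper names, and the repo and file in the usage note are just examples, so substitute whichever model you actually want.

```python
import shutil
import urllib.parse
import urllib.request


def hf_file_url(repo_id, filename, revision="main"):
    """Return the direct-download URL for one file in a Hugging Face model repo."""
    return "https://huggingface.co/{}/resolve/{}/{}".format(
        repo_id,
        urllib.parse.quote(revision, safe=""),
        urllib.parse.quote(filename),
    )


def download(repo_id, filename, dest):
    """Stream the file to dest; model files are often several gigabytes."""
    url = hf_file_url(repo_id, filename)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)
```

For example, download("TheBloke/Mistral-7B-Instruct-v0.2-GGUF", "mistral-7b-instruct-v0.2.Q4_K_M.gguf", "model.gguf") would save that file locally; you would then move it into whatever models folder your install of the web UI expects (check its README, since the location is an assumption here). Gated or private repos need an access token, which this sketch does not handle.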

Or if you want to use it for roleplay or adventure style games, KoboldCPP is easy to set up.

Sims@lemmy.ml · 1 year ago

If you’re low on hardware, look into Petals or KoboldAI Horde. Both share models in a p2p fashion, afaik.

Petals, at least, lets you create private networks, so you could host part of a model on your 24/7 server, part on your laptop CPU, and the rest on your laptop GPU, for example.
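To make the idea of splitting a model across machines concrete: a transformer is a stack of blocks, and each device can hold a contiguous slice of them. The sketch below is not Petals' API, just plain Python showing one way to divide blocks in proportion to each device's capacity; split_blocks, the device names, and the capacity numbers are all made up for illustration.

```python
def split_blocks(num_blocks, capacities):
    """Assign contiguous ranges of transformer blocks to devices.

    capacities maps a device name to its relative capacity (for example,
    GB of free memory). Returns a dict of name -> (start, end), end exclusive,
    with shares roughly proportional to capacity.
    """
    total = sum(capacities.values())
    ranges = {}
    start = 0
    used = 0.0
    for name, cap in capacities.items():
        used += cap
        # Cumulative rounding keeps the ranges contiguous and covers every block.
        end = round(num_blocks * used / total)
        ranges[name] = (start, end)
        start = end
    return ranges


plan = split_blocks(32, {"server": 16, "laptop_cpu": 8, "laptop_gpu": 8})
print(plan)
# {'server': (0, 16), 'laptop_cpu': (16, 24), 'laptop_gpu': (24, 32)}
```

In a real setup the slowest device tends to limit throughput, so you may want to weight capacities by speed as well as memory.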

Haven’t tried tho, so good luck ;)