Most users still believe you need an NVIDIA RTX 3090 to run a decent 13B model. That is false.
If you want to run this model today using the latest version of llama.cpp, LM Studio, or Ollama, you should convert the old .bin file to the current GGUF format.
You may need an older commit of the nomic-ai/gpt4all repository that still supports the .bin format.
Have you created or used a repacked LoRA quantized model? Let me know in the comments or find me on the GPT4All Discord.