CaptainExplosion said:
Is compression a processing power improvement?
Hmm. I'm not intimately familiar with LLMs, but maybe. I've often heard that e.g. 8 GB of VRAM isn't great for running LLMs locally, but with compression, that VRAM might go a lot further. I'm guessing the main bottlenecks are elsewhere, but perhaps improved compression techniques could also help. I'm really not qualified to say more than that. AI itself suggests my line of thinking is correct, but as is often the case, it sounds like it's quite a bit more nuanced than that.
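For a rough sense of scale, here's a back-of-the-envelope sketch. It assumes "compression" here means quantization (storing each weight in fewer bits), and it uses a hypothetical 7-billion-parameter model as the example. It only counts the weights themselves, so real usage would be higher once the context cache and runtime overhead are included.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical model size, chosen only for illustration.
PARAMS = 7e9

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

By this estimate the 16-bit weights alone wouldn't fit in 8 GB, while the 4-bit version would leave room to spare. That's a memory saving rather than a direct processing power gain, which is probably part of the nuance.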







