• 1 Post
  • 44 Comments
Joined 7 months ago
Cake day: July 4th, 2024

  • So… as far as I understand from this thread, it’s basically a finished model (Llama or Qwen) that is then fine-tuned on an unknown dataset? That’d explain the claimed $6M training cost, which hides the fact that the heavy lifting was done by others (US of A’s Meta in this case). Nothing revolutionary to see here, I guess. Small improvements are nice to have, though. I wonder how their smallest models perform. Are they any better than llama3.2:8b?

  • Meanwhile, MS Excel: This file with random macros from a shady website could gain admin rights, install 3500 viruses, lock you out, join a botnet, and put a million-dollar ransom on your PC, all within the first minute after opening and without you even noticing. Please click OK if you are fine with that.