Country: China Type: large-model
Tag: QLoRA
Website: https://github.com/lyogavin/Anima
The first open-source 33B Chinese language model based on QLoRA.
The AI community has always been very open, and AI's progress today is inseparable from many important open-source works, openly shared papers, and open data and code. We believe the future of AI will also be open, and we hope to make some contributions to the open-source community.
Why is the 33B model important? Is QLoRA a Game Changer?
Previously, most of the open-source models that could be finetuned were relatively small 7B or 13B models. Although they can perform well on some simple chatbot evaluation sets after finetuning, their limited scale means their core LLM reasoning ability remains weak, which is why many of these small models behave like toys in practical application scenarios. As discussed in this work, chatbot evaluation sets are relatively simple; on the complex logical reasoning and mathematical problems that truly test a model's ability, there is still a clear gap between small and large models.
Therefore, we believe that the work on QLoRA is very important, to the point where it could be a game changer. Through QLoRA's optimization method, a 33B-scale model can for the first time be finetuned at a democratized, low cost and put to wide use. We believe a 33B model can not only leverage the strong reasoning ability of large-scale models, but also be flexibly finetuned on private business-domain data to strengthen control over the LLM.
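To make the idea concrete, here is a minimal configuration sketch of QLoRA-style finetuning setup using the Hugging Face transformers, peft, and bitsandbytes libraries: the base model is loaded in 4-bit NF4 quantization with double quantization, then wrapped with low-rank adapters so only the small adapter weights are trained. The model path and the hyperparameter values below are illustrative placeholders, not the project's actual settings.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with double quantization, as in the QLoRA paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Placeholder path; substitute the actual 33B base model checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/33b-base-model",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters are the only trainable parameters;
# the values here are illustrative, not the project's settings.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a tiny fraction of weights train
```

Because the frozen base weights sit in 4-bit precision and gradients flow only through the adapters, a 33B model's finetuning footprint drops enough to fit on a single high-memory GPU, which is what makes this scale "democratized".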