
XuanYuan - An open-source Chinese financial dialogue model

XuanYuan is an open-source Chinese financial dialogue model released for non-commercial use only. It is intended to support academic research, technological exploration, personal learning, and similar non-commercial applications, and we encourage academia, developers, and researchers to use XuanYuan to advance dialogue systems and finance. Commercial use includes, but is not limited to, using XuanYuan in products, services, consulting, or other activities tied to commercial interests.

We assume no responsibility for content generated by the XuanYuan model. Users who apply XuanYuan for non-commercial purposes bear any resulting risks themselves and should remain cautious: we recommend independently verifying the model's output and making decisions based on one's own needs and context.

Through this open-source release we hope to provide a useful tool for academia and the developer community and to promote the development of dialogue systems and financial technology. We encourage everyone to actively explore, innovate, and extend XuanYuan's potential, jointly advancing research and practice of artificial intelligence in the financial field, and we encourage users to cite XuanYuan in related work to promote knowledge exchange and the development of Chinese financial dialogue systems. We look forward to seeing more innovation and applications that improve financial services and user experience.

Tag: finance

Reading: 60 2024-11-09

ChatLaw: Chinese Legal Model

Under the wave of ChatGPT, the continued expansion of artificial intelligence has provided fertile soil for LLMs to spread. Healthcare, education, and finance have gradually developed their own models, but the legal field has seen no comparable progress. To promote open research on applying LLMs to law and other vertical fields, this project open-sources a Chinese legal model and offers a practical scheme for combining an LLM with a knowledge base in legal scenarios. The currently open-sourced ChatLaw legal models, released for academic reference, are based on Jiangziya-13B (Ziya) and Anima-33B. We construct dialogue data from a large volume of original legal texts: legal news, legal forums, statutes, judicial interpretations, legal consultations, law exam questions, and judgment documents.

The Jiangziya-13B-based model is the first version. Thanks to Ziya's strong Chinese-language ability and our strict data cleaning and data augmentation pipeline, it performs well on logically simple legal tasks but often struggles with complex legal reasoning. We subsequently added training data and built ChatLaw-33B on top of Anima-33B, which showed a marked improvement in logical reasoning; evidently, large-parameter Chinese LLMs are crucial. Our technical report is on arXiv (ChatLaw). The version trained on commercially licensable base models will serve as the internal integration version for our subsequent products and is not open-sourced. You can try the open-source version of the model here.
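For readers who want to try the open-source checkpoints, a minimal loading sketch with the HuggingFace transformers library could look like the following; the model id is a placeholder for illustration, not a repository name confirmed by this page.

    # Minimal sketch: load an open-source ChatLaw-style checkpoint and ask a legal question.
    # "your-org/ChatLaw-13B" is a hypothetical model id; substitute the real repository.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "your-org/ChatLaw-13B"  # placeholder, see the project page
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Question: If an employer does not renew an expired labor contract, is severance owed?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))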

Tag: law

Reading: 55 2024-11-09

Anima: The first open-source 33B Chinese language model based on QLoRA

The first open-source 33B Chinese language model trained with QLoRA. The AI community has always been very open: today's AI could not have developed without important open-source work, openly shared papers, and open data and code. We believe the future of AI will also be open, and we hope to contribute to the open-source community.

Why does a 33B model matter, and is QLoRA a game changer? Previously, most open-source models that could be fine-tuned were relatively small, at 7B or 13B parameters. Although they can score well on some simple chatbot evaluation sets after fine-tuning, their limited scale leaves the LLM's core reasoning ability relatively weak, which is why many of these small models behave like toys in practical application scenarios. As discussed in this work, chatbot evaluation sets are relatively simple; on the complex logical reasoning and mathematical problems that truly test a model's ability, there is still a clear gap between small and large models. We therefore consider QLoRA very important, potentially a game changer: its optimizations make it possible, for the first time, to fine-tune a 33B-scale model in a democratized, low-cost way and put it to broad use. A 33B model can both leverage the strong reasoning ability of large-scale models and still be flexibly fine-tuned on private, domain-specific business data to strengthen control over the LLM.
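To make the recipe concrete, here is a minimal QLoRA fine-tuning sketch using the HuggingFace transformers, bitsandbytes, and peft libraries: the frozen base weights are loaded in 4-bit NF4 quantization and only small LoRA adapters are trained. The model id and hyperparameters are illustrative assumptions, not Anima's published configuration.

    # QLoRA sketch: 4-bit quantized base model + trainable LoRA adapters.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                       # frozen base weights stored in 4-bit
        bnb_4bit_quant_type="nf4",               # NormalFloat4, as in the QLoRA paper
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16
    )
    model = AutoModelForCausalLM.from_pretrained(
        "your-org/base-33b",                     # hypothetical 33B base model
        quantization_config=bnb_config,
        device_map="auto",
    )
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=64, lora_alpha=16, lora_dropout=0.05,  # illustrative values
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()           # typically well under 1% of 33B

The memory saving is what makes the 33B scale reachable on modest hardware: the base model sits on the GPU in 4-bit precision while gradients flow only through the low-rank adapter matrices.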

Tag: QLoRA

Reading: 22 2024-11-09

BayLing - a large-scale language model that follows instructions

BayLing is an instruction-following large language model with human-aligned multilingual and multi-turn interaction capabilities. Large language models (LLMs) have demonstrated extraordinary abilities in language comprehension and generation; in moving from a base LLM to an instruction-following LLM, instruction fine-tuning plays a crucial role in aligning the model with human preferences. However, existing LLMs typically focus on English and perform worse in non-English languages. Improving non-English performance requires collecting language-specific training data for the base LLM and constructing language-specific instruction data for fine-tuning, both of which are costly. To minimize the manual workload, we propose transferring language generation and instruction-following ability from English to other languages through interactive translation tasks.

We have released BayLing, an instruction-following LLM based on LLaMA, fine-tuned on automatically constructed interactive translation instructions. Extensive evaluation shows that BayLing achieves performance comparable to GPT-3.5-turbo with a much smaller parameter count (only 13 billion). On translation tasks, BayLing reaches 95% of GPT-4's single-turn translation ability under automatic evaluation, and 96% of GPT-3.5-turbo's interactive translation ability under human evaluation. To evaluate performance on general tasks, we created a multi-turn instruction test set called BayLing-80; on it, BayLing achieves 89% of GPT-3.5-turbo's performance. BayLing also performs well on knowledge evaluation sets drawn from the Chinese college entrance examination (Gaokao) and the English SAT, second only to GPT-3.5-turbo among many instruction-following LLMs.

We have publicly released our training, inference, and evaluation code on GitHub, and the model weights of BayLing-7B and BayLing-13B on HuggingFace. We have built an online demo system to make the model easier for the research community to use. We have also released the BayLing-80 test set, which includes 80 two-turn instructions in both Chinese and English and can be used to comprehensively evaluate the multilingual and multi-turn interaction capabilities of LLMs. For more experiments and more detailed findings, please refer to our blog and papers.
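As an illustration of the interactive-translation idea, a two-turn training record might be flattened for causal-LM fine-tuning roughly as follows; the schema and template here are assumptions for illustration, not BayLing's published data format.

    # Sketch: a two-turn interactive-translation record and a simple flattening step.
    example = {
        "conversation": [
            {"role": "user", "content": "Translate into Chinese: The weather is nice today."},
            {"role": "assistant", "content": "今天天气很好。"},
            {"role": "user", "content": "Make the translation more formal."},
            {"role": "assistant", "content": "今日天气晴好。"},
        ]
    }

    def to_training_text(record):
        # Join the turns into one prompt/response string for fine-tuning.
        parts = []
        for turn in record["conversation"]:
            prefix = "User: " if turn["role"] == "user" else "Assistant: "
            parts.append(prefix + turn["content"])
        return "\n".join(parts)

    print(to_training_text(example))

The follow-up turn is what carries the "interactive" signal: the model learns to revise its own translation under further instructions, which is the ability the authors transfer from English to other languages.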

Reading: 24 2024-11-09

New AI makes business easier: Xiaoduo's AI e-commerce expert large model, "XiaoModel" XPT

XiaoModel XPT is a large language model developed by Xiaoduo Technology, specialized for the e-commerce vertical. It can understand and generate natural language, providing intelligent services and integrated marketing solutions for e-commerce enterprises. The XPT model has the following characteristics:

High professionalism: XPT adopts a generative pre-trained Transformer deep learning model, the same core architecture as ChatGPT, but has undergone extensive domain-knowledge training for the e-commerce industry, improving its professionalism and accuracy. Backed by the computing resources of the National Supercomputing Chengdu Center, XPT can quickly and accurately understand and respond to user needs in complex e-commerce scenarios while keeping information more secure.

High adaptability: XPT is trained on over 700 million tokens of e-commerce domain knowledge accumulated by Xiaoduo over the years, enabling it to better understand the characteristics of the e-commerce field and potential customer needs and to serve customers better. XPT can also automatically adjust its language style and content for different industries, categories, brands, user profiles, and other characteristics, generating personalized service and marketing copy that improves user satisfaction and conversion rates.

High innovation: XPT can not only handle traditional tasks such as Q&A, recommendation, and search, but also generate creative content such as product descriptions and selling points, giving e-commerce companies more inspiration and choices. As a result, XPT performs well across tasks such as assisting customer-service reception, selling-point generation, marketing follow-up, quality inspection, and intelligent after-sales analysis. Customer service staff, operations, supervisors, and decision-makers alike can use XPT to achieve an intelligent upgrade to the new e-commerce paradigm, improve customer satisfaction and performance, and experience the convenience of the AI era.

Reading: 29 2024-11-09

MiniMax Open Platform

MiniMax, named after the minimax algorithm, is an AI startup founded in December 2021 and known for being "all in" on AGI. It actively participates in the rapid development of China's artificial intelligence technology and strives to become an infrastructure builder and content-application creator for the era of general artificial intelligence. As a Chinese technology company able to integrate the text, speech, and vision modalities into a universal large-model engine connected across its entire product chain, the MiniMax team is committed to using leading general-purpose AI engine technology to improve the quality of user interaction and to apply multimodal AI across many scenarios and dimensions, promoting a paradigm shift in general artificial intelligence technology. In the year and a bit since its founding, the company has independently developed base model architectures for three modalities (text-to-visual, text-to-audio, and text-to-text) and built a compute and inference platform on top of them.

MiniMax employees reportedly nicknamed the large model "ABAB", mimicking how it sounded during early language training: like a baby that can only babble "aba aba". The model's capabilities were first tested in Glow, the company's consumer-facing virtual chat app released the previous November. In Glow, users can create AI agents with background settings and specific personalities to their own taste; through content generation and user feedback, the underlying model's capabilities iterate continuously, similar to the RLHF behind ChatGPT. Within four months of launch, Glow's user base approached 5 million.

Reading: 264 2024-11-09

Taoli (桃李, "Peach and Plum"): A Large Model for International Chinese Education

With the widespread attention on ChatGPT and the emergence of various large language models, natural language processing in the general domain has achieved great success, attracting broad interest from the international Chinese education field. International Chinese educators have begun discussing large models: can they provide language expressions appropriate to a learner's level, or give detailed answers to learners' questions, and thereby assist or even partly serve as learning partners and language teachers? However, the effectiveness of general-domain large models remains limited in vertical domains. To address these issues, we have released "Taoli" 1.0, a large model for the international Chinese education field, additionally trained on data from that field. We built an international Chinese education resource library from more than 500 textbooks and teaching aids, Chinese proficiency test questions, and Chinese learner dictionaries currently circulating in the field. We designed instructions in a variety of forms to make full use of this knowledge, constructed a high-quality international Chinese education question-answering dataset of 88,000 items, and instruction-tuned the model on the collected data, enabling it to apply international Chinese education knowledge in concrete scenarios.
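To picture what such instruction-tuning data can look like, here is a small sketch of Alpaca-style instruction records and a prompt template; the field names and examples are assumptions for illustration, not the published Taoli schema.

    # Sketch: instruction-tuning records for an international-Chinese-education model.
    records = [
        {
            "instruction": "Explain this grammar point to a beginner learner.",
            "input": "把字句: 请把门关上。",
            "output": "The 把 (ba) construction places the object before the verb to ...",
        },
        {
            "instruction": "Write a reading-comprehension question at HSK level 3.",
            "input": "",
            "output": "短文: 小王每天坐地铁上班…… 问题: 小王是怎么上班的?",
        },
    ]

    def format_example(r):
        # A common Alpaca-style template; one of many reasonable choices.
        if r["input"]:
            return (f"Instruction: {r['instruction']}\n"
                    f"Input: {r['input']}\nResponse: {r['output']}")
        return f"Instruction: {r['instruction']}\nResponse: {r['output']}"

    for r in records:
        print(format_example(r), end="\n\n")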

Reading: 52 2024-11-09
