Fengshenbang (Fengshenbang Large Model)

"Fengshenbang" is a long-term open-source project maintained by a team of engineers, researchers, and interns from the Cognitive Computing and Natural Language Center of the International Digital Economy Academy (IDEA) in the Guangdong-Hong Kong-Macau Greater Bay Area. The project aims to re-examine the open-source ecosystem of Chinese pre-trained large models, advance the development of the Chinese large-model community as a whole, and become the infrastructure of Chinese cognitive intelligence.

Ziya General Large Model V1 is a large-scale pre-trained model based on LLaMA with 13 billion parameters, capable of translation, programming, text classification, information extraction, summarization, copywriting, common-sense Q&A, and mathematical calculation. The Ziya general model has completed a three-stage training process: large-scale pre-training (PT), multi-task supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). It can support human-machine collaboration in application scenarios such as digital humans, copywriting, chatbots, business assistants, Q&A, and code generation, improving work and production efficiency.

Fengshenbang is the largest open-source system of Chinese pre-trained models, with more than 98 pre-trained models released to date, including the first open-source Chinese Stable Diffusion and CLIP models. Models such as Erlangshen-UniMC have won multiple first places on leaderboards such as FewCLUE and ZeroCLUE. The goal is to distill data and computing power into pre-trained models with cognitive abilities, providing a solid foundation for massive downstream tasks and algorithmic research.
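The three-stage training process described above (PT, then SFT, then RLHF) can be sketched schematically as a pipeline. This is purely illustrative: every function name here is a hypothetical placeholder, not the real Fengshenbang code or API.

```python
# Schematic of Ziya's three training stages. All names are hypothetical
# placeholders for illustration; the real training code is not shown here.

def pretrain(model, raw_corpus):
    """Stage 1 (PT): large-scale next-token pre-training on raw text."""
    return model + ["PT"]

def supervised_finetune(model, instruction_pairs):
    """Stage 2 (SFT): multi-task fine-tuning on (instruction, answer) pairs."""
    return model + ["SFT"]

def rlhf(model, preference_data):
    """Stage 3 (RLHF): align the model using human preference feedback,
    e.g. via a reward model plus policy optimization."""
    return model + ["RLHF"]

# The stages are applied strictly in order: PT -> SFT -> RLHF.
stages = rlhf(supervised_finetune(pretrain([], None), None), None)
print(stages)  # ['PT', 'SFT', 'RLHF']
```

The ordering matters: SFT assumes a base model with broad language ability from PT, and RLHF assumes a model that already follows instructions from SFT.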
The GTS model production platform focuses on natural language processing and serves business scenarios such as intelligent customer service, semantic data analysis, and recommendation systems. It supports tasks such as e-commerce review sentiment analysis, scientific literature subject classification, news classification, and content moderation. Under the GTS training system, only a small number of training samples are required, and no knowledge of AI model training is needed, to obtain a lightweight model that can be deployed directly.

Tag: IDEA CCNL

Reading: 42 2023-07-22

iFlytek Spark Cognitive Large Model

The new-generation cognitive intelligence model launched by iFlytek has cross-domain knowledge and language comprehension abilities, and can understand and execute tasks through natural dialogue. It evolves continuously from massive data and knowledge, closing the loop from problem proposal and planning to problem solving. Its main capabilities:

Language comprehension
- Machine translation: translate text into multiple languages, including common languages such as English, Chinese, French, German, and Spanish
- Text summarization: extract concise, accurate summaries to quickly grasp the core points of an article
- Grammar checking: detect grammar errors and suggest corrections to make writing more standard and professional
- Sentiment analysis: analyze the emotional tone of a text (positive, negative, or neutral) to better understand its content, viewpoints, and attitudes

Knowledge Q&A
- Everyday common sense: knowledge about daily life, such as advice on diet, exercise, and travel
- Work skills: advice on work-related topics such as communication, time management, and team collaboration
- Medical knowledge: basic health knowledge and advice on disease prevention, diagnosis, and treatment
- History and humanities: material on historical events, cultural heritage, celebrity stories, famous quotes, and proverbs

Logical reasoning
- Deductive reasoning: derive answers or solutions by analyzing the premises and assumptions of a problem, offering new ideas and insights
- Scientific reasoning: basic research tasks such as inference, prediction, and validation from existing data and information
- Common-sense reasoning: apply existing common-sense knowledge to analyze, explain, and respond to users' questions and needs in dialogue

Mathematical problem solving
- Equation solving: including linear, quadratic, and cubic equations
- Geometry: plane geometry (properties of lines, circles, triangles, etc.) and solid geometry (volume, surface area, projection, etc.)
- Calculus: problems involving derivatives and integrals, covering basic concepts such as limits, continuity, and differentiation
- Probability and statistics: random variables, probability distributions, hypothesis testing, and related topics

Code understanding and writing
- Code understanding: help users understand the vast majority of programming languages, algorithms, and data structures, quickly providing the answers they need
- Code modification: modify or optimize existing code, offer suggestions and guidance, and identify potential issues and their solutions
- Code writing: help users quickly write simple code snippets such as functions, classes, or loops
- Programming assistance: documentation and tools for programming languages, such as syntax rules, function libraries, and code autocompletion

Reading: 40 2023-07-22

New AI makes your business easier: Xiaoduo AI's e-commerce expert large model, "XiaoModel" XPT

XiaoModel XPT is a large language model developed by Xiaoduo Technology for the e-commerce vertical. It can understand and generate natural language, and provides intelligent services and integrated marketing solutions for e-commerce enterprises. The XPT model has the following characteristics:

High professionalism: XPT adopts a generative pre-trained Transformer, the same core architecture as ChatGPT, but has undergone extensive domain-knowledge training for the e-commerce industry, improving its professionalism and accuracy. Backed by the computing resources of the National Supercomputing Chengdu Center, XPT can quickly and accurately understand and respond to user needs in complex e-commerce scenarios while keeping information more secure.

High adaptability: XPT is trained on Xiaoduo's accumulated e-commerce domain corpus of over 700 million tokens, enabling it to better understand the characteristics of the e-commerce field and potential customer needs, and to serve customers better. XPT can also automatically adjust its language style and content to different industries, categories, brands, user profiles, and other characteristics, generating personalized service and marketing copy that improves user satisfaction and conversion rates.

High innovation: XPT can not only handle traditional tasks such as Q&A, recommendation, and search, but also generate creative content such as product descriptions and selling points, giving e-commerce companies more inspiration and choices. As a result, XiaoModel performs well across tasks such as assisted customer-service reception, selling-point generation, marketing follow-up, quality inspection, and intelligent after-sales analysis.

Whether for customer service agents, operations staff, supervisors, or decision-makers, XiaoModel makes it easy to upgrade to the new intelligent paradigm of e-commerce, improving customer satisfaction and business performance, and bringing the convenience of the AI era within reach.

Reading: 29 2023-07-23

Anima: The first open-source 33B Chinese language model based on QLoRA

Anima is the first open-source 33B Chinese language model trained with QLoRA. The AI community has always been very open, and today's progress in AI would not be possible without many important open-source projects, openly shared papers, and open data and code. We believe the future of AI will also be open, and we hope to contribute to the open-source community.

Why does the 33B scale matter? Is QLoRA a game changer? Previously, most open-source models that could be fine-tuned were relatively small, at 7B or 13B parameters. Although they can perform well on some simple chatbot evaluation sets, their limited scale means the core reasoning ability of the LLM is relatively weak, which is why many of these small models behave like toys in practical application scenarios. As discussed in this work, chatbot evaluation sets are relatively simple, and there is still a clear gap between small and large models on the complex logical reasoning and mathematical problems that truly test a model's ability.

We therefore believe QLoRA is very important work, potentially a game changer: its optimizations make it possible, for the first time, to fine-tune a 33B-scale model at a democratized, low cost and put it to wide use. A 33B model can both leverage the strong reasoning ability of larger-scale models and be flexibly fine-tuned on private, domain-specific business data to strengthen control over the LLM.
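A rough back-of-the-envelope estimate shows why QLoRA makes 33B fine-tuning affordable: quantizing the frozen base weights to 4 bits shrinks them enough to fit on a single GPU. The numbers below are illustrative only; a real QLoRA run also needs memory for the LoRA adapters, their optimizer state, activations, and NF4 quantization constants.

```python
# Rough estimate of base-weight memory for a 33B-parameter model.
# Illustrative only: real QLoRA runs additionally hold LoRA adapter
# weights, optimizer state, activations, and quantization constants.

def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Memory for the model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

n = 33e9  # 33 billion parameters

fp16 = weight_memory_gb(n, 16)  # full-precision fine-tuning baseline
nf4 = weight_memory_gb(n, 4)    # QLoRA's 4-bit quantized frozen base weights

print(f"fp16 weights:  {fp16:.1f} GB")  # 66.0 GB: exceeds a single 48 GB GPU
print(f"4-bit weights: {nf4:.1f} GB")   # 16.5 GB: fits, with room for adapters
```

Because the 4-bit base model is frozen and only small low-rank adapters are trained, the optimizer state is also tiny compared with full fine-tuning, which is where most of the cost savings come from.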

Tag: QLoRA

Reading: 21 2023-07-23
