
Meet Baichuan-13B: China's open source large language model to rival OpenAI

Wang Xiaochuan, the founder of Chinese search engine Sogou, has released a new large language model called Baichuan-13B through his company, Baichuan Intelligence. Commercial use by programmers and researchers currently requires official permission. Wang recently posted on Weibo that “China needs its own OpenAI,” and the launch of Baichuan-13B, the start-up's next-generation large language model, brings him a step closer to that vision. Baichuan was founded three months ago and quickly attracted a group of investors ready to contribute $50 million. Thanks to its founder’s strong technical background, the company is now considered one of China’s most promising developers of large language models.

Baichuan-13B follows the same Transformer architecture as GPT and most of its Chinese counterparts. It has 13 billion parameters (the variables a model uses to generate and analyze text) and is bilingual, having been trained on both Chinese and English data. The model is open source, can be used commercially, and is hosted on GitHub.

Following Baichuan-7B, Baichuan Intelligent Technology created Baichuan-13B, an open-source, commercially usable large language model with 13 billion parameters. On standard Chinese and English benchmarks, it outperforms competitors of similar size. This release includes both a base version (Baichuan-13B-Base) and an aligned version (Baichuan-13B-Chat).

Features

  • Baichuan-13B builds on Baichuan-7B by increasing the number of parameters to 13 billion, and it was trained on 1.4 trillion tokens of high-quality corpora, 40% more than LLaMA-13B. Among open-source models of the 13B size, it currently has the most training data. It uses ALiBi positional encoding, has a context window of 4,096 tokens, and works in Chinese and English.
  • The pre-trained model serves as a “base” for developers, while regular users are more interested in a model aligned for dialog. The aligned model (Baichuan-13B-Chat) is therefore included in this open source release. It offers strong dialog capabilities, is ready to use, and requires only a few lines of code to deploy.
  • To encourage widespread adoption, the researchers also offer int8 and int4 quantized versions, which are more efficient for inference. These can be deployed on consumer graphics cards such as the Nvidia 3090, whereas the unquantized version requires much more powerful hardware.
  • Open source and free to use: developers who request and obtain an official commercial license by email can use Baichuan-13B for commercial purposes free of charge.
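To make the ALiBi positional encoding mentioned in the feature list more concrete, here is a minimal sketch in plain Python. ALiBi replaces learned position embeddings with a fixed penalty added to each attention score, proportional to the distance between the query and the key, with a different slope per attention head. The function names are hypothetical, the eight-head setup is chosen only for illustration, and the sketch covers only power-of-two head counts. It is not Baichuan's actual implementation.

```python
import math

def alibi_slopes(num_heads):
    """Per-head slopes as a geometric sequence (power-of-two head counts only).

    For 8 heads this gives 1/2, 1/4, ..., 1/256.
    """
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (head + 1) for head in range(num_heads)]

def alibi_bias(slope, seq_len):
    """Causal bias matrix for one head.

    Entry [q][k] is -slope * (q - k) for keys at or before the query,
    and -inf for future keys, which masks them out after softmax.
    """
    return [
        [-slope * (q - k) if k <= q else -math.inf for k in range(seq_len)]
        for q in range(seq_len)
    ]

# Illustrative usage: the bias that would be added to the raw attention
# scores of the first of eight heads, for a sequence of four tokens.
slopes = alibi_slopes(8)
bias_head_0 = alibi_bias(slopes[0], seq_len=4)
```

Because the penalty depends only on relative distance, the same bias pattern applies at any position in the sequence, so no position embedding has to be learned.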

Baichuan-13B was trained on about 1.4 trillion tokens. By comparison, OpenAI's GPT-3 was reportedly trained on 300 billion tokens. The Baichuan team doubled in size in three months, reaching fifty members, and last month it publicly released its first model, Baichuan-7B, which has seven billion parameters. Baichuan-13B, released two days ago, is the base model. It is now offered free of charge to researchers and programmers who have obtained official permission to use it for commercial purposes. Plans for an official version of the model aimed at the general public have not yet been announced.
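The training-data figures quoted in this article can be cross-checked with simple arithmetic. The sketch below uses the 1.4 trillion tokens given for Baichuan-13B in the feature list and the 300 billion tokens attributed to GPT-3. The 1.0 trillion tokens for LLaMA-13B is an assumption taken from the LLaMA paper, not from this article.

```python
BAICHUAN_13B_TOKENS = 1.4e12  # from the feature list in this article
GPT3_TOKENS = 300e9           # as reported in this article
LLAMA_13B_TOKENS = 1.0e12     # assumption: figure from the LLaMA paper

def percent_more(a, b):
    """How many percent larger a is than b."""
    return (a / b - 1.0) * 100.0

def ratio(a, b):
    """How many times larger a is than b."""
    return a / b

print(f"vs LLaMA-13B: {percent_more(BAICHUAN_13B_TOKENS, LLAMA_13B_TOKENS):.0f}% more tokens")
print(f"vs GPT-3: {ratio(BAICHUAN_13B_TOKENS, GPT3_TOKENS):.1f}x as many tokens")
```

Under these assumptions, the "40% more than LLaMA-13B" claim is consistent with a 1.4 trillion token corpus, not with a 1.4 billion one.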

The base Baichuan-13B model is now freely available to researchers and programmers who have obtained official permission for commercial use. In light of recent US restrictions on exports of AI chips to China, it is particularly noteworthy that variants of this model can run on consumer hardware such as Nvidia’s 3090 graphics cards.
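A rough back-of-the-envelope calculation shows why quantization matters on a card like the 3090, which has 24 GB of memory. The sketch below counts only the memory needed to hold the weights and ignores activations, the attention cache, and any quantization overhead, so real requirements are higher. It assumes a nominal 13 billion parameters and that the unquantized model is stored in 16-bit floating point. Both are assumptions, and the function names are illustrative.

```python
NUM_PARAMS = 13e9  # nominal parameter count (assumption: exactly 13 billion)

# Bytes needed to store one parameter at each precision.
# Assumption: the unquantized model uses 16-bit floats.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

RTX_3090_VRAM_GIB = 24.0  # memory on an Nvidia RTX 3090

def weight_memory_gib(num_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30

def fits_on_card(precision, vram_gib=RTX_3090_VRAM_GIB):
    """True if the weights alone fit in the given amount of memory."""
    return weight_memory_gib(NUM_PARAMS, BYTES_PER_PARAM[precision]) < vram_gib

for precision in BYTES_PER_PARAM:
    size = weight_memory_gib(NUM_PARAMS, BYTES_PER_PARAM[precision])
    print(f"{precision}: {size:.1f} GiB of weights, fits on a 3090: {fits_on_card(precision)}")
```

Even this lower bound puts the 16-bit weights slightly above what a single 3090 can hold, while the int8 and int4 variants leave room to spare for inference.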

Baichuan Intelligent Technology researchers state that their group has not yet developed any Baichuan-13B-based applications for any platform, including iOS, Android, or the web. Users are asked not to use the Baichuan-13B model for illegal or harmful purposes, such as endangering national or social security. Users are also asked not to use the model for Internet services without the necessary security reviews and filings. The researchers hope that all users will abide by these rules so that technological progress stays within the bounds of the law.




Dhanshree Shenwai is a Computer Engineer with good experience in FinTech companies spanning Finance, Cards & Payments and Banking with keen interest in AI applications. She is enthusiastic about exploring new technologies and advancements in today’s changing world, making everyone’s life easier.

