Alibaba unveils open-sourced AI models to rival Meta's Llama 2
Alibaba Cloud, the digital technology and intelligence backbone of Alibaba Group, has rolled out two open-source AI models: the pre-trained 7-billion-parameter model Qwen-7B and its conversationally fine-tuned version, Qwen-7B-Chat, to compete with Meta's open-source model Llama 2.
In an effort to democratise AI technologies, Alibaba Cloud will make the two new models' code, model weights and documentation freely accessible to academics, researchers and commercial institutions worldwide. Commercial use is free for companies with fewer than 100 million monthly active users; programmes with more users can request a license from Alibaba Cloud.
Qwen-7B was pre-trained on more than two trillion tokens, including Chinese, English and other multilingual materials, code and mathematics, covering general and professional fields. It supports a context length of 8K tokens.
Qwen-7B-Chat was aligned with human instructions during training. Both Qwen-7B and Qwen-7B-Chat can be deployed on cloud or on-premises infrastructure, enabling users to fine-tune the models and build their own high-quality generative models effectively and cost-efficiently.
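For developers curious about what working with the open weights might look like, the snippet below is a minimal, illustrative sketch of loading Qwen-7B-Chat with the Hugging Face transformers library. The repository id "Qwen/Qwen-7B-Chat", the trust_remote_code flag and the sample prompt are assumptions made for illustration rather than details from the release; Alibaba Cloud's official documentation remains the authoritative guide.

# Minimal sketch (not from the release): loading and querying Qwen-7B-Chat
# via Hugging Face transformers. The repo id below is an assumed location
# for the open weights.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen-7B-Chat"  # assumed repository id, not confirmed by the article

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",        # spread layers across available GPUs/CPU automatically
    trust_remote_code=True,   # the release is assumed to ship custom modelling code with the weights
)

prompt = "Summarise the benefits of open-source language models in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The same weights could equally be fine-tuned on a company's own data before deployment, which is the workflow the "effectively and cost-efficiently" claim above refers to.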
The pre-trained Qwen-7B model distinguished itself on the Massive Multitask Language Understanding (MMLU) benchmark, scoring a notable 56.7 out of 100 and outperforming other major pre-trained open-source models of a similar scale, and even some larger models, according to the release.
This benchmark assesses a text model's multitask accuracy across 57 varied tasks, encompassing fields such as elementary mathematics, computer science and law. Moreover, Qwen-7B achieved the highest score among models of equivalent size on the leaderboard of C-Eval, a comprehensive Chinese evaluation suite for foundation models.
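As a rough illustration of what such a benchmark measures (and not a reproduction of Alibaba's or the benchmark authors' evaluation harness), the sketch below scores a toy set of four-option multiple-choice questions the way an MMLU-style suite reports a single accuracy figure out of 100. The sample questions and the answer_question stub are invented placeholders; aggregation details vary between harnesses.

# Toy illustration of MMLU-style scoring: the model picks one letter per
# four-option question and accuracy is reported on a 0-100 scale.
from collections import defaultdict

sample_questions = [
    # (subject, question, options, correct letter) - invented examples
    ("elementary_mathematics", "What is 7 * 8?", ["54", "56", "58", "64"], "B"),
    ("computer_science", "Which data structure is FIFO?",
     ["Stack", "Queue", "Tree", "Heap"], "B"),
]

def answer_question(question, options):
    """Placeholder for a call to the model under evaluation."""
    return "B"  # a real harness would prompt the LLM and parse its chosen letter

per_subject = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
for subject, question, options, gold in sample_questions:
    prediction = answer_question(question, options)
    per_subject[subject][0] += int(prediction == gold)
    per_subject[subject][1] += 1

accuracy = sum(c for c, _ in per_subject.values()) / sum(t for _, t in per_subject.values())
print(f"Multitask accuracy: {accuracy * 100:.1f} / 100")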
"By open-sourcing our proprietary large language models, we aim to promote inclusive technologies and enable more developers and SMEs to reap the benefits of generative AI," said Jingren Zhou, CTO of Alibaba Cloud Intelligence.
He added: "As a determined long-term champion of open-source initiatives, we hope that this open approach can also bring collective wisdom to further help open-source communities thrive."
The launch follows the introduction of Alibaba Cloud's proprietary LLM, Tongyi Qianwen, in April. The model, which can generate human-like content in both Chinese and English, is available in different sizes, including seven billion parameters and above. It also comes after tech giant Meta released a similar open-source model, Llama 2, last month.
According to Meta, Llama 2 was trained on 40% more data than Llama 1 and has double its context length. Its fine-tuned models have been trained on over one million human annotations. Llama 2 outperforms other open-source language models on many external benchmarks, including reasoning, coding, proficiency and knowledge tests.