UAE’s G42 launches open source Arabic language AI model

Photo: Startup Cerebras Systems’ AI supercomputer Andromeda at a data center in Santa Clara, California, U.S., October 2022. Rebecca Lewington/Cerebras Systems/Handout via REUTERS/File Photo

Aug 30 (Reuters) – A group of engineers, researchers and a Silicon Valley-based chip company collaborated to release advanced Arabic language software that can power generative AI applications.

The new large language model, called Jais, has 13 billion parameters and was trained on a large batch of data combining Arabic and English, a portion of which is computer code.

The group, which included academics and engineers, embarked on the project in part because, they said, few large language models are bilingual.

The new language model was created with the help of supercomputers produced by the Silicon Valley-based Cerebras Systems, which designs dinner plate-sized chips that compete with Nvidia’s (NVDA.O) powerful AI hardware. Nvidia’s chips are in short supply, which has driven companies around the world to seek alternatives.

Named after the highest peak in the United Arab Emirates, Jais is a collaboration between Cerebras, Mohamed bin Zayed University of Artificial Intelligence and a subsidiary of the Abu Dhabi-based tech conglomerate G42 called Inception, which focuses on AI.

Because there is not enough Arabic data to train a model of Jais’ size, the computer code within the English language data helped train the model’s ability to reason, according to Mohamed bin Zayed University of Artificial Intelligence professor Timothy Baldwin.

“(Code) gives the model a big leg up in terms of reasoning abilities, because it spells out the (logical) steps,” Baldwin told Reuters.

Jais will be available via an open source license.

The group trained the Jais model on a Cerebras supercomputer called a Condor Galaxy built in partnership with G42. This year Cerebras announced it had agreed to build three such units with G42, with the first scheduled to arrive this year and two additional units to be delivered in 2024.

“This model was trained, from start to finish, of 13 billion (parameters), in three and a half days,” Cerebras CEO Andrew Feldman said. “But there was months of work before that.”

Reporting by Max A. Cherney in San Francisco; Editing by Josie Kao and Mark Porter

