Tensoic AI Releases Kan-Llama: A 7B Llama-2 LoRA PreTrained and FineTuned on ‘Kannada’ Tokens

Tensoic has recently introduced Kannada Llama (Kan-LLaMA) to address key limitations of large language models (LLMs): their proprietary nature, their heavy computational requirements, and the barriers these create for contributions from the broader research community. The work stresses the importance of open models in driving innovation in natural language processing (NLP) and machine translation. Despite the success of models such as Meta's LLaMA 2, native support for non-English languages remains limited, which makes expanding their language capabilities necessary.

Current LLMs, while impressive, often pose challenges because of their closed nature and the substantial resources required to train and deploy them. The paper introduces Kannada Llama as a response, aiming to extend Llama-2 to low-resource Indian languages, Kannada in particular, by modifying the model's vocabulary with a SentencePiece tokenizer, applying Low-Rank Adaptation (LoRA) for efficient training, and fine-tuning the model on conversational datasets to strengthen its dialogue capabilities, with an emphasis on releasing the model weights, datasets, and documentation.
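To make the vocabulary-extension step concrete, here is a minimal sketch, not Tensoic's released pipeline, of how a SentencePiece model trained on Kannada text can be merged into the Llama-2 tokenizer using the sentencepiece and transformers libraries; the corpus path, output file names, and vocabulary size are illustrative assumptions.

```python
# Sketch only: train a Kannada SentencePiece model and merge its pieces into
# the Llama-2 tokenizer. Paths and the 20k vocab size are assumed values.
import sentencepiece as spm
from sentencepiece import sentencepiece_model_pb2 as sp_pb2
from transformers import LlamaTokenizer

# 1. Train a BPE SentencePiece model on raw Kannada text (placeholder path).
spm.SentencePieceTrainer.train(
    input="kannada_corpus.txt",
    model_prefix="kannada_sp",
    vocab_size=20000,
    model_type="bpe",
)

# 2. Load the Llama-2 tokenizer and both SentencePiece models as protos.
llama_tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
llama_proto = sp_pb2.ModelProto()
llama_proto.ParseFromString(llama_tokenizer.sp_model.serialized_model_proto())

kannada_proto = sp_pb2.ModelProto()
with open("kannada_sp.model", "rb") as f:
    kannada_proto.ParseFromString(f.read())

# 3. Append Kannada pieces that the Llama-2 vocabulary does not already contain.
existing_pieces = {p.piece for p in llama_proto.pieces}
for piece in kannada_proto.pieces:
    if piece.piece not in existing_pieces:
        new_piece = llama_proto.pieces.add()
        new_piece.piece = piece.piece
        new_piece.score = 0.0

# 4. Write out the merged tokenizer model; the base model's embedding matrix
#    must then be resized to the new vocabulary size before training.
with open("merged_kannada_llama.model", "wb") as f:
    f.write(llama_proto.SerializeToString())
```

After a merge along these lines, resizing the Llama-2 embedding matrix to the new vocabulary size (for example with `model.resize_token_embeddings(...)`) gives the added Kannada pieces their own embedding rows, which are then learned during the continued pretraining described next.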

The proposed method extends the Llama-2 vocabulary for efficient processing of Kannada text. A SentencePiece tokenizer is trained on a Kannada text corpus and merged with the existing Llama-2 tokenizer. The researchers use Low-Rank Adaptation (LoRA) during pretraining, freezing the weights of the pretrained model and sharply reducing the number of trainable parameters; this makes continued pretraining of the LLM computationally affordable. Pretraining is performed on roughly 600 million Kannada tokens from the CulturaX dataset using Nvidia A100 80GB instances, taking about 50 hours at an estimated cost of $170.
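The parameter-efficiency point can be illustrated with a short sketch using the Hugging Face PEFT library; this is an assumption-laden illustration rather than the authors' training script, and the rank, scaling factor, and target modules shown are assumed values.

```python
# Sketch only: attach LoRA adapters to a frozen Llama-2 7B base with PEFT.
# Hyperparameters below (r, alpha, dropout, target modules) are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension
    lora_alpha=32,                         # scaling factor
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)

# The 7B base weights stay frozen; only the small low-rank adapter matrices
# receive gradients, which is what keeps the training run affordable.
model.print_trainable_parameters()
```

With a configuration of this kind, the trainable parameters amount to only a small fraction of the 7-billion-parameter base, which is what makes a roughly 50-hour continued-pretraining run on A100 80GB instances over hundreds of millions of tokens plausible.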

In conclusion, the paper addresses the challenges associated with LLMs and emphasizes the importance of open-source models in fostering innovation. The introduction of Kannada Llama represents a concerted effort to extend language capabilities to low-resource Indian languages. The combination of vocabulary extension, low-rank adaptation, and conversational fine-tuning amounts to a holistic approach to the limitations of existing models. The commitment to open weights and the collaboration with organizations such as Microsoft to make LLMs more accessible for research and public use reflect broader goals and contribute to the development of state-of-the-art language models.

Pragati Jhunjhunwala is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Kharagpur. She is a tech enthusiast and has a keen interest in the scope of software and data science applications. She is always reading about developments in different fields of AI and ML.
