MahaMarathi 7B: Empowering Marathi Fluency with 7 Billion Parameters!

Hello Learners…

Welcome to the blog…

Table Of Contents

  • Introduction
  • MahaMarathi 7B: Empowering Marathi Fluency with 7 Billion Parameters!
  • Summary
  • References

Introduction

In this post, we discuss MahaMarathi 7B, a large language model for Marathi with 7 billion parameters.

MahaMarathi 7B is a 7-billion-parameter model designed to produce fluent, native Marathi text.

MahaMarathi 7B: Empowering Marathi Fluency with 7 Billion Parameters!

MahaMarathi 7B is a domain-adapted, continually pre-trained, and instruction fine-tuned native Marathi large language model (LLM) with 7 billion parameters. It is based on Llama2+Mistral and trained on a large corpus of Marathi text.

The developers released this model as a base model and advise against using it as is. It is recommended to first fine-tune it for specific tasks.

Here is the Python code for loading the model and generating text with it.

# Usage
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

tokenizer = LlamaTokenizer.from_pretrained('marathi-llm/MahaMarathi-7B-v24.01-Base')
model = LlamaForCausalLM.from_pretrained('marathi-llm/MahaMarathi-7B-v24.01-Base', torch_dtype=torch.bfloat16)

prompt = "मी एक ए. आय. द्वारा तयार केलेले महाभाषा समीकरण संच आहे."
inputs = tokenizer(prompt, return_tensors="pt")

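# Generate (max_length counts the prompt tokens as well as the newly generated ones)
generate_ids = model.generate(inputs.input_ids, max_length=30)
# Decode the generated token ids back to text and print the result
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])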

We can also use this LLM through the pipeline helper of the Transformers library:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="marathi-llm/MahaMarathi-7B-v24.01-Base")

We can also use it by loading the model directly:

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("marathi-llm/MahaMarathi-7B-v24.01-Base")
model = AutoModelForCausalLM.from_pretrained("marathi-llm/MahaMarathi-7B-v24.01-Base")
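
The direct-load snippet above does not pass torch_dtype, while the first example loads the weights in bfloat16. As a rough illustration of why that choice matters, here is a minimal sketch that estimates the memory needed for the weights alone. It assumes a nominal count of exactly 7 billion parameters (the real checkpoint may differ slightly) and ignores activations and other overhead. The helper name weight_memory_gib is hypothetical and not part of any library.

```python
# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

# Assumption: a nominal 7 billion parameters; the actual checkpoint may differ.
NUM_PARAMS = 7_000_000_000


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Estimate the memory taken by the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gib(NUM_PARAMS, dtype):.1f} GiB")
# float32: ~26.1 GiB
# bfloat16: ~13.0 GiB
```

Under these assumptions, bfloat16 needs about half the memory of float32, so passing torch_dtype=torch.bfloat16 as in the first example is likely the safer default on a single GPU.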

To use this model, we first have to get access.

Go to the URL below:
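
Once access has been granted, the download usually has to be made from an authenticated session. The sketch below shows one possible way to do that. It assumes the access token is stored in an environment variable named HF_TOKEN and that the installed Transformers version accepts a token argument in from_pretrained (older versions used a different argument name). The helper names get_hf_token and load_gated_tokenizer are hypothetical.

```python
import os

MODEL_ID = "marathi-llm/MahaMarathi-7B-v24.01-Base"


def get_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Read the access token from an environment variable (assumed name)."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token first.")
    return token


def load_gated_tokenizer(model_id: str = MODEL_ID):
    """Load the tokenizer, passing the token explicitly."""
    # Imported here so get_hf_token() works even without transformers installed.
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, token=get_hf_token())
```

Keeping the token in an environment variable rather than in the script avoids committing it to version control by accident.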

Summary

MahaMarathi 7B is a native Marathi large language model with 7 billion parameters, pre-trained on a large corpus of Marathi text. It is released as a base model, so fine-tune it for your specific task before use.

You can also refer to other LLMs:

References
