Meditron: An Open Source Medical Large Language Model (LLM)

Hello Learners…

Welcome to the blog…

Table Of Contents

  • Introduction
  • Meditron: An Open Source Medical Large Language Model (LLM)
  • Meditron Model Details
  • Packages Requirements For Meditron
  • How to use Meditron?
  • Medical Training Data For Meditron
  • Uses Of Meditron
  • Summary
  • References

Introduction

In this post, we discuss Meditron, a new open source medical large language model (LLM).

MEDITRON builds on Llama-2 (through an adaptation of Nvidia's Megatron-LM distributed trainer) and extends pretraining on a comprehensively curated medical corpus, including selected PubMed articles, abstracts, and internationally-recognized medical guidelines.

Meditron: An Open Source Medical Large Language Model (LLM)

Large language models (LLMs) can potentially democratize access to medical knowledge.

While many efforts have been made to harness and improve LLMs’ medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities.

In this work, the authors improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain.

GitHub URL:

Research Paper URL:

Meditron Model Details

Meditron is released in two sizes, with 7B and 70B parameters. Both models build on Llama-2 and are further pretrained on the GAP-Replay medical corpus (48.1B tokens) using an adaptation of Nvidia's Megatron-LM distributed trainer.

Packages Requirements For Meditron

To run this model, we need to install the following packages (minimum versions):

vllm >= 0.2.1
transformers >= 4.34.0
datasets >= 2.14.6
torch >= 2.0.1
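
These can be installed with pip; a minimal sketch, assuming a standard Python environment and using the version pins listed above:

pip install "vllm>=0.2.1" "transformers>=4.34.0" "datasets>=2.14.6" "torch>=2.0.1"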

How to use Meditron?

We can load the Meditron model directly from the Hugging Face model hub as follows:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and model weights from the Hugging Face Hub.
# Note: the 70B checkpoint is large and needs substantial GPU memory to load.
tokenizer = AutoTokenizer.from_pretrained("epfl-llm/meditron-70b")
model = AutoModelForCausalLM.from_pretrained("epfl-llm/meditron-70b")
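
Once loaded, the model can be prompted like any other causal language model. The snippet below is a minimal sketch using the standard transformers generation API; the prompt text and generation parameters are illustrative assumptions, not recommendations from the Meditron authors:

# Tokenize an example medical question (illustrative prompt) and generate a reply.
prompt = "What are the common symptoms of anemia?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))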

Medical Training Data For Meditron

The authors release code to download and pre-process the data used to train Meditron.

Meditron's domain-adaptive pre-training corpus, GAP-Replay, combines 48.1B tokens from four corpora (a sketch of how such a mixture could be assembled follows this list):

  • Clinical Guidelines: a new corpus of 46K clinical practice guidelines from various healthcare-related sources, including hospitals and international organizations.
  • Paper Abstracts: 16.1M abstracts extracted from closed-access PubMed and PubMed Central papers.
  • Medical Papers: full-text articles extracted from 5M publicly available PubMed and PubMed Central papers.
  • Replay dataset: 400M tokens of general-domain pretraining data sampled from RedPajama-v1.
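
The released pre-processing code is not reproduced here, but the sketch below shows one way such a mixture could be combined with the Hugging Face datasets library. The file paths and sampling probabilities are hypothetical placeholders, not the actual GAP-Replay proportions:

from datasets import load_dataset, interleave_datasets

# Hypothetical local JSONL files for each corpus; the real sources and proportions differ.
guidelines = load_dataset("json", data_files="guidelines.jsonl", split="train")
abstracts = load_dataset("json", data_files="pubmed_abstracts.jsonl", split="train")
papers = load_dataset("json", data_files="pubmed_papers.jsonl", split="train")
replay = load_dataset("json", data_files="redpajama_replay.jsonl", split="train")

# Interleave the four corpora by sampling with fixed probabilities (placeholder values).
gap_replay = interleave_datasets(
    [guidelines, abstracts, papers, replay],
    probabilities=[0.05, 0.35, 0.55, 0.05],
    seed=42,
)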

Uses Of Meditron

Meditron-70B is being made available for further testing and assessment as an AI assistant to enhance clinical decision-making and democratize access to an LLM for healthcare use. Potential use cases may include but are not limited to:

  • Medical exam question answering
  • Supporting differential diagnosis
  • Disease information (symptoms, cause, treatment) query
  • General health information query

It is possible to use this model to generate text, which is useful for experimentation and understanding its capabilities. It should not be used directly for production or work that may impact people.

They do not recommend using this model for natural language generation in a production environment, finetuned or otherwise.

MEDITRON achieves a 6% absolute performance gain over the best public baseline in its parameter class and 3% over the strongest baseline finetuned from Llama-2. Compared to closed-source LLMs, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of Med-PaLM-2.

Summary

The blog emphasizes the importance of open-source development and provides access to the code for corpus curation and the MEDITRON model weights.

References
