Meditron: An Open Source Medical Large Language Model (LLM)

Hello Learners…

Welcome to the blog…

Table Of Contents

  • Introduction
  • Meditron: An Open Source Medical Large Language Model (LLM)
  • Meditron Model Details
  • Packages Requirements For Meditron
  • How to use Meditron?
  • Medical Training Data For Meditron
  • Uses Of Meditron
  • Summary
  • References

Introduction

In this post, we discuss Meditron, a new open source medical large language model (LLM).

MEDITRON builds on Llama-2 (through an adaptation of Nvidia's Megatron-LM distributed trainer) and extends pretraining on a comprehensively curated medical corpus, including selected PubMed articles, abstracts, and internationally-recognized medical guidelines.

Meditron: An Open Source Medical Large Language Model (LLM)

Large language models (LLMs) can potentially democratize access to medical knowledge.

While many efforts have been made to harness and improve LLMs’ medical knowledge and reasoning capacities, the resulting models are either closed-source (e.g., PaLM, GPT-4) or limited in scale (<= 13B parameters), which restricts their abilities.

In this work, the authors improve access to large-scale medical LLMs by releasing MEDITRON: a suite of open-source LLMs with 7B and 70B parameters adapted to the medical domain.

GitHub URL:

Research Paper URL:

Meditron Model Details

Meditron is released in two sizes, with 7B and 70B parameters. Both models build on Llama-2 and are further pretrained on the GAP-Replay medical corpus (48.1B tokens) using an adaptation of Nvidia's Megatron-LM distributed trainer.

Packages Requirements For Meditron

To run this model, we need to install the following packages (minimum versions):

vllm >= 0.2.1
transformers >= 4.34.0
datasets >= 2.14.6
torch >= 2.0.1
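
These can be installed with pip; a minimal sketch, assuming a standard Python environment and using the version pins listed above:

pip install "vllm>=0.2.1" "transformers>=4.34.0" "datasets>=2.14.6" "torch>=2.0.1"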

How to use Meditron?

We can load the Meditron model directly from the Hugging Face model hub as follows:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Download the tokenizer and model weights from the Hugging Face Hub.
# Note: the 70B checkpoint is large and needs substantial GPU memory to load.
tokenizer = AutoTokenizer.from_pretrained("epfl-llm/meditron-70b")
model = AutoModelForCausalLM.from_pretrained("epfl-llm/meditron-70b")
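
Once loaded, the model can be prompted like any other causal language model. The snippet below is a minimal sketch using the standard transformers generation API; the prompt text and generation parameters are illustrative assumptions, not recommendations from the Meditron authors:

# Tokenize an example medical question (illustrative prompt) and generate a reply.
prompt = "What are the common symptoms of anemia?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))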

Medical Training Data For Meditron

The authors release code to download and pre-process the data used to train Meditron.

Meditron's domain-adaptive pre-training corpus, GAP-Replay, combines 48.1B tokens from four corpora (a sketch of how such a mixture could be assembled follows this list):

  • Clinical Guidelines: a new corpus of 46K clinical practice guidelines from various healthcare-related sources, including hospitals and international organizations.
  • Paper Abstracts: 16.1M abstracts extracted from closed-access PubMed and PubMed Central papers.
  • Medical Papers: full-text articles extracted from 5M publicly available PubMed and PubMed Central papers.
  • Replay dataset: 400M tokens of general-domain pretraining data sampled from RedPajama-v1.
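
The released pre-processing code is not reproduced here, but the sketch below shows one way such a mixture could be combined with the Hugging Face datasets library. The file paths and sampling probabilities are hypothetical placeholders, not the actual GAP-Replay proportions:

from datasets import load_dataset, interleave_datasets

# Hypothetical local JSONL files for each corpus; the real sources and proportions differ.
guidelines = load_dataset("json", data_files="guidelines.jsonl", split="train")
abstracts = load_dataset("json", data_files="pubmed_abstracts.jsonl", split="train")
papers = load_dataset("json", data_files="pubmed_papers.jsonl", split="train")
replay = load_dataset("json", data_files="redpajama_replay.jsonl", split="train")

# Interleave the four corpora by sampling with fixed probabilities (placeholder values).
gap_replay = interleave_datasets(
    [guidelines, abstracts, papers, replay],
    probabilities=[0.05, 0.35, 0.55, 0.05],
    seed=42,
)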

Uses Of Meditron

Meditron-70B is being made available for further testing and assessment as an AI assistant to enhance clinical decision-making and democratize access to an LLM for healthcare use. Potential use cases may include but are not limited to:

  • Medical exam question answering
  • Supporting differential diagnosis
  • Disease information (symptoms, cause, treatment) query
  • General health information query

It is possible to use this model to generate text, which is useful for experimentation and understanding its capabilities. It should not be used directly for production or work that may impact people.

They do not recommend using this model for natural language generation in a production environment, finetuned or otherwise.

MEDITRON achieves a 6% absolute performance gain over the best public baseline in its parameter class and 3% over the strongest baseline finetuned from Llama-2. Compared to closed-source LLMs, MEDITRON-70B outperforms GPT-3.5 and Med-PaLM and is within 5% of GPT-4 and 10% of Med-PaLM-2.

Summary

The blog emphasizes the importance of open-source development and provides access to the code for corpus curation and the MEDITRON model weights.

References
