Skip to content

karthiksoman/zebra-Llama

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

66 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Zebra-Llama

Zebra-Llama (v0.2) is a specialized version of the Llama-3.1-8b-instruct model, fine-tuned with data specific to the rare disease Ehlers-Danlos Syndrome (EDS) - a rare connective tissue disorder. We utilized textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns/questions related to EDS, which were used to fine-tune the model. As a result, this model is adept at providing accurate responses to questions regarding EDS.

The model is trained using a specialized approach called "context-aware training," where we provided context for each question from a custom vector database during the training phase. This approach enabled the model to demonstrate high precision and recall during the inference phase when utilizing the RAG context. Additionally, the model showed a higher likelihood of generating correct citations compared to the base model.

Try Zebra-Llama

Here is the Jupyter Notebook Demo for Zebra-Llama.

Here is the API for the RAG knowledge base that we built for rare diseases, currently focussing on EDS.

Hugging face Model card

https://huggingface.co/zebraLLAMA/zebra-Llama-v0.2

Training details

Refer to config file to know the training parameters

We have also provided the training script that was used to fine-tune the Llama-3.1-8B-Instruct model

fig2_v2

Citation

@misc{soman2024zebrallamacontextawarelargelanguage,
      title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge}, 
      author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},
      year={2024},
      eprint={2411.02657},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.02657}, 
}

Team behind zebra-Llama

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published