Zebra-Llama (v0.2) is a specialized version of the Llama-3.1-8B-Instruct model, fine-tuned on data specific to Ehlers-Danlos Syndrome (EDS), a rare connective tissue disorder. We used textual information from over 4,000 EDS papers from PubMed, more than 8,000 Reddit posts about EDS, and over 5,000 posts from the Inspire forum to gather real-world concerns and questions related to EDS, which were then used to fine-tune the model. As a result, the model is adept at providing accurate responses to questions about EDS.
The model was trained using an approach we call "context-aware training," in which each training question was paired with context retrieved from a custom vector database. This approach enabled the model to achieve high precision and recall at inference time when used with RAG context. The model also generated correct citations more often than the base model.
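To illustrate how RAG context is used at inference time, here is a minimal sketch of retrieval-augmented generation with the fine-tuned model. The retrieval step (`retrieve_context`) is a placeholder for whatever vector database and retriever you use, and the prompt wording is illustrative; only the model ID `zebraLLAMA/zebra-Llama-v0.2` comes from this card.

```python
# Minimal RAG-style inference sketch (not the official pipeline):
# retrieve context for an EDS question, prepend it to the prompt,
# and generate an answer with the fine-tuned model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zebraLLAMA/zebra-Llama-v0.2"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

def retrieve_context(question: str) -> str:
    """Placeholder: query your own vector database (e.g. the EDS
    knowledge base behind the RAG API) and return the top passages."""
    return "Relevant EDS passages retrieved from the knowledge base."

def answer(question: str, max_new_tokens: int = 512) -> str:
    context = retrieve_context(question)
    messages = [
        {"role": "system",
         "content": "You are an assistant answering questions about "
                    "Ehlers-Danlos Syndrome. Use the provided context "
                    "and cite your sources."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True)

print(answer("What are common symptoms of hypermobile EDS?"))
```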
Here is the Jupyter Notebook Demo for Zebra-Llama.
Here is the API for the RAG knowledge base that we built for rare diseases, currently focusing on EDS.
https://huggingface.co/zebraLLAMA/zebra-Llama-v0.2
Refer to the config file for the training parameters.
We have also provided the training script that was used to fine-tune the Llama-3.1-8B-Instruct model.
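For orientation, fine-tunes of this kind are commonly run with parameter-efficient methods; the sketch below shows one hypothetical setup using LoRA via the peft library. The dataset path, hyperparameters, and LoRA settings are illustrative assumptions only; refer to the actual config file and training script for the values we used.

```python
# Hypothetical LoRA fine-tuning sketch (illustrative only; see the
# provided training script and config for the actual setup).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "meta-llama/Llama-3.1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach LoRA adapters to the attention projections (illustrative values).
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Each record is assumed to hold a pre-formatted prompt that already
# includes the retrieved context (a "context-aware" training example).
dataset = load_dataset("json", data_files="eds_training_data.jsonl")["train"]

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="zebra-llama-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=2,
        learning_rate=2e-4,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("zebra-llama-lora")
```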
@misc{soman2024zebrallamacontextawarelargelanguage,
      title={Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge},
      author={Karthik Soman and Andrew Langdon and Catalina Villouta and Chinmay Agrawal and Lashaw Salta and Braian Peetoom and Gianmarco Bellucci and Orion J Buske},
      year={2024},
      eprint={2411.02657},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.02657},
}