cgt-llm-chatbot / bot.py

Commit History

Skip model loading when using Inference API
b5d5a5b

arahrooh committed on

Optimize memory usage: use float16 on CPU and fix double loading
1553f78

arahrooh committed on

Add HF_TOKEN support for gated models
084bec8

arahrooh committed on

Deploy CGT-LLM-Beta RAG Chatbot with vector database
086ffee

arahrooh committed on