Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism Paper • 2304.11414 • Published Apr 22, 2023
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models Paper • 2309.14717 • Published Sep 26, 2023