SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs Paper • 2512.04746 • Published 4 days ago • 11 • 2
A dynamic parallel method for performance optimization on hybrid CPUs Paper • 2411.19542 • Published Nov 29, 2024 • 5 • 2
Efficient Post-training Quantization with FP8 Formats Paper • 2309.14592 • Published Sep 26, 2023 • 11 • 2
Effective Quantization for Diffusion Models on CPUs Paper • 2311.16133 • Published Nov 2, 2023 • 4 • 1
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs Paper • 2309.05516 • Published Sep 11, 2023 • 10 • 2