# Qwen3-30B-A3B-TopK4-Compressed

This model is a compressed version of the Qwen3-30B-A3B Mixture of Experts model, created by reducing the router's Top-k from 8 to 4.
## Model Information

- Base model: Qwen3-30B-A3B
- Compression method: Top-k reduction (8 → 4)
- Compression ratio: 59.5%
- MMLU accuracy: 42.9% (average over 7 categories)
## Compression Details

- Original Top-k: 8 experts activated per token
- Compressed Top-k: 4 experts activated per token
- Effect: substantially reduces the active parameter count and inference cost while maintaining reasonable performance
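To illustrate what changing Top-k means, here is a minimal sketch of standard top-k MoE routing: the router scores all experts, keeps only the k highest-scoring ones, and renormalizes their gate weights. This is not the actual Qwen3 implementation; the expert count and logits below are made up for illustration.

```python
import numpy as np

def topk_route(logits, k):
    """Select the top-k experts for one token and renormalize their gate weights.

    Illustrative sketch of MoE top-k routing, not the real Qwen3 router.
    """
    idx = np.argsort(logits)[::-1][:k]           # indices of the k highest router logits
    w = np.exp(logits[idx] - logits[idx].max())  # softmax over only the selected experts
    w /= w.sum()
    return idx, w

# Hypothetical router logits for one token over 8 experts
logits = np.array([2.0, 0.5, 1.7, -0.3, 0.9, 2.4, -1.1, 0.1])

idx8, w8 = topk_route(logits, 8)  # original setting: all 8 experts contribute
idx4, w4 = topk_route(logits, 4)  # compressed setting: only the 4 strongest experts run

print("experts used:", sorted(idx4.tolist()))
```

Dropping from k=8 to k=4 halves the number of expert FFNs executed per token, which is where the inference-cost saving comes from; the total stored weights are unchanged unless unused experts are also pruned.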
## Evaluation

Tested on 7 MMLU benchmark categories:
- abstract_algebra
- anatomy
- high_school_mathematics
- formal_logic
- professional_medicine
- high_school_macroeconomics
- global_facts
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "kyne0127/Qwen3-30B-A3B-TopK4-Compressed"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Inference example
input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## License

Apache 2.0 License
## Author

kyne0127