Qwen3-30B-A3B-TopK4-Compressed

์ด ๋ชจ๋ธ์€ Qwen3-30B-A3B Mixture of Experts ๋ชจ๋ธ์˜ Top-k๋ฅผ 8์—์„œ 4๋กœ ๊ฐ์†Œ์‹œ์ผœ ์••์ถ•ํ•œ ๋ฒ„์ „์ž…๋‹ˆ๋‹ค.

๋ชจ๋ธ ์ •๋ณด

  • ๊ธฐ๋ณธ ๋ชจ๋ธ: Qwen3-30B-A3B
  • ์••์ถ• ๋ฐฉ์‹: Top-k Reduction (8 โ†’ 4)
  • ์••์ถ•๋ฅ : 59.5%
  • MMLU ์ •ํ™•๋„: 42.9% (7๊ฐœ ์นดํ…Œ๊ณ ๋ฆฌ ํ‰๊ท )

์••์ถ• ์„ธ๋ถ€์‚ฌํ•ญ

  • ์›๋ณธ Top-k: 8๊ฐœ ์ „๋ฌธ๊ฐ€ ํ™œ์„ฑํ™”
  • ์••์ถ• Top-k: 4๊ฐœ ์ „๋ฌธ๊ฐ€ ํ™œ์„ฑํ™”
  • ์••์ถ• ํšจ๊ณผ: ๋ชจ๋ธ ํฌ๊ธฐ์™€ ์ถ”๋ก  ๋น„์šฉ์„ ํฌ๊ฒŒ ์ค„์ด๋ฉด์„œ๋„ ํ•ฉ๋ฆฌ์ ์ธ ์„ฑ๋Šฅ ์œ ์ง€

์„ฑ๋Šฅ ํ‰๊ฐ€

MMLU ๋ฒค์น˜๋งˆํฌ 7๊ฐœ ์นดํ…Œ๊ณ ๋ฆฌ์—์„œ ํ…Œ์ŠคํŠธ:

  • abstract_algebra
  • anatomy
  • high_school_mathematics
  • formal_logic
  • professional_medicine
  • high_school_macroeconomics
  • global_facts
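The headline 42.9% figure is an unweighted (macro) average over the seven categories above. The sketch below shows the computation; the per-category accuracies are hypothetical placeholders (chosen only so their mean matches the reported figure), NOT the model's actual per-category scores.

```python
# Hypothetical per-category accuracies -- placeholders for illustration only.
per_category = {
    "abstract_algebra": 0.30,
    "anatomy": 0.45,
    "high_school_mathematics": 0.35,
    "formal_logic": 0.40,
    "professional_medicine": 0.50,
    "high_school_macroeconomics": 0.55,
    "global_facts": 0.45,
}

# Unweighted mean over categories (each category counts equally,
# regardless of how many questions it contains).
macro_avg = sum(per_category.values()) / len(per_category)
print(f"{macro_avg:.1%}")  # → 42.9%
```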

์‚ฌ์šฉ๋ฒ•

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "kyne0127/Qwen3-30B-A3B-TopK4-Compressed"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# ์ถ”๋ก  ์˜ˆ์‹œ
input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

๋ผ์ด์„ ์Šค

Apache 2.0 License

์ œ์ž‘์ž

kyne0127
