Phú Võ

phuvo

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers about 8 hours ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published 11 days ago • 160

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 6 days ago • 178

upvoted a paper 5 months ago

Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 26

upvoted 2 papers 6 months ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 272

Time Blindness: Why Video-Language Models Can't See What Humans Can?

Paper • 2505.24867 • Published May 30 • 80

upvoted a paper 7 months ago

BitNet v2: Native 4-bit Activations with Hadamard Transformation for 1-bit LLMs

Paper • 2504.18415 • Published Apr 25 • 47

upvoted a paper 8 months ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16 • 75

upvoted 2 papers 9 months ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153

LongRoPE2: Near-Lossless LLM Context Window Scaling

Paper • 2502.20082 • Published Feb 27 • 38

upvoted 2 papers 10 months ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 166

NoLiMa: Long-Context Evaluation Beyond Literal Matching

Paper • 2502.05167 • Published Feb 7 • 15

upvoted 5 papers about 1 year ago

Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 151

Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models

Paper • 2409.18943 • Published Sep 27, 2024 • 29

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Paper • 2409.17066 • Published Sep 25, 2024 • 28

MIO: A Foundation Model on Multimodal Tokens

Paper • 2409.17692 • Published Sep 26, 2024 • 53

Emu3: Next-Token Prediction is All You Need

Paper • 2409.18869 • Published Sep 27, 2024 • 95

upvoted 4 papers over 1 year ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 78

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Paper • 2408.07055 • Published Aug 13, 2024 • 67

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 119

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 117