Yaseen

myaseen

http://www.myaseen.me

AI & ML interests

None yet

Recent Activity

upvoted a paper 5 days ago

OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models

Paper • 2406.01775 • Published Jun 3, 2024 • 2

reacted to sergiopaniego's post with 🔥 7 days ago

Post

1692

nanochat is now in transformers!

The LLM by @karpathy is officially in the library, and we wrote a blog covering how we ported the model, the differences from the original, and how to run or train it.

Go read it 🤓

https://huggingface.co/spaces/nanochat-students/transformers

upvoted a paper 7 days ago

A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

Paper • 2509.25397 • Published Sep 29 • 12

upvoted an article 8 days ago

Continuous batching from first principles

Article • 13 days ago • 248

reacted to danielhanchen's post with 🔥 8 days ago

Post

8171

Qwen3-Next can now be run locally! (30GB RAM)

The models come in Thinking and Instruct versions and use a new architecture, giving them ~10x faster inference than Qwen32B.
💜 Step-by-step Guide: https://docs.unsloth.ai/models/qwen3-next

Instruct GGUF: unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF
Thinking GGUF: unsloth/Qwen3-Next-80B-A3B-Thinking-GGUF

liked a model 26 days ago

BAAI/bge-reranker-v2-m3

Text Classification • 0.6B • Updated Jun 24, 2024 • 2.83M • 821

upvoted a paper about 2 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 493

upvoted 2 articles 2 months ago

SmolVLM - small yet mighty Vision Language Model

Article • Nov 26, 2024 • 389

Xet is on the Hub

Article • Mar 18 • 79

upvoted a paper 9 months ago

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 74

New activity in openai/whisper-large-v3-turbo about 1 year ago

Torch compile + dynamo error

#11 opened about 1 year ago by
