NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation Paper • 2512.05106 • Published 4 days ago • 13
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning Paper • 2512.02425 • Published 6 days ago • 22
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 6 days ago • 116
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 6 days ago • 70
Apriel-H1 Collection Introducing Apriel-H1 hybrids, each blending Attention and Mamba State Space layers in varying proportions. • 8 items • Updated Nov 5 • 7
Apertus LLM Collection Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 304
AFM-Models Collection The models and training dataset of the paper: Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL • 12 items • Updated Aug 6 • 16
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming SotA without fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21 • 398
qqWen-Series Collection Based on the Qwen-2.5 series: models fine-tuned for the Q programming language. • 12 items • Updated Oct 22 • 10
gpt-oss Collection Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated Aug 7 • 391
Falcon-H1 Collection Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B (pretrained & instruction-tuned). • 38 items • Updated Nov 6 • 56