tarn59/book_flatten_and_crop_qwen_image_edit_2509 Image-to-Image • Updated 22 days ago • 233 • 36
Running on Zero • Featured • ReconViaGen • 155 • High-fidelity 3D Geometry Generation from multi-view images
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 8 items • Updated 6 days ago • 159
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published May 29 • 68
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 81
Runtime error • TRELLIS - Multiple Imagen a 3D • 61 • Scalable and Versatile 3D Generation from images
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17 • 126k • 1.6k
Article Llama can now see and run on your device - welcome Llama 3.2 +5 Sep 25, 2024 • 191
meta-llama/Llama-3.2-11B-Vision-Instruct Image-Text-to-Text • 11B • Updated Dec 4, 2024 • 181k • 1.55k
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 647