|
|
--- |
|
|
license: apache-2.0 |
|
|
pipeline_tag: text-generation |
|
|
language: |
|
|
- en |
|
|
- he |
|
|
tags: |
|
|
- pretrained |
|
|
inference: |
|
|
parameters: |
|
|
temperature: 0.6 |
|
|
--- |
|
|
|
|
|
[<img src="https://i.ibb.co/5Lbwyr1/dicta-logo.jpg" width="300px"/>](https://dicta.org.il) |
|
|
|
|
|
# Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs |
|
|
|
|
|
Dicta-LM 3.0 is a powerful open-weight collection of LLMs, trained on extensive corpora of Hebrew and English texts. The models are available for download and for unrestricted use, and they set a new SOTA for their weight class in Hebrew, both as base models and as chat models.
|
|
|
|
|
This is the 1.7-billion-parameter *reasoning* model, originally initialized from [Qwen3-1.7B-Base](https://huggingface.co/Qwen/Qwen3-1.7B-Base). |
|
|
|
|
|
This version of the model is quantized to 4-bit weights with 16-bit activations (W4A16), allowing inference with significantly less memory, at the cost of some performance.
|
|
|
|
|
This model is a reasoning chat model: before responding to a given user message, it first works out how to respond in a designated thinking block.
|
|
|
|
|
For full details of this model, please read our [release blog post](https://dicta.org.il/dicta-lm-3) or the [technical report](https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf).
|
|
|
|
|
You can view the full collection of base/instruct, quantized/unquantized versions of `DictaLM 3.0` [here](https://huggingface.co/collections/dicta-il/dictalm-30-collection).
|
|
|
|
|
## Instruction format |
|
|
|
|
|
In order to leverage the instruction fine-tuning, your prompt should be rendered with the chat template specified for this model. Most libraries handle this automatically, so you can usually let them do it; a sketch of rendering the template manually is shown below.
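
A minimal sketch of rendering the template manually with 🤗 Transformers, assuming the tokenizer in this repository ships the chat template (the message content is only illustrative):

```python
from transformers import AutoTokenizer

# Load the tokenizer, which carries the chat template for this model.
tokenizer = AutoTokenizer.from_pretrained("dicta-il/DictaLM-3.0-1.7B-Thinking-W4A16")

messages = [
    {"role": "user", "content": "Hello, how are you?"}
]

# Render the conversation into the prompt string the model expects.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```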
|
|
|
|
|
## Usage |
|
|
|
|
|
We recommend using vLLM, but you can use Transformers as well: |
|
|
|
|
|
### Transformers |
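
A minimal sketch of running the model directly with 🤗 Transformers; loading this W4A16 checkpoint may require the `compressed-tensors` package, and the generation settings below are only illustrative (the temperature follows the value in the model card metadata):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "dicta-il/DictaLM-3.0-1.7B-Thinking-W4A16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Loading a 4-bit (W4A16) checkpoint may require `pip install compressed-tensors`.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

messages = [
    {"role": "user", "content": "Hello, how are you?"}
]

# Render the chat template and move the input ids to the model's device.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.6)

# Decode only the newly generated tokens (the thinking block precedes the answer).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```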
|
|
|
|
|
### vLLM |
|
|
|
|
|
```bash |
|
|
vllm serve dicta-il/DictaLM-3.0-1.7B-Thinking-W4A16 --enable-auto-tool-choice --tool-call-parser hermes --reasoning_parser deepseek_r1 |
|
|
``` |
|
|
|
|
|
You can then access it via the `openai` library:
|
|
|
|
|
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-no-key-required",
)

response = client.chat.completions.create(
    model="dicta-il/DictaLM-3.0-1.7B-Thinking-W4A16",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)
```
|
|
|
|
|
> The reasoning traces are returned in a designated field of the response (`reasoning_content`) when the server is launched with a reasoning parser, as above.
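
For example, with the server started as above, the trace can be read back from the same response object (the field name follows vLLM's reasoning-parser convention):

```python
# The reasoning parser separates the thinking block from the final answer.
print(response.choices[0].message.reasoning_content)  # the model's reasoning trace
print(response.choices[0].message.content)            # the final answer
```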
|
|
|
|
|
The model supports tool calling, enabling integration with external tools and APIs. For an example of how to use tool calling, see the [vLLM documentation](https://docs.vllm.ai/en/stable/features/tool_calling/#tool-calling); a short sketch follows below.
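
As a minimal sketch, a tool can be passed through the OpenAI-compatible server started above; the `get_weather` function and its schema are hypothetical and only for illustration:

```python
# Hypothetical tool schema, purely illustrative.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="dicta-il/DictaLM-3.0-1.7B-Thinking-W4A16",
    messages=[{"role": "user", "content": "What is the weather in Jerusalem?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model decided to call the tool, the call appears here instead of plain text.
print(response.choices[0].message.tool_calls)
```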
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex |
|
|
@article{Shmidman2025DictaLM3, |
|
|
title={{Dicta-LM 3.0: Advancing The Frontier of Hebrew Sovereign LLMs}}, |
|
|
author={Shaltiel Shmidman and Avi Shmidman and Amir DN Cohen and Moshe Koppel}, |
|
|
year={2025}, |
|
|
publisher={{DICTA / Jerusalem, Israel}}, |
|
|
note={https://www.dicta.org.il/publications/DictaLM_3_0___Techincal_Report.pdf} |
|
|
} |
|
|
``` |