🇮🇷 Persian TTS: Piper AR Base → ManaTTS (v1)
Model name: fa-ir-tts-piper-ar-mantatts-v1
Previous name: kiarashQ/fa_IR-mantatts
Sampling rate: 22,050 Hz
Base checkpoint: ar/ar_JO/kareem/medium/epoch=5079-step=1682020.ckpt (Piper AR, medium)
This is a Persian (fa-IR) single-speaker TTS model fine-tuned from the Arabic Piper medium checkpoint on the ManaTTS dataset.
⭐ Highlights
- ✔️ The Arabic phoneme system of the base checkpoint gives more accurate pronunciation of certain Persian words
- ✔️ Produces stable, smooth speech
- ✔️ Complements the EN-based model: each excels at different phonemes
- ✔️ Output at 22.05 kHz
🧪 Training Details
Training script: piper_train
Hardware: 1× NVIDIA RTX A4000 GPU
Dataset: ManaTTS
Batch size: 16
Precision: 32-bit
Validation split: 1%
Test samples: 5
Training epochs: 20
Logging: every 2000 steps
Quality setting: medium
Checkpoint frequency: every 1 epoch
No resume checkpoint (fresh fine-tune)
Training command:
piper_train \
--dataset-dir /workspace/piper_full/piper_dataset \
--accelerator gpu --devices 1 \
--batch-size 16 \
--validation-split 0.01 \
--num-test-examples 5 \
--quality medium \
--checkpoint-epochs 1 \
--max_epochs 20 \
--precision 32 \
--log_every_n_steps 2000
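To get a feel for what these settings imply, the following sketch estimates the split sizes and the number of batches per epoch. It is only a rough illustration: the dataset size passed in is hypothetical (this card does not state the number of utterances), the function name `training_schedule` is made up for this example, and it assumes the validation set is the rounded-down fraction of the data, the test examples are taken out of the remainder, and the last partial batch is kept.

```python
import math


def training_schedule(num_utterances, batch_size=16, validation_split=0.01,
                      num_test_examples=5, max_epochs=20):
    """Estimate split sizes and batch counts for a fresh training run.

    Assumptions (not confirmed by the model card): the validation size is
    floor(num_utterances * validation_split), test examples are removed
    from the remainder, and the final partial batch is not dropped.
    """
    valid_size = int(num_utterances * validation_split)
    train_size = num_utterances - valid_size - num_test_examples
    batches_per_epoch = math.ceil(train_size / batch_size)
    return {
        "valid_size": valid_size,
        "test_size": num_test_examples,
        "train_size": train_size,
        "batches_per_epoch": batches_per_epoch,
        "total_batches": batches_per_epoch * max_epochs,
    }


if __name__ == "__main__":
    # 10_000 is a hypothetical utterance count, used only for illustration.
    for key, value in training_schedule(10_000).items():
        print(f"{key}: {value}")
```

Step counts reported by the trainer may not match these batch counts one to one, so treat the numbers as an order-of-magnitude guide.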
🔊 Inference Example
echo 'سلام! حال شما چطور است؟' | piper \
--model model.onnx \
--config config.json \
--output_file out.wav
Python:
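import subprocess

text = "سلام! امروز هوا چطور است؟"

# Piper reads the text to synthesize from standard input.
subprocess.run(
    ["piper", "--model", "model.onnx", "--config", "config.json",
     "--output_file", "out.wav"],
    input=text.encode("utf-8"),
    check=True,  # raise CalledProcessError if piper exits with an error
)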
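As a quick sanity check on the synthesized file, the sketch below reads the WAV header with Python's standard `wave` module and compares the sample rate against the 22,050 Hz stated in this card. The helper names `wav_sample_rate` and `check_output` are illustrative, and the default path `out.wav` simply mirrors the example above.

```python
import wave

EXPECTED_RATE = 22050  # sampling rate stated in this model card (Hz)


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def check_output(path="out.wav", expected=EXPECTED_RATE):
    """Return True if the WAV file at `path` has the expected sample rate."""
    return wav_sample_rate(path) == expected
```

Calling `check_output()` after running the inference example should return `True` if the file was produced at the model's native rate.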
🔍 Observations
- The Arabic-base version sometimes pronounces Persian words more correctly than the EN-base model.
- Overall accent naturalness is slightly lower than with the EN-base model.
- Useful as a complementary voice to EN-base.
📜 License
Apache-2.0
🙏 Credits
- Piper TTS
- ManaTTS dataset
- Model fine-tuning by @kiarashQ
Base model: rhasspy/piper-voices (model tree for kiarashQ/fa-ir-tts-piper-ar-mantatts-v1)