AgriModel QA
Fine-tune DistilBERT for extractive QA on agriculture Q/A data and run evaluation or inference.
Files
- kcc_to_squad.py: Builds agriculture_qa.json from CSV.
- models/train_qa.py: Fine-tunes the QA model.
- models/evaluate_qa.py: Computes EM/F1 on the dataset using the saved model.
- models/infer_qa.py: Simple interactive CLI for inference.
- models/agri_qa_model/: Directory with the trained model and tokenizer.
Setup
- Create a Python environment and install dependencies:
python -m venv .venv; .\.venv\Scripts\Activate.ps1; pip install -U pip; pip install -r requirements.txt
If training with the CPU-only PyTorch build is too slow, consider installing a CUDA-enabled build that matches your GPU.
Train
models/train_qa.py expects data/agriculture_qa.json to be present. It saves the model to models/agri_qa_model/ by default.
python .\models\train_qa.py
You can override the defaults via environment variables: BASE_MODEL, OUTPUT_DIR, EPOCHS, MAX_STEPS, MAX_LEN, DOC_STRIDE.
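As a rough illustration of how such overrides are typically read, here is a minimal sketch. It is not the code in models/train_qa.py: the read_config helper and every default value except OUTPUT_DIR are assumptions made for this example.

```python
import os

# Hypothetical defaults for illustration only; the real values live in
# models/train_qa.py. Only OUTPUT_DIR is taken from this README.
DEFAULTS = {
    "BASE_MODEL": "distilbert-base-uncased",  # assumed checkpoint name
    "OUTPUT_DIR": "models/agri_qa_model",
    "EPOCHS": "2",
    "MAX_STEPS": "-1",
    "MAX_LEN": "384",
    "DOC_STRIDE": "128",
}
INT_KEYS = {"EPOCHS", "MAX_STEPS", "MAX_LEN", "DOC_STRIDE"}


def read_config(env=None):
    """Merge environment overrides over the defaults and cast numeric keys."""
    env = os.environ if env is None else env
    config = {}
    for key, default in DEFAULTS.items():
        raw = env.get(key, default)
        config[key] = int(raw) if key in INT_KEYS else raw
    return config
```

In PowerShell, a variable can be set for the current session before launching training, for example: $env:EPOCHS = "3"; python .\models\train_qa.py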
Evaluate
python .\models\evaluate_qa.py --model_dir models/agri_qa_model --data data\agriculture_qa.json --max_examples 200
Outputs EM and F1, and writes eval_predictions.csv.
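For readers unfamiliar with the two metrics, the sketch below shows the usual SQuAD-style definitions of exact match (EM) and token-level F1. It assumes the script follows that convention; the function names are chosen for this example and may differ from those in models/evaluate_qa.py.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    if not pred_tokens or not true_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == true_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

Dataset-level scores are normally the mean of these per-example values.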
Inference
Interactive mode (if no context file is provided, paste the context for each question):
python .\models\infer_qa.py --model_dir models/agri_qa_model
Or provide a fixed context file:
python .\models\infer_qa.py --model_dir models/agri_qa_model --context sample_context.txt
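If you would rather call the model from your own code than through the CLI, something like the following should work with the Hugging Face transformers question-answering pipeline. This is a sketch, not the contents of models/infer_qa.py; the helper names load_context, build_pipeline, and answer are made up for this example.

```python
from pathlib import Path


def load_context(path):
    """Read a context file such as sample_context.txt as stripped UTF-8 text."""
    return Path(path).read_text(encoding="utf-8").strip()


def build_pipeline(model_dir):
    """Load the saved model and tokenizer as a question-answering pipeline."""
    # Imported lazily so the other helpers work without transformers installed.
    from transformers import pipeline

    return pipeline("question-answering", model=model_dir, tokenizer=model_dir)


def answer(qa, question, context):
    """Run one question against a context and return (answer_text, score)."""
    result = qa(question=question, context=context)
    return result["answer"], float(result["score"])
```

Typical use would be qa = build_pipeline("models/agri_qa_model"), followed by answer(qa, "your question", load_context("sample_context.txt")). Because the model is extractive, the returned answer is always a span copied from the context.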
Notes
- The training data builder includes "Q:"/"A:" markers in context; the trainer also aligns answers robustly.
- For production, consider exporting to ONNX or using torch.compile and caching.
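To make the first note concrete, here is a minimal sketch of a SQuAD-style record whose context carries "Q:"/"A:" markers. The exact layout produced by kcc_to_squad.py is not documented here, so the context template, the id scheme, the title, and the helper names below are all assumptions.

```python
import json


def make_paragraph(example_id, question, answer):
    """Build one SQuAD-style paragraph with the answer located in the context."""
    # Assumed layout: question and answer embedded behind "Q:" / "A:" markers.
    prefix = f"Q: {question}\nA: "
    context = prefix + answer
    answer_start = len(prefix)  # character offset of the answer in the context
    # Sanity check: the offset must point exactly at the answer text.
    assert context[answer_start:answer_start + len(answer)] == answer
    return {
        "context": context,
        "qas": [{
            "id": example_id,
            "question": question,
            "answers": [{"text": answer, "answer_start": answer_start}],
        }],
    }


def build_dataset(rows):
    """Wrap (question, answer) pairs in the top-level SQuAD v1.1 structure."""
    paragraphs = [
        make_paragraph(f"kcc-{i}", question, answer)  # hypothetical id scheme
        for i, (question, answer) in enumerate(rows)
    ]
    return {"version": "1.1", "data": [{"title": "agriculture_qa", "paragraphs": paragraphs}]}


if __name__ == "__main__":
    dataset = build_dataset([("How to control aphids in mustard?", "Spray neem oil")])
    print(json.dumps(dataset, indent=2))
```

Checking that answer_start really points at the answer text is the kind of alignment the trainer has to get right, since a wrong offset would teach the model the wrong span.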