danielkorat/DeepSeek-R1-Distill-Qwen-1.5B-PRM-MathShepherd


3.1 GB

1 contributor

History: 7 commits

Latest commit: Training in progress, step 3000 (ca03878, verified, 10 months ago)

File                     Size            Last commit                      Updated
.gitattributes           1.57 kB         Training in progress, step 500   10 months ago
config.json              805 Bytes       Training in progress, step 500   10 months ago
model.safetensors        3.09 GB (xet)   Training in progress, step 3000  10 months ago
special_tokens_map.json  485 Bytes       Training in progress, step 500   10 months ago
tokenizer.json           11.4 MB (xet)   Training in progress, step 500   10 months ago
tokenizer_config.json    6.76 kB         Training in progress, step 500   10 months ago
training_args.bin        6.84 kB (xet)   Training in progress, step 500   10 months ago

training_args.bin: Detected Pickle imports (14)
- "transformers.integrations.deepspeed.HfDeepSpeedConfig",
- "trl.trainer.prm_config.PRMConfig",
- "transformers.trainer_pt_utils.AcceleratorConfig",
- "accelerate.utils.dataclasses.DistributedType",
- "transformers.trainer_utils.SaveStrategy",
- "transformers.trainer_utils.HubStrategy",
- "transformers.integrations.deepspeed.HfTrainerDeepSpeedConfig",
- "transformers.trainer_utils.SchedulerType",
- "transformers.training_args.OptimizerNames",
- "accelerate.state.PartialState",
- "accelerate.utils.dataclasses.DeepSpeedPlugin",
- "torch.device",
- "torch.bfloat16",
- "transformers.trainer_utils.IntervalStrategy"
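
A list of pickle imports like the one reported for training_args.bin can be reproduced locally without unpickling the file. The following is a minimal sketch, not the Hub's actual scanner. It assumes the file is a zip archive with an inner member whose name ends in `data.pkl`, and that the pickle uses the `GLOBAL` opcode (protocol 2 style). The helper names `list_pickle_imports` and `list_checkpoint_imports` are hypothetical, and the demo runs on a small stand-in archive built in memory rather than on the real file.

```python
import io
import pickle
import pickletools
import zipfile
from collections import OrderedDict
from fractions import Fraction


def list_pickle_imports(data: bytes) -> list:
    """Return sorted 'module.name' globals referenced by a pickle stream.

    The stream is only disassembled, never unpickled, so no code runs.
    """
    found = set()
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # genops reports the argument as "module name" (space-separated).
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ takes module/name from the stack; this sketch
            # does not resolve them and only flags their presence.
            found.add("<unresolved STACK_GLOBAL>")
    return sorted(found)


def list_checkpoint_imports(fileobj) -> list:
    """Assumption: the file is a zip archive holding a '*data.pkl' member."""
    with zipfile.ZipFile(fileobj) as zf:
        member = next(n for n in zf.namelist() if n.endswith("data.pkl"))
        return list_pickle_imports(zf.read(member))


# Demo on a stand-in archive built in memory (not the real training_args.bin).
payload = pickle.dumps(
    [Fraction(1, 2), OrderedDict()], protocol=2, fix_imports=False
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("archive/data.pkl", payload)
buffer.seek(0)

demo_imports = list_checkpoint_imports(buffer)
print(demo_imports)  # ['collections.OrderedDict', 'fractions.Fraction']
```

To try this on a downloaded copy, one would pass a file opened in binary mode, for example `list_checkpoint_imports(open("training_args.bin", "rb"))`. Whether the result matches the 14 entries above depends on the assumptions stated here holding for that file.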