mcanoglu/deepseek-ai-deepseek-coder-1.3b-base-finetuned-defect-cwe-group-detection

@@ -5,7 +5,6 @@ tags:
 - generated_from_trainer
 metrics:
 - accuracy
-- f1
 - precision
 - recall
 model-index:
@@ -20,11 +19,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5777
-- Accuracy: 0.7586
-- F1: 0.7499
-- Precision: 0.7513
-- Recall: 0.7586
 ## Model description
@@ -44,28 +42,30 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 2
-- eval_batch_size: 2
 - seed: 4711
-- gradient_accumulation_steps: 16
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 3
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
-| No log        | 1.0   | 462  | 0.4832          | 0.7743   | 0.7594 | 0.7720    | 0.7743 |
-| 0.5829        | 2.0   | 924  | 0.4705          | 0.7788   | 0.7700 | 0.7737    | 0.7788 |
-| 0.3078        | 3.0   | 1386 | 0.5777          | 0.7586   | 0.7499 | 0.7513    | 0.7586 |
 ### Framework versions
-- Transformers 4.37.0
-- Pytorch 2.1.2+cu121
-- Datasets 2.16.1
-- Tokenizers 0.15.1

 - generated_from_trainer
 metrics:
 - accuracy
 - precision
 - recall
 model-index:
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6902
+- Accuracy: 0.7715
+- Precision: 0.8036
+- Recall: 0.5867
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 4711
+- gradient_accumulation_steps: 4
 - total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|
+| No log        | 1.0   | 462  | 0.4904          | 0.7800   | 0.6028    | 0.5178 |
+| 0.5739        | 2.0   | 925  | 0.4917          | 0.7985   | 0.8159    | 0.5552 |
+| 0.3111        | 3.0   | 1387 | 0.6582          | 0.7918   | 0.7907    | 0.5901 |
+| 0.2395        | 4.0   | 1850 | 0.6238          | 0.7800   | 0.8018    | 0.6132 |
+| 0.2047        | 4.99  | 2310 | 0.6902          | 0.7715   | 0.8036    | 0.5867 |
 ### Framework versions
+- Transformers 4.38.1
+- Pytorch 2.2.0+cu121
+- Datasets 2.17.1
+- Tokenizers 0.15.2
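
The hyperparameter hunk changes `train_batch_size` from 2 to 8 and `gradient_accumulation_steps` from 16 to 4, while `total_train_batch_size` stays at 32. A minimal sketch of that arithmetic follows. It assumes a single training device, since the card lists no device count, and the helper name `total_train_batch_size` is hypothetical, not part of any library.

```python
def total_train_batch_size(
    per_device_batch_size: int,
    gradient_accumulation_steps: int,
    num_devices: int = 1,  # assumption: the model card does not state a device count
) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values from the removed (-) lines of the diff.
old_total = total_train_batch_size(per_device_batch_size=2, gradient_accumulation_steps=16)

# Values from the added (+) lines of the diff.
new_total = total_train_batch_size(per_device_batch_size=8, gradient_accumulation_steps=4)

print(old_total, new_total)
```

Under that assumption both configurations give the same effective batch size, which is consistent with the `total_train_batch_size: 32` line being unchanged context in the diff.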

model-00001-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a209cfe6427ea97e774bc64ad0d1a2e36d201e528ad7a3957d568d7aad544fc5
 size 4986380064

 version https://git-lfs.github.com/spec/v1
+oid sha256:4517032bacda9d229aaec266da3b3315be027e458708b004757aa9f853307bde
 size 4986380064

model-00002-of-00002.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:37a2bfdc5d4bac777ff692b2a728046862284a7f87ae73e9455be85302563498
 size 135332592

 version https://git-lfs.github.com/spec/v1
+oid sha256:3346e3051767df587fa5782269d01bb45c6b1ca6dcf5479cd2ed922c3ddf08f6
 size 135332592